Implement ranges::single_view
#4255
🟨 CI finished in 1h 33m: Pass: 97%/162 | Total: 2d 04h | Avg: 19m 18s | Max: 1h 31m | Hits: 76%/242207
| | Project |
|---|---|
| | CCCL Infrastructure |
| +/- | libcu++ |
| | CUB |
| | Thrust |
| | CUDA Experimental |
| | stdpar |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |
Modifications in project or dependencies?
| | Project |
|---|---|
| | CCCL Infrastructure |
| +/- | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| +/- | CUDA Experimental |
| +/- | stdpar |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |
🏃 Runner counts (total jobs: 162)
# | Runner |
---|---|
113 | linux-amd64-cpu16 |
15 | windows-amd64-cpu16 |
12 | linux-arm64-cpu16 |
8 | linux-amd64-gpu-rtx2080-latest-1 |
6 | linux-amd64-gpu-rtxa6000-latest-1 |
5 | linux-amd64-gpu-h100-latest-1 |
3 | linux-amd64-gpu-rtx4090-latest-1 |
🟨 CI finished in 1h 08m: Pass: 87%/162 | Total: 1d 05h | Avg: 10m 46s | Max: 1h 07m | Hits: 89%/203623
bac1a0e to f7a4ca0
🟩 CI finished in 1h 08m: Pass: 100%/162 | Total: 1d 04h | Avg: 10m 27s | Max: 1h 07m | Hits: 88%/253852
template <class _Tp>
_CCCL_NODISCARD _LIBCUDACXX_HIDE_FROM_ABI constexpr bool __doesnt_need_empty_state() noexcept
This was originally a ternary, but nvcc 12.0 breaks with a ternary in a concept
🟩 CI finished in 1h 08m: Pass: 100%/162 | Total: 1d 03h | Avg: 10m 08s | Max: 1h 07m | Hits: 89%/253852
🟨 CI finished in 1h 59m: Pass: 0%/162 | Total: 9h 46m | Avg: 3m 37s | Max: 22m 04s
just a few nits, otherwise 👍
// Primary template - uses _CUDA_VSTD::optional and introduces an empty state in case assignment fails.
template <class _Tp, bool = __movable_box_object<_Tp>>
struct __movable_box;
could we just refuse types that have a potentially-throwing (noexcept(false)) move constructor?
Not really, and it would create a ton of implementation divergence. Also, we do claim to be fully standard-conforming.
#if !defined(_CCCL_NO_CONCEPTS)
template <move_constructible _Tp>
  requires is_object_v<_Tp>
#else // ^^^ !_CCCL_NO_CONCEPTS ^^^ / vvv _CCCL_NO_CONCEPTS vvv
template <class _Tp, enable_if_t<move_constructible<_Tp>, int> = 0, enable_if_t<is_object_v<_Tp>, int> = 0>
#endif // _CCCL_NO_CONCEPTS
how about...
_CCCL_TEMPLATE(class _Tp)
_CCCL_REQUIRES(move_constructible<_Tp> _CCCL_AND is_object_v<_Tp>)
This sadly does not work because the _CCCL_TEMPLATE macro introduces a named type that would be duplicated in the type definition and any function
};

template <class _Tp>
_CCCL_HOST_DEVICE single_view(_Tp) -> single_view<_Tp>;
we don't usually put host/device annotations on deduction guides, do we?
We do in all of them to support clang-cuda
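As a side note on what the guide itself does, the following host-only sketch in plain C++20 shows a by-value deduction guide of the same shape. `single_box` is a hypothetical stand-in for `single_view`, and the host/device annotation is omitted because the sketch is not CUDA code.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for single_view; it only stores one value.
template <class T>
struct single_box
{
  T value;

  constexpr explicit single_box(const T& v)
      : value(v)
  {}
  constexpr explicit single_box(T&& v)
      : value(std::move(v))
  {}
};

// Taking the parameter by value makes class template argument deduction decay the
// argument: arrays become pointers, and references and cv-qualifiers are dropped.
template <class T>
single_box(T) -> single_box<T>;

// A string literal deduces const char*, not an array type.
static_assert(std::is_same_v<decltype(single_box("abc")), single_box<const char*>>);
static_assert(std::is_same_v<decltype(single_box(1)), single_box<int>>);
```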
struct __fn : __range_adaptor_closure<__fn>
{
  _CCCL_TEMPLATE(class _Range)
the argument to single_view could be anything, not necessarily a range.
_CCCL_TEMPLATE(class _Range)
_CCCL_REQUIRES(__can_single_view<_Range>) // MSVC breaks without it
_LIBCUDACXX_HIDE_FROM_ABI constexpr auto operator()(_Range&& __range) const
  noexcept(noexcept(single_view<decay_t<_Range>>(_CUDA_VSTD::forward<_Range>(__range))))
is this simpler?
noexcept(_CUDA_VSTD::is_nothrow_constructible_v<_CUDA_VSTD::decay_t<_Range>, _Range>)
not really. but this comes up so often that i usually define:
template <class... Ts>
concept __nothrow_decay_copyable =
(is_nothrow_constructible_v<decay_t<Ts>, Ts> &&...);
very handy, that.
🟩 CI finished in 8h 05m: Pass: 100%/162 | Total: 1d 11h | Avg: 13m 15s | Max: 1h 13m | Hits: 70%/254540
This implements ranges::single_view as described here