CUB
|
DeviceRadixSort provides device-wide, parallel operations for computing a radix sort across a sequence of data items residing within global memory.
unsigned char
, int
, double
, etc. Although the direct radix sorting method can only be applied to unsigned integral types, BlockRadixSort is able to sort signed and floating-point types via simple bit-wise transformations that ensure lexicographic key ordering.CUB_CDP
macro in your compiler's macro definitions.uint32
keys. Performance plots for other scenarios can be found in the detailed method descriptions below.Definition at line 81 of file device_radix_sort.cuh.
Static Public Methods | |
template<typename Key , typename Value > | |
static CUB_RUNTIME_FUNCTION cudaError_t | SortPairs (void *d_temp_storage, size_t &temp_storage_bytes, DoubleBuffer< Key > &d_keys, DoubleBuffer< Value > &d_values, int num_items, int begin_bit=0, int end_bit=sizeof(Key)*8, cudaStream_t stream=0, bool debug_synchronous=false) |
Sorts key-value pairs into ascending order. More... | |
template<typename Key , typename Value > | |
static CUB_RUNTIME_FUNCTION cudaError_t | SortPairsDescending (void *d_temp_storage, size_t &temp_storage_bytes, DoubleBuffer< Key > &d_keys, DoubleBuffer< Value > &d_values, int num_items, int begin_bit=0, int end_bit=sizeof(Key)*8, cudaStream_t stream=0, bool debug_synchronous=false) |
Sorts key-value pairs into descending order. More... | |
template<typename Key > | |
static CUB_RUNTIME_FUNCTION cudaError_t | SortKeys (void *d_temp_storage, size_t &temp_storage_bytes, DoubleBuffer< Key > &d_keys, int num_items, int begin_bit=0, int end_bit=sizeof(Key)*8, cudaStream_t stream=0, bool debug_synchronous=false) |
Sorts keys into ascending order. More... | |
template<typename Key > | |
static CUB_RUNTIME_FUNCTION cudaError_t | SortKeysDescending (void *d_temp_storage, size_t &temp_storage_bytes, DoubleBuffer< Key > &d_keys, int num_items, int begin_bit=0, int end_bit=sizeof(Key)*8, cudaStream_t stream=0, bool debug_synchronous=false) |
Sorts keys into ascending order. More... | |
|
inlinestatic |
Sorts key-value pairs into ascending order.
d_temp_storage
is NULL, no work is done and the required allocation size is returned in temp_storage_bytes
.CUB_CDP
macro in your compiler's macro definitions.uint32,uint32
and uint64,uint64
pairs, respectively.int
keys with associated vector of int
values. Key | [inferred] Key type |
Value | [inferred] Value type |
[in] | d_temp_storage | Device allocation of temporary storage. When NULL, the required allocation size is written to temp_storage_bytes and no work is done. |
[in,out] | temp_storage_bytes | Reference to size in bytes of d_temp_storage allocation |
[in,out] | d_keys | Reference to the double-buffer of keys whose current buffer contains the unsorted input keys and, upon return, is updated to point to the sorted output keys |
[in,out] | d_values | Double-buffer of values whose current buffer contains the unsorted input values and, upon return, is updated to point to the sorted output values |
[in] | num_items | Number of items to reduce |
[in] | begin_bit | [optional] The first (least-significant) bit index needed for key comparison |
[in] | end_bit | [optional] The past-the-end (most-significant) bit index needed for key comparison |
[in] | stream | [optional] CUDA stream to launch kernels within. Default is stream0. |
[in] | debug_synchronous | [optional] Whether or not to synchronize the stream after every kernel launch to check for errors. Also causes launch configurations to be printed to the console. Default is false . |
Definition at line 144 of file device_radix_sort.cuh.
|
inlinestatic |
Sorts key-value pairs into descending order.
d_temp_storage
is NULL, no work is done and the required allocation size is returned in temp_storage_bytes
.CUB_CDP
macro in your compiler's macro definitions.int
keys with associated vector of int
values. Key | [inferred] Key type |
Value | [inferred] Value type |
[in] | d_temp_storage | Device allocation of temporary storage. When NULL, the required allocation size is written to temp_storage_bytes and no work is done. |
[in,out] | temp_storage_bytes | Reference to size in bytes of d_temp_storage allocation |
[in,out] | d_keys | Reference to the double-buffer of keys whose current buffer contains the unsorted input keys and, upon return, is updated to point to the sorted output keys |
[in,out] | d_values | Double-buffer of values whose current buffer contains the unsorted input values and, upon return, is updated to point to the sorted output values |
[in] | num_items | Number of items to reduce |
[in] | begin_bit | [optional] The first (least-significant) bit index needed for key comparison |
[in] | end_bit | [optional] The past-the-end (most-significant) bit index needed for key comparison |
[in] | stream | [optional] CUDA stream to launch kernels within. Default is stream0. |
[in] | debug_synchronous | [optional] Whether or not to synchronize the stream after every kernel launch to check for errors. Also causes launch configurations to be printed to the console. Default is false . |
Definition at line 227 of file device_radix_sort.cuh.
|
inlinestatic |
Sorts keys into ascending order.
d_temp_storage
is NULL, no work is done and the required allocation size is returned in temp_storage_bytes
.CUB_CDP
macro in your compiler's macro definitions.uint32
and uint64
keys, respectively.int
keys. Key | [inferred] Key type |
[in] | d_temp_storage | Device allocation of temporary storage. When NULL, the required allocation size is written to temp_storage_bytes and no work is done. |
[in,out] | temp_storage_bytes | Reference to size in bytes of d_temp_storage allocation |
[in,out] | d_keys | Reference to the double-buffer of keys whose current buffer contains the unsorted input keys and, upon return, is updated to point to the sorted output keys |
[in] | num_items | Number of items to reduce |
[in] | begin_bit | [optional] The first (least-significant) bit index needed for key comparison |
[in] | end_bit | [optional] The past-the-end (most-significant) bit index needed for key comparison |
[in] | stream | [optional] CUDA stream to launch kernels within. Default is stream0. |
[in] | debug_synchronous | [optional] Whether or not to synchronize the stream after every kernel launch to check for errors. Also causes launch configurations to be printed to the console. Default is false . |
Definition at line 306 of file device_radix_sort.cuh.
|
inlinestatic |
Sorts keys into ascending order.
d_temp_storage
is NULL, no work is done and the required allocation size is returned in temp_storage_bytes
.CUB_CDP
macro in your compiler's macro definitions.int
keys. Key | [inferred] Key type |
[in] | d_temp_storage | Device allocation of temporary storage. When NULL, the required allocation size is written to temp_storage_bytes and no work is done. |
[in,out] | temp_storage_bytes | Reference to size in bytes of d_temp_storage allocation |
[in,out] | d_keys | Reference to the double-buffer of keys whose current buffer contains the unsorted input keys and, upon return, is updated to point to the sorted output keys |
[in] | num_items | Number of items to reduce |
[in] | begin_bit | [optional] The first (least-significant) bit index needed for key comparison |
[in] | end_bit | [optional] The past-the-end (most-significant) bit index needed for key comparison |
[in] | stream | [optional] CUDA stream to launch kernels within. Default is stream0. |
[in] | debug_synchronous | [optional] Whether or not to synchronize the stream after every kernel launch to check for errors. Also causes launch configurations to be printed to the console. Default is false . |
Definition at line 383 of file device_radix_sort.cuh.