CUB
Class Hierarchy
This inheritance list is sorted roughly, but not completely, alphabetically:
cub::ArgIndexInputIterator< InputIteratorT, OffsetT > - A random-access input wrapper for pairing dereferenced values with their corresponding indices (forming KeyValuePair tuples)
cub::ArgMax - Arg max functor (keeps the value and offset of the first occurrence of the larger item)
cub::ArgMin - Arg min functor (keeps the value and offset of the first occurrence of the smallest item)
cub::BlockDiscontinuity< T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH > - The BlockDiscontinuity class provides collective methods for flagging discontinuities within an ordered set of items partitioned across a CUDA thread block
cub::BlockExchange< T, BLOCK_DIM_X, ITEMS_PER_THREAD, WARP_TIME_SLICING, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH > - The BlockExchange class provides collective methods for rearranging data partitioned across a CUDA thread block
cub::BlockHistogram< T, BLOCK_DIM_X, ITEMS_PER_THREAD, BINS, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH > - The BlockHistogram class provides collective methods for constructing block-wide histograms from data samples partitioned across a CUDA thread block
cub::BlockLoad< InputIteratorT, BLOCK_DIM_X, ITEMS_PER_THREAD, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH > - The BlockLoad class provides collective data movement methods for loading a linear segment of items from memory into a blocked arrangement across a CUDA thread block
cub::BlockRadixSort< KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH > - The BlockRadixSort class provides collective methods for sorting items partitioned across a CUDA thread block using a radix sorting method
cub::BlockReduce< T, BLOCK_DIM_X, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH > - The BlockReduce class provides collective methods for computing a parallel reduction of items partitioned across a CUDA thread block (a usage sketch follows this list)
cub::BlockScan< T, BLOCK_DIM_X, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH > - The BlockScan class provides collective methods for computing a parallel prefix sum/scan of items partitioned across a CUDA thread block
cub::BlockStore< OutputIteratorT, BLOCK_DIM_X, ITEMS_PER_THREAD, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH > - The BlockStore class provides collective data movement methods for writing a blocked arrangement of items partitioned across a CUDA thread block to a linear segment of memory
cub::CacheModifiedInputIterator< MODIFIER, ValueType, OffsetT > - A random-access input wrapper for dereferencing array values using a PTX cache load modifier
cub::CacheModifiedOutputIterator< MODIFIER, ValueType, OffsetT > - A random-access output wrapper for storing array values using a PTX cache-modifier
cub::CachingDeviceAllocator - A simple caching allocator for device memory allocations
cub::Cast< B > - Default cast functor
cub::ConstantInputIterator< ValueType, OffsetT > - A random-access input generator for dereferencing a sequence of homogeneous values
cub::CountingInputIterator< ValueType, OffsetT > - A random-access input generator for dereferencing a sequence of incrementing integer values
cub::DeviceHistogram - DeviceHistogram provides device-wide parallel operations for constructing histogram(s) from a sequence of sample data residing within device-accessible memory
cub::DevicePartition - DevicePartition provides device-wide, parallel operations for partitioning sequences of data items residing within device-accessible memory
cub::DeviceRadixSort - DeviceRadixSort provides device-wide, parallel operations for computing a radix sort across a sequence of data items residing within device-accessible memory
cub::DeviceReduce - DeviceReduce provides device-wide, parallel operations for computing a reduction across a sequence of data items residing within device-accessible memory (a usage sketch follows this list)
cub::DeviceRunLengthEncode - DeviceRunLengthEncode provides device-wide, parallel operations for demarcating "runs" of same-valued items within a sequence residing within device-accessible memory
cub::DeviceScan - DeviceScan provides device-wide, parallel operations for computing a prefix scan across a sequence of data items residing within device-accessible memory
cub::DeviceSegmentedRadixSort - DeviceSegmentedRadixSort provides device-wide, parallel operations for computing a batched radix sort across multiple, non-overlapping sequences of data items residing within device-accessible memory
cub::DeviceSegmentedReduce - DeviceSegmentedReduce provides device-wide, parallel operations for computing a reduction across multiple sequences of data items residing within device-accessible memory
cub::DeviceSelect - DeviceSelect provides device-wide, parallel operations for compacting selected items from sequences of data items residing within device-accessible memory
cub::DeviceSpmv - DeviceSpmv provides device-wide parallel operations for performing sparse-matrix * dense-vector multiplication (SpMV)
cub::Equality - Default equality functor
cub::Equals< A, B > - Type equality test
cub::If< IF, ThenType, ElseType > - Type selection (IF ? ThenType : ElseType)
cub::Inequality - Default inequality functor
cub::InequalityWrapper< EqualityOp > - Inequality functor (wraps equality functor)
cub::IsPointer< Tp > - Pointer vs. iterator test
cub::IsVolatile< Tp > - Volatile modifier test
cub::Log2< N, CURRENT_VAL, COUNT > - Statically determine log2(N), rounded up
cub::Max - Default max functor
cub::Min - Default min functor
cub::PowerOfTwo< N > - Statically determine if N is a power-of-two
cub::ReduceByKeyOp< ReductionOpT > - Binary reduction operator to apply to values
cub::ReduceBySegmentOp< ReductionOpT > - Reduce-by-segment functor
cub::RemoveQualifiers< Tp, Up > - Removes const and volatile qualifiers from type Tp
cub::Sum - Default sum functor
cub::SwizzleScanOp< ScanOp > - Binary operator wrapper for switching non-commutative scan arguments
cub::TexObjInputIterator< T, OffsetT > - A random-access input wrapper for dereferencing array values through texture cache. Uses newer Kepler-style texture objects
cub::TexRefInputIterator< T, UNIQUE_ID, OffsetT > - A random-access input wrapper for dereferencing array values through texture cache. Uses older Tesla/Fermi-style texture references
cub::TransformInputIterator< ValueType, ConversionOp, InputIteratorT, OffsetT > - A random-access input wrapper for transforming dereferenced values
Uninitialized
    cub::BlockDiscontinuity< T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH >::TempStorage - The operations exposed by BlockDiscontinuity require a temporary memory allocation of this nested type for thread communication. This opaque storage can be allocated directly using the __shared__ keyword. Alternatively, it can be aliased to externally allocated memory (shared or global) or union'd with other storage allocation types to facilitate memory reuse
    cub::BlockExchange< T, BLOCK_DIM_X, ITEMS_PER_THREAD, WARP_TIME_SLICING, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH >::TempStorage - The operations exposed by BlockExchange require a temporary memory allocation of this nested type for thread communication. This opaque storage can be allocated directly using the __shared__ keyword. Alternatively, it can be aliased to externally allocated memory (shared or global) or union'd with other storage allocation types to facilitate memory reuse
    cub::BlockHistogram< T, BLOCK_DIM_X, ITEMS_PER_THREAD, BINS, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH >::TempStorage - The operations exposed by BlockHistogram require a temporary memory allocation of this nested type for thread communication. This opaque storage can be allocated directly using the __shared__ keyword. Alternatively, it can be aliased to externally allocated memory (shared or global) or union'd with other storage allocation types to facilitate memory reuse
    cub::BlockLoad< InputIteratorT, BLOCK_DIM_X, ITEMS_PER_THREAD, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH >::LoadInternal< BLOCK_LOAD_TRANSPOSE, DUMMY >::TempStorage - Alias wrapper allowing storage to be unioned
    cub::BlockLoad< InputIteratorT, BLOCK_DIM_X, ITEMS_PER_THREAD, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH >::LoadInternal< BLOCK_LOAD_WARP_TRANSPOSE, DUMMY >::TempStorage - Alias wrapper allowing storage to be unioned
    cub::BlockLoad< InputIteratorT, BLOCK_DIM_X, ITEMS_PER_THREAD, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH >::LoadInternal< BLOCK_LOAD_WARP_TRANSPOSE_TIMESLICED, DUMMY >::TempStorage - Alias wrapper allowing storage to be unioned
    cub::BlockLoad< InputIteratorT, BLOCK_DIM_X, ITEMS_PER_THREAD, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH >::TempStorage - The operations exposed by BlockLoad require a temporary memory allocation of this nested type for thread communication. This opaque storage can be allocated directly using the __shared__ keyword. Alternatively, it can be aliased to externally allocated memory (shared or global) or union'd with other storage allocation types to facilitate memory reuse (a storage-union sketch follows this list)
    cub::BlockRadixSort< KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH >::TempStorage - The operations exposed by BlockRadixSort require a temporary memory allocation of this nested type for thread communication. This opaque storage can be allocated directly using the __shared__ keyword. Alternatively, it can be aliased to externally allocated memory (shared or global) or union'd with other storage allocation types to facilitate memory reuse
    cub::BlockReduce< T, BLOCK_DIM_X, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH >::TempStorage - The operations exposed by BlockReduce require a temporary memory allocation of this nested type for thread communication. This opaque storage can be allocated directly using the __shared__ keyword. Alternatively, it can be aliased to externally allocated memory (shared or global) or union'd with other storage allocation types to facilitate memory reuse
    cub::BlockScan< T, BLOCK_DIM_X, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH >::TempStorage - The operations exposed by BlockScan require a temporary memory allocation of this nested type for thread communication. This opaque storage can be allocated directly using the __shared__ keyword. Alternatively, it can be aliased to externally allocated memory (shared or global) or union'd with other storage allocation types to facilitate memory reuse
    cub::BlockStore< OutputIteratorT, BLOCK_DIM_X, ITEMS_PER_THREAD, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH >::StoreInternal< BLOCK_STORE_TRANSPOSE, DUMMY >::TempStorage - Alias wrapper allowing storage to be unioned
    cub::BlockStore< OutputIteratorT, BLOCK_DIM_X, ITEMS_PER_THREAD, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH >::StoreInternal< BLOCK_STORE_WARP_TRANSPOSE, DUMMY >::TempStorage - Alias wrapper allowing storage to be unioned
    cub::BlockStore< OutputIteratorT, BLOCK_DIM_X, ITEMS_PER_THREAD, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH >::StoreInternal< BLOCK_STORE_WARP_TRANSPOSE_TIMESLICED, DUMMY >::TempStorage - Alias wrapper allowing storage to be unioned
    cub::BlockStore< OutputIteratorT, BLOCK_DIM_X, ITEMS_PER_THREAD, ALGORITHM, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH >::TempStorage - The operations exposed by BlockStore require a temporary memory allocation of this nested type for thread communication. This opaque storage can be allocated directly using the __shared__ keyword. Alternatively, it can be aliased to externally allocated memory (shared or global) or union'd with other storage allocation types to facilitate memory reuse
    cub::WarpReduce< T, LOGICAL_WARP_THREADS, PTX_ARCH >::TempStorage - The operations exposed by WarpReduce require a temporary memory allocation of this nested type for thread communication. This opaque storage can be allocated directly using the __shared__ keyword. Alternatively, it can be aliased to externally allocated memory (shared or global) or union'd with other storage allocation types to facilitate memory reuse
    cub::WarpScan< T, LOGICAL_WARP_THREADS, PTX_ARCH >::TempStorage - The operations exposed by WarpScan require a temporary memory allocation of this nested type for thread communication. This opaque storage can be allocated directly using the __shared__ keyword. Alternatively, it can be aliased to externally allocated memory (shared or global) or union'd with other storage allocation types to facilitate memory reuse
cub::WarpReduce< T, LOGICAL_WARP_THREADS, PTX_ARCH > - The WarpReduce class provides collective methods for computing a parallel reduction of items partitioned across a CUDA thread warp
cub::WarpScan< T, LOGICAL_WARP_THREADS, PTX_ARCH > - The WarpScan class provides collective methods for computing a parallel prefix scan of items partitioned across a CUDA thread warp (a usage sketch follows this list)
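Usage sketches

The sketches below are minimal, hedged examples of how some of the classes listed above are typically invoked. Kernel names, block sizes, items-per-thread counts, and pointer names are illustrative assumptions, not part of this index; only the cub types and member functions shown come from the library.

As referenced in the cub::BlockReduce entry, this sketch assumes a 128-thread block in which each thread contributes one int: the collective's nested TempStorage is allocated with __shared__, and the block-wide sum is returned to thread0.

#include <cub/cub.cuh>

// Illustrative kernel: one item per thread, 128 threads per block (assumptions)
__global__ void BlockSumKernel(const int *d_in, int *d_block_sums)
{
    // Specialize BlockReduce for int items across 128 threads
    typedef cub::BlockReduce<int, 128> BlockReduce;

    // Opaque shared-memory storage required by the collective
    __shared__ typename BlockReduce::TempStorage temp_storage;

    // Each thread loads one item
    int thread_item = d_in[blockIdx.x * blockDim.x + threadIdx.x];

    // Block-wide sum; the aggregate is only valid in thread0
    int block_sum = BlockReduce(temp_storage).Sum(thread_item);

    if (threadIdx.x == 0)
        d_block_sums[blockIdx.x] = block_sum;
}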
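As referenced in the cub::DeviceReduce entry, the device-wide algorithms share a two-phase calling convention: the first call, made with a NULL d_temp_storage pointer, only writes the required size into temp_storage_bytes, and the second call performs the work. The host function name and arguments are assumptions.

#include <cub/cub.cuh>

// d_in and d_out are assumed to be device-accessible allocations
void SumOnDevice(const int *d_in, int *d_out, int num_items)
{
    void   *d_temp_storage     = NULL;
    size_t  temp_storage_bytes = 0;

    // Phase 1: query how much temporary storage the reduction needs
    cub::DeviceReduce::Sum(d_temp_storage, temp_storage_bytes, d_in, d_out, num_items);

    // Allocate the requested temporary storage
    cudaMalloc(&d_temp_storage, temp_storage_bytes);

    // Phase 2: run the device-wide reduction (result written to d_out[0])
    cub::DeviceReduce::Sum(d_temp_storage, temp_storage_bytes, d_in, d_out, num_items);

    cudaFree(d_temp_storage);
}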
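As referenced in the BlockLoad TempStorage entry, the nested TempStorage types can be union'd so that successive collectives reuse one shared-memory allocation. The sketch below chains BlockLoad, BlockScan, and BlockStore over a 128-thread, 4-items-per-thread tile (all sizes are assumptions); the __syncthreads() barriers are required because the same storage is reused between phases.

#include <cub/cub.cuh>

// Illustrative tile: 128 threads x 4 items per thread = 512 items per block (assumptions)
__global__ void PrefixSumTileKernel(const int *d_in, int *d_out)
{
    // Specialize the collectives used by this kernel
    typedef cub::BlockLoad<const int*, 128, 4, cub::BLOCK_LOAD_TRANSPOSE>  BlockLoad;
    typedef cub::BlockScan<int, 128>                                       BlockScan;
    typedef cub::BlockStore<int*, 128, 4, cub::BLOCK_STORE_TRANSPOSE>      BlockStore;

    // Union the opaque TempStorage types so all three phases share one allocation
    __shared__ union
    {
        typename BlockLoad::TempStorage  load;
        typename BlockScan::TempStorage  scan;
        typename BlockStore::TempStorage store;
    } temp_storage;

    int items[4];
    int tile_offset = blockIdx.x * 512;

    // Load a tile of 512 items into a blocked arrangement
    BlockLoad(temp_storage.load).Load(d_in + tile_offset, items);
    __syncthreads();    // barrier: the shared storage is reused next

    // Block-wide exclusive prefix sum over the tile
    BlockScan(temp_storage.scan).ExclusiveSum(items, items);
    __syncthreads();

    // Write the scanned tile back to memory
    BlockStore(temp_storage.store).Store(d_out + tile_offset, items);
}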
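As referenced in the cub::WarpScan entry, this sketch computes an independent inclusive prefix sum in each of the four 32-thread warps of a 128-thread block; the per-warp TempStorage array size and the block size are assumptions.

#include <cub/cub.cuh>

// Illustrative block of 128 threads = 4 full warps (assumptions)
__global__ void WarpPrefixSumKernel(const int *d_in, int *d_out)
{
    // Specialize WarpScan for int items on full 32-thread warps
    typedef cub::WarpScan<int> WarpScan;

    // One TempStorage instance per warp in the block
    __shared__ typename WarpScan::TempStorage temp_storage[4];

    int tid       = blockIdx.x * blockDim.x + threadIdx.x;
    int warp_id   = threadIdx.x / 32;
    int thread_in = d_in[tid];
    int thread_out;

    // Inclusive prefix sum across each warp's 32 items
    WarpScan(temp_storage[warp_id]).InclusiveSum(thread_in, thread_out);

    d_out[tid] = thread_out;
}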