GPU data and methods.
#include <card.hh>
|
| card () |
| Create a new card object. Does not initialize any GPU data.
|
|
| card (const int id=0) |
| Create a new card object and set up internal CUDA data.
|
|
| card (const card &x) |
|
void | set (const int id) |
| Sets up the existing card object.
|
|
void | info () const |
| Print some brief information about the GPU.
|
|
void * | mem_alloc (const size_t len) |
| Allocate device memory.
|
|
void | mem_set (void *ptr, const int value, const size_t len) |
| Set device memory.
|
|
void | mem_free (void *ptr) |
| Free device memory.
|
|
void | mem_cpu2gpu (void *dst, const void *src, const size_t len) |
| Copy host (CPU) data to device (GPU) memory.
|
|
void | mem_gpu2cpu (void *dst, const void *src, const size_t len) |
| Copy device (GPU) data to host (CPU) memory.
|
|
void | mem_gpu2gpu (void *dst, const void *src, const size_t len) |
| Copy device (GPU) data to other device (GPU) memory.
|
|
void | synch () |
| Synchronize with the device (GPU): wait for all previously issued GPU work to finish.
|
|
void | check () |
| Check for (and throw if found) a CUDA error.
|
|
|
int | get_id () |
|
int | get_id () const |
|
gpublas_handle_t | blas_handle () |
| GPU BLAS handle.
|
|
gpublas_handle_t | blas_handle () const |
|
gpulapack_handle_t | lapack_handle () |
| GPU LAPACK handle.
|
|
gpulapack_handle_t | lapack_handle () const |
|
bool | valid_card () const |
| Is the gpu data valid?
|
|
|
int | _id |
|
gpublas_handle_t | _blas_handle |
|
gpulapack_handle_t | _lapack_handle |
|
GPU data and methods.
- Implementation Details
- Stores the GPU ordinal and the BLAS/LAPACK handles. Methods are wrappers around core GPU operations, allowing GPU malloc, memset, etc.
You probably should not use these methods directly unless you know what you are doing (in which case you probably do not even need them). Simply pass a card object to a GPU object constructor and move on.
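For readers who do need the low-level methods, the usual sequence is allocate, copy in, copy out, free. The sketch below only illustrates that call order and the exception behavior described on this page. It assumes a hypothetical host-only stand-in, `host_card`, that mirrors the documented method names but uses the C library instead of the CUDA runtime; `roundtrip_sum` is likewise an illustrative helper, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical host-only stand-in mirroring the documented method names.
// The real card class wraps GPU calls (e.g. cudaMalloc, cudaMemcpy) instead.
class host_card
{
  public:
    void* mem_alloc(const std::size_t len)
    {
      void *ptr = std::malloc(len);
      if (ptr == nullptr)
        throw std::runtime_error("allocation failed");
      return ptr;
    }
    
    void mem_free(void *ptr) { std::free(ptr); }
    
    void mem_cpu2gpu(void *dst, const void *src, const std::size_t len)
    {
      assert(dst != nullptr && src != nullptr);
      std::memcpy(dst, src, len);
    }
    
    void mem_gpu2cpu(void *dst, const void *src, const std::size_t len)
    {
      assert(dst != nullptr && src != nullptr);
      std::memcpy(dst, src, len);
    }
};

// Allocate, upload, download, free. Returns the sum of the round-tripped data.
inline double roundtrip_sum(host_card &c, const std::vector<double> &host_in)
{
  // All 'len' arguments are byte counts, not element counts.
  const std::size_t len = host_in.size() * sizeof(double);
  std::vector<double> host_out(host_in.size());
  
  void *dev = c.mem_alloc(len);
  try
  {
    c.mem_cpu2gpu(dev, host_in.data(), len);
    c.mem_gpu2cpu(host_out.data(), dev, len);
  }
  catch (...)
  {
    c.mem_free(dev); // do not leak the buffer if a copy throws
    throw;
  }
  c.mem_free(dev);
  
  return std::accumulate(host_out.begin(), host_out.end(), 0.0);
}
```

With the real class, the same sequence would presumably start from a card constructed with a device ordinal. The try/catch is one way to keep the buffer from leaking when a copy throws 'runtime_error'.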
◆ card()
card::card |
( |
const int |
id = 0 | ) |
|
|
inline |
Create a new card object and set up internal CUDA data.
Sets the current device to the provided GPU id and initializes GPU BLAS and LAPACK handles.
- Parameters
-
[in] | id | Ordinal number corresponding to the desired GPU device. |
- Exceptions
- If the GPU cannot be initialized, or if the allocation of one of the handles fails, the method will throw a 'runtime_error' exception.
◆ check()
Check for (and throw if found) a CUDA error.
- Implementation Details
- Wrapper around GPU error lookup, e.g. cudaGetLastError().
- Exceptions
- If a CUDA error is detected, this throws a 'runtime_error' exception.
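The pattern described here, looking up a status code and converting a failure into an exception, can be sketched without a GPU. The example is a minimal illustration, not the library's implementation: `gpu_status`, `status_string`, and `check_status` are hypothetical names standing in for the CUDA error type and its lookup functions.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical status type standing in for the GPU runtime's error code.
enum class gpu_status { success, invalid_value, out_of_memory };

// Hypothetical lookup of a human-readable message for a status code.
inline const char* status_string(const gpu_status s)
{
  switch (s)
  {
    case gpu_status::success:       return "no error";
    case gpu_status::invalid_value: return "invalid value";
    case gpu_status::out_of_memory: return "out of memory";
  }
  return "unknown error";
}

// Throw a runtime_error if the status signals a failure; otherwise do nothing.
inline void check_status(const gpu_status s)
{
  if (s != gpu_status::success)
    throw std::runtime_error(std::string("GPU error: ") + status_string(s));
}

// Small demonstration: success passes through, failure throws.
inline void demo_check_status()
{
  check_status(gpu_status::success);
  
  bool threw = false;
  try
  {
    check_status(gpu_status::out_of_memory);
  }
  catch (const std::runtime_error&)
  {
    threw = true;
  }
  assert(threw);
}
```

Callers can then handle every failure through a single `catch (const std::runtime_error&)` instead of testing return codes after each call.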
◆ get_id()
The ordinal number corresponding to the GPU device.
◆ info()
void card::info |
( |
| ) |
const |
|
inline |
Print some brief information about the GPU.
- Implementation Details
- Uses NVML.
◆ mem_alloc()
void * card::mem_alloc |
( |
const size_t |
len | ) |
|
|
inline |
Allocate device memory.
- Parameters
-
[in] | len | Number of bytes of memory to allocate. |
- Returns
- Pointer to the newly allocated device memory.
- Implementation Details
- Wrapper around GPU malloc, e.g. cudaMalloc().
- Exceptions
- If the allocation fails, this throws a 'runtime_error' exception.
◆ mem_cpu2gpu()
void card::mem_cpu2gpu |
( |
void * |
dst, |
|
|
const void * |
src, |
|
|
const size_t |
len |
|
) |
| |
|
inline |
Copy host (CPU) data to device (GPU) memory.
- Parameters
-
[in,out] | dst | The device memory you want to copy TO. |
[in] | src | The host memory you want to copy FROM. |
[in] | len | Number of bytes to copy. |
- Implementation Details
- Wrapper around GPU memcpy, e.g. cudaMemcpy().
- Exceptions
- If the function fails (e.g., because device memory is used improperly), this throws a 'runtime_error' exception.
◆ mem_free()
void card::mem_free |
( |
void * |
ptr | ) |
|
|
inline |
Free device memory.
- Parameters
-
[in] | ptr | The device memory you want to un-allocate. |
- Implementation Details
- Wrapper around GPU free, e.g. cudaFree().
- Exceptions
- If the function fails (e.g., by being given non-device memory), this throws a 'runtime_error' exception.
◆ mem_gpu2cpu()
void card::mem_gpu2cpu |
( |
void * |
dst, |
|
|
const void * |
src, |
|
|
const size_t |
len |
|
) |
| |
|
inline |
Copy device (GPU) data to host (CPU) memory.
- Parameters
-
[in,out] | dst | The host memory you want to copy TO. |
[in] | src | The device memory you want to copy FROM. |
[in] | len | Number of bytes to copy. |
- Implementation Details
- Wrapper around GPU memcpy, e.g. cudaMemcpy().
- Exceptions
- If the function fails (e.g., because device memory is used improperly), this throws a 'runtime_error' exception.
◆ mem_gpu2gpu()
void card::mem_gpu2gpu |
( |
void * |
dst, |
|
|
const void * |
src, |
|
|
const size_t |
len |
|
) |
| |
|
inline |
Copy device (GPU) data to other device (GPU) memory.
- Parameters
-
[in,out] | dst | The device memory you want to copy TO. |
[in] | src | The device memory you want to copy FROM. |
[in] | len | Number of bytes to copy. |
- Implementation Details
- Wrapper around GPU memcpy, e.g. cudaMemcpy().
- Exceptions
- If the function fails (e.g., because device memory is used improperly), this throws a 'runtime_error' exception.
◆ mem_set()
void card::mem_set |
( |
void * |
ptr, |
|
|
const int |
value, |
|
|
const size_t |
len |
|
) |
| |
|
inline |
Set device memory.
- Parameters
-
[in,out] | ptr | On entrance, the already-allocated block of device memory to set. On exit, the first 'len' bytes are set to 'value'. |
[in] | value | The value to set. |
[in] | len | Number of bytes of the input 'ptr' to set to 'value'. |
- Implementation Details
- Wrapper around GPU memset, e.g. cudaMemset().
- Exceptions
- If the function fails (e.g., by being given non-device memory), this throws a 'runtime_error' exception.
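Because 'len' counts bytes and cudaMemset() fills memory one byte at a time, a value other than 0 does not set multi-byte elements to that value. The sketch below shows the effect on the host with std::memset, which has the same byte-wise semantics. It assumes mem_set() inherits this behavior from the wrapped call; `fill_word` and `demo_fill_word` are illustrative helpers, not part of the library.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Fill one 32-bit word the way a byte-wise memset would:
// 'value' is converted to unsigned char and copied into every byte.
inline std::uint32_t fill_word(const int value)
{
  std::uint32_t word = 0;
  std::memset(&word, value, sizeof(word));
  return word;
}

// Each assertion shows the resulting 32-bit pattern for a given 'value'.
inline void demo_fill_word()
{
  assert(fill_word(0)  == 0x00000000u); // zeroing works as expected
  assert(fill_word(1)  == 0x01010101u); // not the integer 1
  assert(fill_word(-1) == 0xFFFFFFFFu); // every byte becomes 0xFF
}
```

In practice this means the method is a safe way to zero a buffer of any element type, while filling an array of ints or floats with a nonzero value needs a different approach.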
◆ set()
void card::set |
( |
const int |
id | ) |
|
|
inline |
Sets up the existing card object.
For use with the no-argument constructor. Frees any existing GPU data already allocated and stored in the object. Misuse of this could lead to some seemingly strange errors.
- Parameters
-
[in] | id | Ordinal number corresponding to the desired GPU device. |
- Exceptions
- If the GPU cannot be initialized, or if the allocation of one of the handles fails, the method will throw a 'runtime_error' exception.
◆ synch()
Synchronize with the device (GPU).
Blocks the host until the device completes all previously issued work.
- Implementation Details
- Wrapper around GPU synchronize, e.g. cudaDeviceSynchronize().
- Exceptions
- If a CUDA error is detected, this throws a 'runtime_error' exception.
The documentation for this class was generated from the following file: