fml  0.1-0
Fused Matrix Library
card Class Reference

GPU data and methods.

#include <card.hh>

Public Member Functions

 card ()
 Create a new card object. Does not initialize any GPU data.
 
 card (const int id=0)
 Create a new card object and set up internal CUDA data.
 
 card (const card &x)
 
void set (const int id)
 Sets up the existing card object.
 
void info () const
 Print some brief information about the GPU.
 
void * mem_alloc (const size_t len)
 Allocate device memory.
 
void mem_set (void *ptr, const int value, const size_t len)
 Set device memory.
 
void mem_free (void *ptr)
 Free device memory.
 
void mem_cpu2gpu (void *dst, const void *src, const size_t len)
 Copy host (CPU) data to device (GPU) memory.
 
void mem_gpu2cpu (void *dst, const void *src, const size_t len)
 Copy device (GPU) data to host (CPU) memory.
 
void mem_gpu2gpu (void *dst, const void *src, const size_t len)
 Copy device (GPU) data to other device (GPU) memory.
 
void synch ()
 Synchronize the host with the device (GPU).
 
void check ()
 Check for (and throw if found) a CUDA error.
 
int get_id ()
 
int get_id () const
 
gpublas_handle_t blas_handle ()
 GPU BLAS handle.
 
gpublas_handle_t blas_handle () const
 
gpulapack_handle_t lapack_handle ()
 GPU LAPACK handle.
 
gpulapack_handle_t lapack_handle () const
 
bool valid_card () const
 Is the GPU data valid?
 

Protected Attributes

int _id
 
gpublas_handle_t _blas_handle
 
gpulapack_handle_t _lapack_handle
 

Detailed Description

GPU data and methods.

Implementation Details
Stores the GPU ordinal and BLAS/LAPACK handles. Methods are wrappers around core GPU operations, allowing GPU malloc, memset, etc.

You probably should not use these methods directly unless you know what you are doing (in which case you probably do not even need them). Simply pass a card object to a GPU object constructor and move on.

Constructor & Destructor Documentation

◆ card()

card::card ( const int  id = 0)
inline

Create a new card object and set up internal CUDA data.

Sets the current device to the provided GPU id and initializes GPU BLAS and LAPACK handles.

Parameters
[in] id  Ordinal number corresponding to the desired GPU device.
Exceptions
If the GPU cannot be initialized, or if the allocation of one of the handles fails, the method throws a 'runtime_error' exception.
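
Since the constructor reports failure by throwing 'runtime_error', a caller that wants to continue without a GPU has to catch it. The following is a minimal sketch of that pattern, not code from fml: the helper try_make_card() and the host-only stand-in fake_card are hypothetical names introduced here so the snippet compiles without CUDA. With the library itself, the real card class would take the place of fake_card.

```cpp
#include <iostream>
#include <memory>
#include <stdexcept>

// Hypothetical helper (not part of fml): try to build a Card for the given
// device ordinal, returning nullptr instead of propagating the exception.
template <class Card>
std::unique_ptr<Card> try_make_card(const int id)
{
  try {
    return std::make_unique<Card>(id);
  } catch (const std::runtime_error &e) {
    std::cerr << "GPU " << id << " unavailable: " << e.what() << '\n';
    return nullptr;
  }
}

// Host-only stand-in (an assumption for illustration). It pretends that
// exactly one device, ordinal 0, exists and throws for any other ordinal.
struct fake_card
{
  explicit fake_card(const int id) : _id(id)
  {
    if (id != 0)
      throw std::runtime_error("invalid device ordinal");
  }

  int get_id() const { return _id; }

  int _id;
};
```

Calling try_make_card<fake_card>(0) yields a usable object, while any other ordinal yields a null pointer and a message on standard error.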

Member Function Documentation

◆ check()

void card::check ( )
inline

Check for (and throw if found) a CUDA error.

Implementation Details
Wrapper around GPU error lookup, e.g. cudaGetLastError().
Exceptions
If a CUDA error is detected, this throws a 'runtime_error' exception.

◆ get_id()

int card::get_id ( )
inline

The ordinal number corresponding to the GPU device.

◆ info()

void card::info ( ) const
inline

Print some brief information about the GPU.

Implementation Details
Uses NVML.

◆ mem_alloc()

void * card::mem_alloc ( const size_t  len)
inline

Allocate device memory.

Parameters
[in] len  Number of bytes of memory to allocate.
Returns
Pointer to the newly allocated device memory.
Implementation Details
Wrapper around GPU malloc, e.g. cudaMalloc().
Exceptions
If the allocation fails, this throws a 'runtime_error' exception.
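
Because mem_alloc() returns a raw pointer that must later be handed to mem_free(), pairing the two in an RAII guard is one way to avoid leaks when an exception is thrown in between. The sketch below is an illustration under stated assumptions, not part of fml: device_buffer and fake_card are hypothetical names, and fake_card is a host-only stand-in (backed by malloc/free) so the snippet compiles without CUDA. The guard only relies on the documented mem_alloc()/mem_free() signatures.

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>

// Hypothetical RAII guard (not part of fml): owns one allocation made through
// any object exposing the documented mem_alloc()/mem_free() methods.
template <class Card>
class device_buffer
{
  public:
    device_buffer(Card &c, const std::size_t len)
      : _c(c), _ptr(c.mem_alloc(len)), _len(len) {}

    ~device_buffer()
    {
      // mem_free() is documented to throw; never let that escape a destructor.
      try { _c.mem_free(_ptr); } catch (...) {}
    }

    // One owner per allocation: copying is disabled.
    device_buffer(const device_buffer &) = delete;
    device_buffer &operator=(const device_buffer &) = delete;

    void *get() const { return _ptr; }
    std::size_t size() const { return _len; }

  private:
    Card &_c;
    void *_ptr;
    std::size_t _len;
};

// Host-only stand-in (an assumption for illustration). It counts live
// allocations so the guard's behavior can be observed.
struct fake_card
{
  int live = 0;

  void *mem_alloc(const std::size_t len)
  {
    void *p = std::malloc(len);
    if (p == nullptr)
      throw std::runtime_error("allocation failed");
    live++;
    return p;
  }

  void mem_free(void *ptr)
  {
    std::free(ptr);
    live--;
  }
};

// Returns the number of live allocations after a guarded scope has ended,
// or -1 if the allocation was not visible inside the scope.
inline int guarded_scope_demo()
{
  fake_card c;
  {
    device_buffer<fake_card> buf(c, 64);
    if (c.live != 1 || buf.get() == nullptr)
      return -1;
  }
  return c.live;
}
```

The guard holds a reference to the card, so the card object has to outlive every buffer created from it.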

◆ mem_cpu2gpu()

void card::mem_cpu2gpu ( void *  dst,
const void *  src,
const size_t  len 
)
inline

Copy host (CPU) data to device (GPU) memory.

Parameters
[in,out] dst  The device memory you want to copy TO.
[in] src  The host memory you want to copy FROM.
[in] len  Number of bytes to copy.
Implementation Details
Wrapper around GPU memcpy, e.g. cudaMemcpy().
Exceptions
If the function fails (e.g., by improperly using device memory), this throws a 'runtime_error' exception.

◆ mem_free()

void card::mem_free ( void *  ptr)
inline

Free device memory.

Parameters
[in] ptr  The device memory you want to free.
Implementation Details
Wrapper around GPU free, e.g. cudaFree().
Exceptions
If the function fails (e.g., by being given non-device memory), this throws a 'runtime_error' exception.

◆ mem_gpu2cpu()

void card::mem_gpu2cpu ( void *  dst,
const void *  src,
const size_t  len 
)
inline

Copy device (GPU) data to host (CPU) memory.

Parameters
[in,out] dst  The host memory you want to copy TO.
[in] src  The device memory you want to copy FROM.
[in] len  Number of bytes to copy.
Implementation Details
Wrapper around GPU memcpy, e.g. cudaMemcpy().
Exceptions
If the function fails (e.g., by improperly using device memory), this throws a 'runtime_error' exception.
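
The two host/device copy methods are typically used as a pair: upload with mem_cpu2gpu(), do some work, download with mem_gpu2cpu(). The sketch below shows only the data movement, and it highlights that 'len' is a byte count, not an element count. It is an illustration, not fml code: round_trip() and fake_card are hypothetical names, and fake_card is a host-only stand-in (backed by malloc and memcpy) so the snippet compiles without CUDA.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of fml): copy a host vector to "device"
// memory and read it back, using only the documented method signatures.
template <class Card>
std::vector<float> round_trip(Card &c, const std::vector<float> &host)
{
  const std::size_t len = host.size() * sizeof(float);  // bytes, not elements
  std::vector<float> back(host.size());

  void *dev = c.mem_alloc(len);
  try {
    c.mem_cpu2gpu(dev, host.data(), len);  // host -> device
    c.mem_gpu2cpu(back.data(), dev, len);  // device -> host
  } catch (...) {
    c.mem_free(dev);  // release the allocation before re-throwing
    throw;
  }
  c.mem_free(dev);

  return back;
}

// Host-only stand-in (an assumption for illustration): "device" memory is
// ordinary heap memory and the copies are plain memcpy calls.
struct fake_card
{
  void *mem_alloc(const std::size_t len)
  {
    void *p = std::malloc(len);
    if (p == nullptr)
      throw std::runtime_error("allocation failed");
    return p;
  }

  void mem_free(void *ptr) { std::free(ptr); }

  void mem_cpu2gpu(void *dst, const void *src, const std::size_t len)
  {
    std::memcpy(dst, src, len);
  }

  void mem_gpu2cpu(void *dst, const void *src, const std::size_t len)
  {
    std::memcpy(dst, src, len);
  }
};
```

For a non-empty input vector, round_trip() returns a vector equal to the one passed in.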

◆ mem_gpu2gpu()

void card::mem_gpu2gpu ( void *  dst,
const void *  src,
const size_t  len 
)
inline

Copy device (GPU) data to other device (GPU) memory.

Parameters
[in,out] dst  The device memory you want to copy TO.
[in] src  The device memory you want to copy FROM.
[in] len  Number of bytes to copy.
Implementation Details
Wrapper around GPU memcpy, e.g. cudaMemcpy().
Exceptions
If the function fails (e.g., by improperly using device memory), this throws a 'runtime_error' exception.

◆ mem_set()

void card::mem_set ( void *  ptr,
const int  value,
const size_t  len 
)
inline

Set device memory.

Parameters
[in,out] ptr  On entry, an already-allocated block of device memory. On exit, its first 'len' bytes are set to 'value'.
[in] value  The value to set.
[in] len  Number of bytes of 'ptr' to set to 'value'.
Implementation Details
Wrapper around GPU memset, e.g. cudaMemset().
Exceptions
If the function fails (e.g., by being given non-device memory), this throws a 'runtime_error' exception.
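
A common pitfall with memset-style calls is that the value is written to every byte, not to every element. Assuming mem_set() inherits these semantics from the cudaMemset() call it is documented to wrap, filling an int array with 'value = 1' would not produce ones. The host-side analogue below uses std::memset to show the effect; fill_one_int() is a hypothetical name used only for this illustration.

```cpp
#include <cstdint>
#include <cstring>

// Host analogue of a byte-wise fill: every BYTE of the 4-byte integer
// receives the low 8 bits of `value`.
inline std::int32_t fill_one_int(const int value)
{
  std::int32_t x = 0;
  std::memset(&x, value, sizeof(x));
  return x;
}
```

Here fill_one_int(0) is 0, but fill_one_int(1) is 0x01010101 (16843009) rather than 1. In practice this means a byte-wise fill is mainly useful for zeroing memory.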

◆ set()

void card::set ( const int  id)
inline

Sets up the existing card object.

For use with the no-argument constructor. Frees any existing GPU data already allocated and stored in the object. Misuse of this could lead to some seemingly strange errors.

Parameters
[in] id  Ordinal number corresponding to the desired GPU device.
Exceptions
If the GPU cannot be initialized, or if the allocation of one of the handles fails, the method throws a 'runtime_error' exception.
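
Since set() throws 'runtime_error' when a device cannot be initialized, it can be used to probe for a usable GPU on a card created with the no-argument constructor. The sketch below is one possible pattern, not fml code: set_first_usable() and fake_card are hypothetical names, and fake_card is a host-only stand-in whose behavior (only ordinal 2 works, -1 means "not set up") is invented purely for illustration.

```cpp
#include <stdexcept>

// Hypothetical helper (not part of fml): try ordinals 0, 1, ..., n_ids-1 and
// keep the first one that set() accepts. Returns that ordinal, or -1 if
// none could be initialized.
template <class Card>
int set_first_usable(Card &c, const int n_ids)
{
  for (int id = 0; id < n_ids; id++)
  {
    try {
      c.set(id);
      return id;
    } catch (const std::runtime_error &) {
      // This ordinal could not be initialized; try the next one.
    }
  }

  return -1;
}

// Host-only stand-in (an assumption for illustration): pretends that only
// ordinal 2 can be initialized.
struct fake_card
{
  int _id = -1;  // -1 is this stand-in's marker for "not set up"

  void set(const int id)
  {
    if (id != 2)
      throw std::runtime_error("device could not be initialized");
    _id = id;
  }

  int get_id() const { return _id; }
};
```

With the stand-in, probing four ordinals selects ordinal 2, while probing only the first two leaves the object untouched and returns -1.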

◆ synch()

void card::synch ( )
inline

Synchronize the host with the device (GPU).

Blocks the calling host thread until the device completes all previously launched kernels.

Implementation Details
Wrapper around GPU synchronize, e.g. cudaDeviceSynchronize().
Exceptions
If a CUDA error is detected, this throws a 'runtime_error' exception.

The documentation for this class was generated from the following file: