Bolt is a C++ template library optimized for GPUs. Bolt provides high-performance library implementations for common algorithms such as scan, reduce, transform, and sort. The Bolt interface resembles the C++ Standard Template Library (STL) so that developers who are familiar with STL will recognize many of the Bolt APIs and customization techniques. In some cases, developers can benefit from GPU acceleration simply by changing the namespace for STL algorithm calls from "std" to "bolt::cl".
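The namespace switch described above can be sketched as follows. This is a minimal illustration, not taken from the Bolt sources: USE_BOLT is a hypothetical build flag invented for this example, and the header path bolt/cl/sort.h is an assumption about how a Bolt installation is laid out. Without the flag, the code builds against the standard library alone.

```cpp
#include <algorithm>
#include <vector>

// USE_BOLT is a hypothetical flag used only in this sketch.
#ifdef USE_BOLT
#include <bolt/cl/sort.h>    // assumed header location in a Bolt install
namespace algo = bolt::cl;   // route algorithm calls to Bolt
#else
namespace algo = std;        // route algorithm calls to the STL
#endif

// Returns the input sorted in ascending order, using whichever
// namespace "algo" currently refers to.
std::vector<int> sorted_copy(std::vector<int> values) {
    algo::sort(values.begin(), values.end());
    return values;
}
```

Calling sorted_copy on a vector holding 5, 2, 9, 1 yields 1, 2, 5, 9 on either path; only the namespace alias changes between the two builds.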
C++ templates can be used to customize the algorithms with new types (for example, the Bolt sort can operate on ints, floats, or any custom type defined by the user). Additionally, Bolt allows users to customize the template routines using function objects (functors) written in OpenCL, for example to provide a custom comparison operation for sort or a custom reduction operation.
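To make the functor idea concrete, here is a small sketch of a user-defined type, a comparison functor, and a reduction functor. It deliberately uses std::sort and std::accumulate so that it builds without Bolt; the names Particle, HeavierFirst, MaxOp, ids_by_mass, and max_value are hypothetical. With Bolt, the functor source must additionally be made visible to the OpenCL compiler, a step that is not shown here.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// A user-defined element type (hypothetical, for illustration).
struct Particle {
    int   id;
    float mass;
};

// Comparison functor: orders particles from heaviest to lightest.
struct HeavierFirst {
    bool operator()(const Particle& a, const Particle& b) const {
        return a.mass > b.mass;
    }
};

// Reduction functor: keeps the larger of two values.
struct MaxOp {
    float operator()(float a, float b) const { return a > b ? a : b; }
};

// Sorts with the custom comparison and returns the ids in sorted order.
std::vector<int> ids_by_mass(std::vector<Particle> particles) {
    std::sort(particles.begin(), particles.end(), HeavierFirst());
    std::vector<int> ids;
    for (const Particle& p : particles) ids.push_back(p.id);
    return ids;
}

// Reduces with the custom operation; returns 0 for an empty input.
float max_value(const std::vector<float>& values) {
    if (values.empty()) return 0.0f;
    return std::accumulate(values.begin(), values.end(), values.front(), MaxOp());
}
```

The same functor shapes (a struct with a const operator()) are what STL-style algorithms expect, which is why the algorithm call sites look the same whichever library runs them.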
The Bolt interfaces can work directly with host memory structures such as std::vector or host arrays (i.e., float*). On today's GPU systems, the host memory is automatically mapped or copied to the GPU. On future systems that support the Heterogeneous System Architecture, the GPU will access the host data structures directly. Bolt also provides a bolt::cl::device_vector container, which can be used to allocate and manage device-local memory for higher performance on discrete GPU systems. Bolt APIs accept either STL container iterators or device_vector iterators.
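The sketch below shows how one piece of code might run on either a host container or a device_vector. It rests on several assumptions that should be checked against the Bolt documentation: USE_BOLT is a hypothetical build flag, the header paths are guesses at the install layout, and the Bolt branch assumes that device_vector offers an iterator-range constructor and a size constructor and that bolt::cl provides a negate functor. Only the standard-library branch is exercised here.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// USE_BOLT is a hypothetical flag used only in this sketch.
#ifdef USE_BOLT
#include <bolt/cl/device_vector.h>   // assumed header location
#include <bolt/cl/functional.h>      // assumed header location
#include <bolt/cl/transform.h>       // assumed header location
namespace algo = bolt::cl;
template <typename T> using buffer = bolt::cl::device_vector<T>;  // device-local storage
#else
namespace algo = std;
template <typename T> using buffer = std::vector<T>;              // host storage
#endif

// Negates every element. The data is placed in "buffer" once, the
// algorithm runs on the buffer's iterators, and the result is copied
// back into a host vector.
std::vector<float> negate_all(const std::vector<float>& host) {
    buffer<float> in(host.begin(), host.end());  // assumed range constructor
    buffer<float> out(host.size());              // assumed size constructor
    algo::transform(in.begin(), in.end(), out.begin(), algo::negate<float>());
    return std::vector<float>(out.begin(), out.end());
}
```

Keeping the container type behind a single alias means the call to transform does not change when the storage moves from host memory to device-local memory.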
Further documentation can be found with either
Bolt is released under the liberal Apache License, Version 2.0.
Download CMake to generate build files for your platform, then start stepping through the example programs.