vSMC
vSMC: Scalable Monte Carlo
OpenCL Manager.
#include <vsmc/opencl/cl_manager.hpp>
Public Types | |
typedef ID | cl_id |
Public Member Functions | |
const ::cl::CommandQueue & | command_queue () const |
The command queue currently being used.
const ::cl::Context & | context () const |
The context currently being used.
template<typename CLType > | |
void | copy_buffer (const ::cl::Buffer &src, const ::cl::Buffer &dst, std::size_t num, std::size_t src_offset=0, std::size_t dst_offset=0, const std::vector< ::cl::Event > *events=nullptr,::cl::Event *event=nullptr, bool block=true) const |
Copy an OpenCL buffer into another of a given type and number of elements.
template<typename CLType > | |
::cl::Buffer | create_buffer (std::size_t num,::cl_mem_flags flag=CL_MEM_READ_WRITE, void *host_ptr=nullptr) const |
Create an OpenCL buffer of a given type and number of elements.
::cl::Program | create_program (const std::string &source) const |
Create a program given the source within the current context.
::cl::Program | create_program (const std::vector< std::string > &source) const |
Create a program given a vector of sources within the current context.
::cl::Program | create_program (const std::vector< std::string > &binary, const std::vector< ::cl::Device > *devices, std::vector< ::cl_int > *status=nullptr) const |
Create a program given binaries within the current context.
const ::cl::Device & | device () const |
The device currently being used.
const std::vector< ::cl::Device > & | device_vec () const |
The vector of all devices that are in the context of this manager.
int | opencl_c_version () const |
The minimum OpenCL C version supported by all devices in the context of this manager.
int | opencl_version () const |
The minimum OpenCL version supported by all devices in the context of this manager.
const ::cl::Platform & | platform () const |
The platform currently being used.
template<typename Func > | |
cxx11::enable_if< !cxx11::is_same< Func, std::size_t >::value &&!cxx11::is_convertible< Func, std::size_t >::value, std::size_t >::type | profile_kernel (::cl::Kernel &kern, std::size_t N, const Func &func, std::size_t lmin=0, std::size_t repeat=10) |
Run the kernel with all local sizes that are multiples of the preferred factor, and return the local size that is the fastest.
std::size_t | profile_kernel (::cl::Kernel &kern, std::size_t N, std::size_t lmin=0, std::size_t repeat=3) |
template<typename CLType , typename OutputIter > | |
void | read_buffer (const ::cl::Buffer &buf, std::size_t num, OutputIter first, std::size_t offset=0, const std::vector< ::cl::Event > *events=nullptr,::cl::Event *event=nullptr, bool block=true) const |
Read an OpenCL buffer of a given type and number of elements into an iterator.
template<typename CLType > | |
void | read_buffer (const ::cl::Buffer &buf, std::size_t num, CLType *first, std::size_t offset=0, const std::vector< ::cl::Event > *events=nullptr,::cl::Event *event=nullptr, bool block=true) const |
Read an OpenCL buffer of a given type and number of elements into a pointer.
void | run_kernel (const ::cl::Kernel &kern, std::size_t N, std::size_t local_size=0, const std::vector< ::cl::Event > *events=nullptr,::cl::Event *event=nullptr, bool block=true) const |
Run a given kernel with one-dimensional global size and local size on the current command queue.
bool | setup () const |
Whether the platform, context, device and command queue have been set up correctly.
bool | setup (::cl_device_type dev) |
Try to set up the platform, context, device and command queue using the given device type.
bool | setup (const ::cl::Platform &plat, const ::cl::Context &ctx, const ::cl::Device &dev, const ::cl::CommandQueue &cmd) |
Set the platform, context, device and command queue manually.
template<typename CLType , typename InputIter > | |
void | write_buffer (const ::cl::Buffer &buf, std::size_t num, InputIter first, std::size_t offset=0, const std::vector< ::cl::Event > *events=nullptr,::cl::Event *event=nullptr, bool block=true) const |
Write an OpenCL buffer of a given type and number of elements from an iterator.
template<typename CLType > | |
void | write_buffer (const ::cl::Buffer &buf, std::size_t num, const CLType *first, std::size_t offset=0, const std::vector< ::cl::Event > *events=nullptr,::cl::Event *event=nullptr, bool block=true) const |
Write an OpenCL buffer of a given type and number of elements from a pointer.
template<typename CLType > | |
void | write_buffer (const ::cl::Buffer &buf, std::size_t num, CLType *first, std::size_t offset=0, const std::vector< ::cl::Event > *events=nullptr,::cl::Event *event=nullptr, bool block=true) const |
Write an OpenCL buffer of a given type and number of elements from a pointer.
Static Public Member Functions | |
static CLManager< ID > & | instance () |
Get an instance of the manager singleton.
OpenCL Manager.
Each instantiation of CLManager is a singleton. Different ID template parameters create distinct singletons. Each singleton manages a specific OpenCL device. However, it is possible for different singletons to manage the same device.
Apart from ensuring that different IDs create distinct singletons, the ID template parameter can also provide additional information about which device CLManager chooses by default, through the singleton CLSetup with the same ID template argument.
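The relationship between ID and the singleton can be illustrated with a minimal sketch. It is not the vSMC implementation: Manager, DeviceA, DeviceB and device_name are hypothetical names, and the sketch only assumes that each distinct ID type gets its own function-local static object.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for CLManager: one object per distinct ID type.
template <typename ID>
class Manager
{
    public:
    static Manager<ID> &instance()
    {
        static Manager<ID> manager; // constructed on the first call only
        return manager;
    }

    // Hypothetical member, used only to show that state is per singleton.
    std::string device_name;

    private:
    Manager() {}
    Manager(const Manager<ID> &);                // not copyable
    Manager<ID> &operator=(const Manager<ID> &); // not assignable
};

// Hypothetical ID tags; any two distinct types would do.
struct DeviceA;
struct DeviceB;

// Usage sketch: the same ID always yields the same object, and different
// IDs do not share state.
inline void usage()
{
    Manager<DeviceA>::instance().device_name = "first device";
    Manager<DeviceB>::instance().device_name = "second device";
    assert(Manager<DeviceA>::instance().device_name == "first device");
    assert(&Manager<DeviceA>::instance() == &Manager<DeviceA>::instance());
}
```

Nothing prevents two such singletons from being configured with the same device, which matches the remark above that different singletons may manage the same device.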
It is important to configure the platform and device to be used through CLSetup before calling CLManager::instance for the first time. If the user does nothing, the default behavior is to use a device of type CL_DEVICE_TYPE_DEFAULT: the platform is set to the first one that contains such a device, and the device to the first one of that type. The user can change the platform name, device vendor name, device name, and device type through CLSetup. For names, only a partial match is required. For example,
If compiled on a recent MacBook Pro (late 2013 model), then the Iris Pro GPU from Intel will be used. Note that in this case, specifying only
or
is enough. However, if one specifies
then the setup will fail, since there is no device with the specified combination. Also note that a specification such as
may not be enough to lead to a successful setup. The default device type CL_DEVICE_TYPE_DEFAULT may not be GPU. To be safe, if one needs to use CLSetup, at least specify the device type. It can be set through values of type cl_device_type or a string with one of the values "GPU", "CPU", "Accelerator". Other string values are silently ignored and the default is used.
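The two selection rules described above, partial name matching and device type strings, can be sketched as follows. This is an illustration under stated assumptions, not the CLSetup code: partial_match, parse_device_type and the TYPE_ constants are hypothetical names, "partial match" is assumed to mean a case-sensitive substring test, and the constants are local stand-ins so that the sketch compiles without the OpenCL headers.

```cpp
#include <cassert>
#include <string>

// Local stand-ins for the cl_device_type bit flags (values as in CL/cl.h).
typedef unsigned long device_type;
const device_type TYPE_DEFAULT = 1UL << 0;     // CL_DEVICE_TYPE_DEFAULT
const device_type TYPE_CPU = 1UL << 1;         // CL_DEVICE_TYPE_CPU
const device_type TYPE_GPU = 1UL << 2;         // CL_DEVICE_TYPE_GPU
const device_type TYPE_ACCELERATOR = 1UL << 3; // CL_DEVICE_TYPE_ACCELERATOR

// Assumed meaning of "partial match": the requested text occurs somewhere
// in the full name. An empty request matches every name.
inline bool partial_match(const std::string &full, const std::string &request)
{
    return full.find(request) != std::string::npos;
}

// Map the three documented strings to a device type. Any other string is
// ignored and the default type is returned.
inline device_type parse_device_type(const std::string &name)
{
    if (name == "GPU")
        return TYPE_GPU;
    if (name == "CPU")
        return TYPE_CPU;
    if (name == "Accelerator")
        return TYPE_ACCELERATOR;
    return TYPE_DEFAULT;
}

// Usage sketch; "GeForce" and "Graphics" are made-up inputs.
inline void example()
{
    assert(partial_match("Iris Pro", "Iris"));
    assert(!partial_match("Iris Pro", "GeForce"));
    assert(parse_device_type("GPU") == TYPE_GPU);
    assert(parse_device_type("Graphics") == TYPE_DEFAULT);
}
```

Because an unrecognized string falls back to the default type without any error, a misspelled device type would go unnoticed until the setup is checked.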
Before using a CLManager, it is important to check that CLManager::setup returns true.
Definition at line 129 of file cl_manager.hpp.
typedef ID vsmc::CLManager< ID >::cl_id
Definition at line 133 of file cl_manager.hpp.
const ::cl::CommandQueue & vsmc::CLManager< ID >::command_queue () const | inline
The command queue currently being used.
Definition at line 168 of file cl_manager.hpp.
const ::cl::Context & vsmc::CLManager< ID >::context () const | inline
The context currently being used.
Definition at line 159 of file cl_manager.hpp.
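template<typename CLType > void vsmc::CLManager< ID >::copy_buffer (const ::cl::Buffer &src, const ::cl::Buffer &dst, std::size_t num, std::size_t src_offset=0, std::size_t dst_offset=0, const std::vector< ::cl::Event > *events=nullptr,::cl::Event *event=nullptr, bool block=true) const | inline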
Copy an OpenCL buffer into another of a given type and number of elements.
Definition at line 315 of file cl_manager.hpp.
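template<typename CLType > ::cl::Buffer vsmc::CLManager< ID >::create_buffer (std::size_t num,::cl_mem_flags flag=CL_MEM_READ_WRITE, void *host_ptr=nullptr) const | inline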
Create an OpenCL buffer of a given type and number of elements.
Definition at line 206 of file cl_manager.hpp.
::cl::Program vsmc::CLManager< ID >::create_program (const std::string &source) const | inline
Create a program given the source within the current context.
Definition at line 448 of file cl_manager.hpp.
::cl::Program vsmc::CLManager< ID >::create_program (const std::vector< std::string > &source) const | inline
Create a program given a vector of sources within the current context.
Definition at line 453 of file cl_manager.hpp.
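::cl::Program vsmc::CLManager< ID >::create_program (const std::vector< std::string > &binary, const std::vector< ::cl::Device > *devices, std::vector< ::cl_int > *status=nullptr) const | inline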
Create a program given binaries within the current context.
binary | A vector of binaries. The binary buffers are stored in std::string.
devices | The devices for which the program shall be created. If it is NULL, then the program will be created for all devices in the current context.
status | Returns the status of loading the binaries for each device. It is ignored if NULL.
Definition at line 471 of file cl_manager.hpp.
const ::cl::Device & vsmc::CLManager< ID >::device () const | inline
The device currently being used.
Definition at line 162 of file cl_manager.hpp.
const std::vector< ::cl::Device > & vsmc::CLManager< ID >::device_vec () const | inline
The vector of all devices that are in the context of this manager.
Definition at line 165 of file cl_manager.hpp.
static CLManager< ID > & vsmc::CLManager< ID >::instance () | inline static
Get an instance of the manager singleton.
Definition at line 136 of file cl_manager.hpp.
int vsmc::CLManager< ID >::opencl_c_version () const | inline
The minimum OpenCL C version supported by all devices in the context of this manager.
Definition at line 153 of file cl_manager.hpp.
int vsmc::CLManager< ID >::opencl_version () const | inline
The minimum OpenCL version supported by all devices in the context of this manager.
Definition at line 147 of file cl_manager.hpp.
const ::cl::Platform & vsmc::CLManager< ID >::platform () const | inline
The platform currently being used.
Definition at line 156 of file cl_manager.hpp.
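template<typename Func > cxx11::enable_if< !cxx11::is_same< Func, std::size_t >::value &&!cxx11::is_convertible< Func, std::size_t >::value, std::size_t >::type vsmc::CLManager< ID >::profile_kernel (::cl::Kernel &kern, std::size_t N, const Func &func, std::size_t lmin=0, std::size_t repeat=10) | inline
Run the kernel with all local sizes that are multiples of the preferred factor, and return the local size that is the fastest.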
kern | The kernel to be profiled.
N | The global size.
func | A functor that has the following signature, void func (::cl::Kernel &kern). It is applied to kern before it is run each time.
lmin | The minimum local size to be considered. This function will consider all local sizes that are multiples of this value. If lmin = 0 (the default), then the preferred multiplier queried from the device is used. If its value is bigger than the allowed maximum local size, then it is treated as if it is set to zero.
repeat | The number of repetitions of runs. The profiling is done by running the kernel once to warm it up, and then repeating the run this many times. The time of the latter is measured and compared for each considered local size.
Definition at line 389 of file cl_manager.hpp.
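The search described by the parameters above can be sketched without any OpenCL dependency. This is a hedged illustration, not the code at the cited line: pick_local_size and its parameters are hypothetical, and time_run stands for a caller-supplied callable that runs the kernel once with the given local size and returns the elapsed time in seconds.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// preferred_multiple and max_local_size stand for values that would be
// queried from the device; time_run(local_size) is a hypothetical callable
// that runs the kernel once and returns the elapsed time in seconds.
template <typename TimeRun>
std::size_t pick_local_size(std::size_t preferred_multiple,
    std::size_t max_local_size, std::size_t lmin, std::size_t repeat,
    TimeRun time_run)
{
    assert(preferred_multiple > 0 && preferred_multiple <= max_local_size);

    // lmin == 0, or an lmin above the maximum, falls back to the
    // preferred multiple.
    const std::size_t step =
        (lmin == 0 || lmin > max_local_size) ? preferred_multiple : lmin;

    std::size_t best = step;
    double best_time = std::numeric_limits<double>::max();
    for (std::size_t lsize = step; lsize <= max_local_size; lsize += step) {
        time_run(lsize); // warm-up run, not measured
        double total = 0;
        for (std::size_t r = 0; r != repeat; ++r)
            total += time_run(lsize); // measured runs
        if (total < best_time) { // ties keep the smaller local size
            best_time = total;
            best = lsize;
        }
    }

    return best;
}
```

Passing the timing step in as a callable keeps the selection logic separate from the measurement, so the loop can be exercised with a deterministic fake timer.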
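std::size_t vsmc::CLManager< ID >::profile_kernel (::cl::Kernel &kern, std::size_t N, std::size_t lmin=0, std::size_t repeat=3) | inline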
Definition at line 443 of file cl_manager.hpp.
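template<typename CLType , typename OutputIter > void vsmc::CLManager< ID >::read_buffer (const ::cl::Buffer &buf, std::size_t num, OutputIter first, std::size_t offset=0, const std::vector< ::cl::Event > *events=nullptr,::cl::Event *event=nullptr, bool block=true) const | inline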
Read an OpenCL buffer of a given type and number of elements into an iterator.
Definition at line 221 of file cl_manager.hpp.
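An output iterator need not point to contiguous memory, so one plausible way to read into it is to stage the data in a contiguous temporary first. The sketch below shows that idea with a host-side stand-in for the device buffer. It is an assumption, not a description of the vSMC source: FakeBuffer and read_elements are hypothetical names, and offset is assumed to be counted in elements of CLType rather than in bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a device buffer: raw bytes held on the host.
typedef std::vector<unsigned char> FakeBuffer;

// Read num elements of CLType, starting at element index offset, into an
// arbitrary output iterator.
template <typename CLType, typename OutputIter>
void read_elements(const FakeBuffer &buf, std::size_t num, OutputIter first,
    std::size_t offset = 0)
{
    assert((offset + num) * sizeof(CLType) <= buf.size());
    if (num == 0)
        return;

    // Contiguous landing area; a real implementation would fill it with an
    // enqueued read instead of memcpy.
    std::vector<CLType> staging(num);
    std::memcpy(&staging[0], &buf[0] + offset * sizeof(CLType),
        num * sizeof(CLType));

    // Hand the elements to the iterator one by one.
    std::copy(staging.begin(), staging.end(), first);
}
```

A pointer destination is already contiguous, which is one reason a separate pointer overload can skip the staging copy.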
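template<typename CLType > void vsmc::CLManager< ID >::read_buffer (const ::cl::Buffer &buf, std::size_t num, CLType *first, std::size_t offset=0, const std::vector< ::cl::Event > *events=nullptr,::cl::Event *event=nullptr, bool block=true) const | inline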
Read an OpenCL buffer of a given type and number of elements into a pointer.
Definition at line 240 of file cl_manager.hpp.
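void vsmc::CLManager< ID >::run_kernel (const ::cl::Kernel &kern, std::size_t N, std::size_t local_size=0, const std::vector< ::cl::Event > *events=nullptr,::cl::Event *event=nullptr, bool block=true) const | inline
Run a given kernel with one-dimensional global size and local size on the current command queue.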
OpenCL requires that global_size is a multiple of local_size. This function will round N up if it is not already a multiple of local_size. In the kernel it is important to check that get_global_id(0) is not out of range.
For example, say we have a kernel that should be applied to N elements, but the most efficient local size K does not divide N. Instead of calculating the correct global size yourself, you can simply call run_kernel(kern, N, K). But within the kernel, you need to check get_global_id(0) < N.
Definition at line 348 of file cl_manager.hpp.
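The rounding and the matching bounds check can be sketched as below. The helper round_global_size and the kernel named scale are hypothetical and only illustrate the arithmetic. Treating a local size of zero as "leave N unchanged" is an assumption about the default argument, not something this page states.

```cpp
#include <cassert>
#include <cstddef>

// Smallest multiple of local_size that is not less than N. A local_size of
// zero is assumed to mean that no rounding is needed.
inline std::size_t round_global_size(std::size_t N, std::size_t local_size)
{
    if (local_size == 0 || N % local_size == 0)
        return N;

    const std::size_t rounded = (N / local_size + 1) * local_size;
    assert(rounded >= N && rounded % local_size == 0);

    return rounded;
}

// OpenCL C source of a hypothetical kernel. The work items whose index
// falls in the padding between N and the rounded global size do nothing.
const char *const scale_kernel_source =
    "__kernel void scale(__global float *x, float a, ulong N)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    if (i < N)\n"
    "        x[i] *= a;\n"
    "}\n";
```

With N = 1000 and K = 64, for instance, the rounded global size is 1024, so the last 24 work items are the ones the check filters out.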
bool vsmc::CLManager< ID >::setup () const | inline
Whether the platform, context, device and command queue have been set up correctly.
Definition at line 172 of file cl_manager.hpp.
bool vsmc::CLManager< ID >::setup (::cl_device_type dev) | inline
Try to set up the platform, context, device and command queue using the given device type.
Definition at line 176 of file cl_manager.hpp.
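bool vsmc::CLManager< ID >::setup (const ::cl::Platform &plat, const ::cl::Context &ctx, const ::cl::Device &dev, const ::cl::CommandQueue &cmd) | inline
Set the platform, context, device and command queue manually.
After this member function is called, setup() will return true in future calls.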
Definition at line 189 of file cl_manager.hpp.
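template<typename CLType , typename InputIter > void vsmc::CLManager< ID >::write_buffer (const ::cl::Buffer &buf, std::size_t num, InputIter first, std::size_t offset=0, const std::vector< ::cl::Event > *events=nullptr,::cl::Event *event=nullptr, bool block=true) const | inline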
Write an OpenCL buffer of a given type and number of elements from an iterator.
Definition at line 257 of file cl_manager.hpp.
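template<typename CLType > void vsmc::CLManager< ID >::write_buffer (const ::cl::Buffer &buf, std::size_t num, const CLType *first, std::size_t offset=0, const std::vector< ::cl::Event > *events=nullptr,::cl::Event *event=nullptr, bool block=true) const | inline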
Write an OpenCL buffer of a given type and number of elements from a pointer.
Definition at line 281 of file cl_manager.hpp.
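template<typename CLType > void vsmc::CLManager< ID >::write_buffer (const ::cl::Buffer &buf, std::size_t num, CLType *first, std::size_t offset=0, const std::vector< ::cl::Event > *events=nullptr,::cl::Event *event=nullptr, bool block=true) const | inline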
Write an OpenCL buffer of a given type and number of elements from a pointer.
Definition at line 298 of file cl_manager.hpp.