vSMC
vSMC: Scalable Monte Carlo
Optimized <cstring> functions.
Classes
class vsmc::CStringNonTemporalThreshold
    The threshold of buffer size above which memcpy uses non-temporal instructions.
Functions
void * vsmc::memcpy (void *dst, const void *src, std::size_t n)
    SIMD optimized memcpy with non-temporal store for large buffers.
void * vsmc::memcpy_avx (void *dst, const void *src, std::size_t n)
    AVX optimized memcpy with non-temporal store for large buffers.
void * vsmc::memcpy_avx_nt (void *dst, const void *src, std::size_t n)
    AVX optimized memcpy with non-temporal store regardless of size.
void * vsmc::memcpy_nt (void *dst, const void *src, std::size_t n)
    SIMD optimized memcpy with non-temporal store regardless of size.
void * vsmc::memcpy_sse2 (void *dst, const void *src, std::size_t n)
    SSE2 optimized memcpy with non-temporal store for large buffers.
void * vsmc::memcpy_sse2_nt (void *dst, const void *src, std::size_t n)
    SSE2 optimized memcpy with non-temporal store regardless of size.
void * vsmc::memcpy_std (void *dst, const void *src, std::size_t n)
    Direct call to std::memcpy.
void * vsmc::memset (void *dst, int ch, std::size_t n)
    SIMD optimized memset with non-temporal store for large buffers.
void * vsmc::memset_avx (void *dst, int ch, std::size_t n)
    AVX optimized memset with non-temporal store for large buffers.
void * vsmc::memset_avx_nt (void *dst, int ch, std::size_t n)
    AVX optimized memset with non-temporal store regardless of size.
void * vsmc::memset_nt (void *dst, int ch, std::size_t n)
    SIMD optimized memset with non-temporal store regardless of size.
void * vsmc::memset_sse2 (void *dst, int ch, std::size_t n)
    SSE2 optimized memset with non-temporal store for large buffers.
void * vsmc::memset_sse2_nt (void *dst, int ch, std::size_t n)
    SSE2 optimized memset with non-temporal store regardless of size.
void * vsmc::memset_std (void *dst, int ch, std::size_t n)
    Direct call to std::memset.
This module implements memcpy, memset, etc., in the vsmc namespace. The implementations are optimized with SIMD instructions. Three groups of functions are provided:
- memcpy_std etc.: they simply call std::memcpy etc.
- memcpy_sse2 etc.: they are available if at least SSE2 is supported, and are optimized with SSE2 instructions.
- memcpy_avx etc.: they are available if at least AVX is supported, and are optimized with AVX instructions.
There are also generic functions, vsmc::memcpy etc. They dispatch each call according to the following rules:
- If AVX is supported, call memcpy_avx etc.
- Otherwise, if SSE2 is supported, call memcpy_sse2 etc.
- Otherwise, call memcpy_std etc.
This dispatch is done at compile time if the configuration macro VSMC_CSTRING_RUNTIME_DISPATCH is zero. If the macro is non-zero, it is done at runtime using CPUID information.
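The dispatch rules above can be illustrated with a minimal sketch. This is not vSMC's actual implementation: the names copy_std, copy_sse2, copy_avx, CopyImpl, compile_time_impl, runtime_impl and copy_dispatch are hypothetical, the three stand-in functions all forward to std::memcpy, and the runtime check assumes the GCC/Clang builtin __builtin_cpu_supports on x86 (other platforms fall back to the standard path).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for the three implementation groups. Each one simply
// forwards to std::memcpy so that the sketch stays portable.
inline void *copy_std(void *dst, const void *src, std::size_t n)
{
    return std::memcpy(dst, src, n);
}
inline void *copy_sse2(void *dst, const void *src, std::size_t n)
{
    return std::memcpy(dst, src, n);
}
inline void *copy_avx(void *dst, const void *src, std::size_t n)
{
    return std::memcpy(dst, src, n);
}

enum class CopyImpl { Std, SSE2, AVX };

// Compile-time choice, based on macros that GCC and Clang predefine when the
// corresponding instruction set is enabled for the translation unit.
constexpr CopyImpl compile_time_impl()
{
#if defined(__AVX__)
    return CopyImpl::AVX;
#elif defined(__SSE2__)
    return CopyImpl::SSE2;
#else
    return CopyImpl::Std;
#endif
}

// Runtime choice, based on CPUID information queried through a compiler
// builtin (assumption: GCC or Clang targeting x86).
inline CopyImpl runtime_impl()
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    if (__builtin_cpu_supports("avx"))
        return CopyImpl::AVX;
    if (__builtin_cpu_supports("sse2"))
        return CopyImpl::SSE2;
#endif
    return CopyImpl::Std;
}

// Prefer AVX, then SSE2, then the standard library.
inline void *copy_dispatch(
    CopyImpl impl, void *dst, const void *src, std::size_t n)
{
    assert(n == 0 || (dst != nullptr && src != nullptr));
    switch (impl) {
        case CopyImpl::AVX: return copy_avx(dst, src, n);
        case CopyImpl::SSE2: return copy_sse2(dst, src, n);
        default: return copy_std(dst, src, n);
    }
}
```

A caller would pass either compile_time_impl() or runtime_impl() as the first argument, which mirrors the role the configuration macro plays in selecting between the two modes.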
Before using any of these vSMC-provided functions, a few factors should be considered.
- The buffer-size threshold above which non-temporal stores are used can be set at compile time via VSMC_CSTRING_NON_TEMPORAL_THRESHOLD or at runtime via the CStringNonTemporalThreshold singleton.
In any case, the standard C library on most systems is likely to be optimized well enough for most uses. Even when there is a noticeable performance difference, it is unlikely to matter unless copying and moving memory is all the program does. Moreover, once caching is taken into account, differences seen in dedicated memory benchmarks may well not exist at all in real programs.
In summary, benchmark real programs before deciding whether using these functions is beneficial.
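As a starting point for such measurements, the following is a minimal timing sketch. The names CopyFn, copy_baseline and time_copy are hypothetical and not part of vSMC. On its own it is exactly the kind of dedicated memory benchmark cautioned against above, so the same timing pattern is better wrapped around a representative section of a real program.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Signature shared by std::memcpy and the copy functions listed on this page.
using CopyFn = void *(*)(void *, const void *, std::size_t);

// Baseline candidate: a plain wrapper around std::memcpy.
inline void *copy_baseline(void *dst, const void *src, std::size_t n)
{
    return std::memcpy(dst, src, n);
}

// Hypothetical helper: returns the fastest of `repeats` timed runs, in seconds.
inline double time_copy(CopyFn copy, std::size_t bytes, int repeats)
{
    assert(repeats > 0);
    std::vector<char> src(bytes, 'x');
    std::vector<char> dst(bytes, 0);
    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
        auto start = std::chrono::steady_clock::now();
        copy(dst.data(), src.data(), bytes);
        auto stop = std::chrono::steady_clock::now();
        double sec = std::chrono::duration<double>(stop - start).count();
        if (best < 0.0 || sec < best)
            best = sec;
    }
    // Read the destination so the copies cannot be discarded as dead code.
    volatile char sink = bytes ? dst[bytes - 1] : 0;
    (void) sink;
    return best;
}
```

Calling time_copy with copy_baseline and then with a candidate function of the same signature, over the buffer sizes the program actually uses, gives a first indication of whether a difference exists at all.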
inline void * vsmc::memcpy (void *dst, const void *src, std::size_t n)
SIMD optimized memcpy with non-temporal store for large buffers.
Definition at line 923 of file cstring.hpp.
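The idea of switching to non-temporal stores only for large buffers can be sketched as follows. This is an illustration under stated assumptions, not the code in cstring.hpp: NonTemporalThreshold, copy_nt and copy_auto are hypothetical names, the 1 MiB default is arbitrary, and the streaming path is compiled only when the compiler defines __SSE2__ (otherwise the sketch falls back to std::memcpy).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical singleton holding the size threshold in bytes.
class NonTemporalThreshold
{
    public:
    static NonTemporalThreshold &instance()
    {
        static NonTemporalThreshold threshold;
        return threshold;
    }
    std::size_t get() const { return bytes_; }
    void set(std::size_t bytes) { bytes_ = bytes; }

    private:
    // 1 MiB is an arbitrary default chosen for this sketch.
    NonTemporalThreshold() : bytes_(std::size_t(1) << 20) {}
    std::size_t bytes_;
};

// Copy using non-temporal (streaming) stores, which bypass the cache.
inline void *copy_nt(void *dst, const void *src, std::size_t n)
{
    assert(n == 0 || (dst != nullptr && src != nullptr));
#if defined(__SSE2__)
    char *d = static_cast<char *>(dst);
    const char *s = static_cast<const char *>(src);
    // Streaming stores need a 16-byte aligned destination, so copy the
    // leading unaligned bytes in the ordinary way first.
    std::size_t head = (16 - reinterpret_cast<std::uintptr_t>(d) % 16) % 16;
    if (head > n)
        head = n;
    std::memcpy(d, s, head);
    d += head;
    s += head;
    n -= head;
    for (; n >= 16; d += 16, s += 16, n -= 16) {
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i *>(s));
        _mm_stream_si128(reinterpret_cast<__m128i *>(d), v);
    }
    _mm_sfence(); // make the streaming stores visible before returning
    std::memcpy(d, s, n); // remaining tail, fewer than 16 bytes
    return dst;
#else
    return std::memcpy(dst, src, n);
#endif
}

// Use streaming stores only above the threshold.
inline void *copy_auto(void *dst, const void *src, std::size_t n)
{
    if (n > NonTemporalThreshold::instance().get())
        return copy_nt(dst, src, n);
    return std::memcpy(dst, src, n);
}
```

The threshold matters because streaming stores avoid evicting useful cache lines during a large copy, but they also leave the destination out of the cache, which tends to hurt when a small buffer is read again soon afterwards.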
inline void * vsmc::memcpy_avx (void *dst, const void *src, std::size_t n)
AVX optimized memcpy with non-temporal store for large buffers.
Definition at line 820 of file cstring.hpp.
inline void * vsmc::memcpy_avx_nt (void *dst, const void *src, std::size_t n)
AVX optimized memcpy with non-temporal store regardless of size.
Definition at line 834 of file cstring.hpp.
inline void * vsmc::memcpy_nt (void *dst, const void *src, std::size_t n)
SIMD optimized memcpy with non-temporal store regardless of size.
Definition at line 964 of file cstring.hpp.
inline void * vsmc::memcpy_sse2 (void *dst, const void *src, std::size_t n)
SSE2 optimized memcpy with non-temporal store for large buffers.
Definition at line 792 of file cstring.hpp.
inline void * vsmc::memcpy_sse2_nt (void *dst, const void *src, std::size_t n)
SSE2 optimized memcpy with non-temporal store regardless of size.
Definition at line 806 of file cstring.hpp.
inline void * vsmc::memcpy_std (void *dst, const void *src, std::size_t n)
Direct call to std::memcpy.
Definition at line 780 of file cstring.hpp.
inline void * vsmc::memset (void *dst, int ch, std::size_t n)
SIMD optimized memset with non-temporal store for large buffers.
Definition at line 906 of file cstring.hpp.
inline void * vsmc::memset_avx (void *dst, int ch, std::size_t n)
AVX optimized memset with non-temporal store for large buffers.
Definition at line 815 of file cstring.hpp.
inline void * vsmc::memset_avx_nt (void *dst, int ch, std::size_t n)
AVX optimized memset with non-temporal store regardless of size.
Definition at line 827 of file cstring.hpp.
inline void * vsmc::memset_nt (void *dst, int ch, std::size_t n)
SIMD optimized memset with non-temporal store regardless of size.
Definition at line 944 of file cstring.hpp.
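For the memset family, a fill that uses non-temporal stores regardless of size might look like the sketch below. The name fill_nt is hypothetical and the code is not taken from cstring.hpp. It assumes SSE2 intrinsics are available when __SSE2__ is defined and otherwise falls back to std::memset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Fill n bytes at dst with the byte value of ch using streaming stores.
inline void *fill_nt(void *dst, int ch, std::size_t n)
{
    assert(n == 0 || dst != nullptr);
#if defined(__SSE2__)
    unsigned char *d = static_cast<unsigned char *>(dst);
    // Fill the leading bytes normally until d is 16-byte aligned.
    std::size_t head = (16 - reinterpret_cast<std::uintptr_t>(d) % 16) % 16;
    if (head > n)
        head = n;
    std::memset(d, ch, head);
    d += head;
    n -= head;
    // Broadcast the byte to all 16 lanes, as std::memset converts ch to
    // unsigned char before storing it.
    const __m128i v =
        _mm_set1_epi8(static_cast<char>(static_cast<unsigned char>(ch)));
    for (; n >= 16; d += 16, n -= 16)
        _mm_stream_si128(reinterpret_cast<__m128i *>(d), v);
    _mm_sfence(); // make the streaming stores visible before returning
    std::memset(d, ch, n); // remaining tail, fewer than 16 bytes
    return dst;
#else
    return std::memset(dst, ch, n);
#endif
}
```

Like std::memset, the sketch returns the original destination pointer, so it can be substituted at a call site without other changes.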
inline void * vsmc::memset_sse2 (void *dst, int ch, std::size_t n)
SSE2 optimized memset with non-temporal store for large buffers.
Definition at line 787 of file cstring.hpp.
inline void * vsmc::memset_sse2_nt (void *dst, int ch, std::size_t n)
SSE2 optimized memset with non-temporal store regardless of size.
Definition at line 799 of file cstring.hpp.
inline void * vsmc::memset_std (void *dst, int ch, std::size_t n)
Direct call to std::memset.
Definition at line 775 of file cstring.hpp.