Releases: ddemidov/vexcl

1.4.3

09 Nov 12:01
29c2132
  • C++ OpenCL wrappers are now included via CL/opencl.hpp (recommended by Khronos) or CL/cl2.hpp (deprecated).
  • Minor fixes

1.4.2

27 Apr 05:33
0d8abbf
  • Two years' worth of minor fixes and improvements.
  • Added source_generator::num_groups() returning the number of
    workgroups on the compute device.
  • Made push_compile_options and push_program_header behave cumulatively.
  • Added profiler::reset().
  • Added vector::at().
  • Support mixed precision in vex::copy().

1.4.1

04 May 14:45

A bug fix release.

  • Improvements for cmake scripts.
  • Bug fixes.

1.4.0

19 Apr 18:33
  • Modernize cmake build system.
    Provide VexCL::OpenCL, VexCL::Compute, VexCL::CUDA, VexCL::JIT
    imported targets, so that users may just
    add_executable(myprogram myprogram.cpp)
    target_link_libraries(myprogram VexCL::OpenCL)
    
    to build a program using the corresponding VexCL backend.
    Also stop polluting global cmake namespace with things like
    add_definitions(), include_directories(), etc.
    See http://vexcl.readthedocs.io/en/latest/cmake.html.
  • Make vex::backend::kernel::config() return a reference to the kernel, so
    that it is possible to configure and launch the kernel in a single line:
    K.config(nblocks, nthreads)(queue, prm1, prm2, prm3);
  • Implement vector<T>::reinterpret<U>() method. It returns a new vector that
    reinterprets the same data (no copies are made) as the new type.
  • Implemented new backend: JIT. The backend generates and compiles at runtime
    C++ kernels with OpenMP support. The code will not be more efficient than
    hand-written OpenMP code, but it makes it easy to debug the generated code
    with a host-side debugger. The backend may also be used to develop and test
    new code when other backends are not available.
  • Allow constants defined with VEX_CONSTANT to be cast to their values in host
    code, so that a constant defined with VEX_CONSTANT(name, expr) can be used
    in host code as name. Constants are still usable in vector expressions as
    name().
  • Allow passing generated kernel args for each GPU (#202).
    Kernel args packed into std::vector will be unpacked and passed
    to the generated kernels on respective devices.
  • Reimplemented vex::SpMat as vex::sparse::ell, vex::sparse::crs,
    vex::sparse::matrix (automatically chooses one of the two formats based on
    the current compute device), and vex::sparse::distributed<format> (this one
    may span several compute devices). The new matrix-vector products are now
    normal vector expressions, while the old vex::SpMat could only be used in
    additive expressions. The old implementation is still available.
    vex::sparse::ell is now converted from host-side CRS format on the compute
    device, which makes the conversion faster.
  • Bug fixes and minor improvements.
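
As a rough illustration of the CRS-to-ELL conversion mentioned for vex::sparse::ell above, here is a minimal host-side sketch in standard C++. It does not use VexCL and is not its implementation: the Ell struct, the crs_to_ell function, the column-major layout, and the -1 padding marker are all assumptions made for this example, and the layout VexCL uses internally may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain ELL container (not a VexCL type).
// Every row is padded to the same width; entries are stored column-major,
// so slot k of row i lives at index k * rows + i.
struct Ell {
    std::size_t rows;
    std::size_t width;
    std::vector<int>    col; // -1 marks a padding slot (assumption)
    std::vector<double> val; // 0.0 in padding slots
};

// Hypothetical helper: converts a CRS matrix (ptr, col, val) to the layout above.
Ell crs_to_ell(const std::vector<std::size_t> &ptr,
               const std::vector<int>         &col,
               const std::vector<double>      &val)
{
    assert(!ptr.empty() && ptr.back() == col.size() && col.size() == val.size());

    const std::size_t n = ptr.size() - 1;

    // The ELL width is the largest number of nonzeros in any row.
    std::size_t width = 0;
    for (std::size_t i = 0; i < n; ++i)
        width = std::max(width, ptr[i + 1] - ptr[i]);

    Ell e{n, width,
          std::vector<int>(n * width, -1),
          std::vector<double>(n * width, 0.0)};

    // Copy each row's nonzeros into its leading slots; the rest stay padded.
    for (std::size_t i = 0; i < n; ++i) {
        std::size_t k = 0;
        for (std::size_t j = ptr[i]; j < ptr[i + 1]; ++j, ++k) {
            e.col[k * n + i] = col[j];
            e.val[k * n + i] = val[j];
        }
    }
    return e;
}
```

In this sketch each row is handled independently, which is the property that lets such a conversion run in parallel on a compute device.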

1.3.3

06 Apr 06:48
  • Added vex::tensordot() operation. Given two tensors A and B (arrays of dimension greater than or equal to one)
    and a list of axis pairs (where each pair names corresponding axes of the two tensors), it sums the products of A's and B's elements over the given axes. Inspired by Python's numpy.tensordot operation.
  • Expose constant memory space in OpenCL backend.
  • Provide shortcut filters vex::Filter::{CPU,GPU,Accelerator} for OpenCL backend.
  • Added Boost.Compute backend. Core functionality of the Boost.Compute library is used as a replacement for the Khronos C++ API, which seems to be getting more and more outdated. The Boost.Compute backend is still based on OpenCL, so there are two OpenCL backends now. Define VEXCL_BACKEND_COMPUTE to use this backend and make sure the Boost.Compute headers are in the include path.
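
To make the tensordot description above concrete, here is a minimal host-side sketch in standard C++ of the simplest case: two 2D tensors contracted over the single axis pair (1, 0), which is the ordinary matrix product. It does not call vex::tensordot(); the function name contract_axis1_axis0 and the row-major storage are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side helper (not part of VexCL): contracts axis 1 of the
// row-major n-by-k tensor A with axis 0 of the row-major k-by-m tensor B.
// The result is a row-major n-by-m tensor C with
//   C[i][j] = sum over s of A[i][s] * B[s][j].
std::vector<double> contract_axis1_axis0(
        const std::vector<double> &A, std::size_t n, std::size_t k,
        const std::vector<double> &B, std::size_t m)
{
    assert(A.size() == n * k && B.size() == k * m);

    std::vector<double> C(n * m, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < m; ++j)
            for (std::size_t s = 0; s < k; ++s) // summation over the paired axes
                C[i * m + j] += A[i * k + s] * B[s * m + j];
    return C;
}
```

With more axis pairs, the innermost summation would run over every paired axis at once, and the remaining free axes of A and B would index the result.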

1.3.2

04 Sep 06:22
  • Improved thread safety
  • Implemented any_of and all_of primitives
  • Minor bugfixes and improvements

1.3.1

14 May 17:53
  • Adopted scan_by_key algorithm from HSA-Libraries/Bolt.
  • Minor improvements and bug fixes.
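
For readers unfamiliar with the scan_by_key primitive named above, the following host-side sketch in standard C++ shows what an inclusive sum scan by key computes. It is a sequential illustration only, not the adopted Bolt algorithm or VexCL's interface: the function name inclusive_scan_by_key, the int element type, and the choice of an inclusive sum are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sequential reference (not VexCL or Bolt code).
// Runs of consecutive equal keys form segments; the running sum
// restarts at the first element of every segment.
std::vector<int> inclusive_scan_by_key(const std::vector<int> &keys,
                                       const std::vector<int> &vals)
{
    assert(keys.size() == vals.size());

    std::vector<int> out(vals.size());
    for (std::size_t i = 0; i < vals.size(); ++i) {
        const bool same_segment = i > 0 && keys[i] == keys[i - 1];
        out[i] = same_segment ? out[i - 1] + vals[i] : vals[i];
    }
    return out;
}
```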

1.3.0

14 Apr 11:55
  • API breaking change: vex::purge_kernel_caches() family of functions is
    renamed to vex::purge_caches() as the online cache now may hold objects of
    arbitrary type. The overloads that used to take
    vex::backend::kernel_cache_key now take const vex::backend::command_queue&.
  • The online cache is now purged whenever vex::Context is destroyed. This
    allows for clean release of OpenCL/CUDA contexts.
  • Code for random number generators has been unified between OpenCL and CUDA
    backends.
  • Fast Fourier Transform is now supported both for OpenCL and CUDA backends.
  • vex::backend::kernel constructor now takes an optional parameter with
    command line options.
  • Performance of CLOGS algorithms has been improved.
  • VEX_BUILTIN_FUNCTION macro has been made public.
  • Minor bug fixes and improvements.

1.2.0

02 Apr 10:19
  • API breaking change: the definition of VEX_FUNCTION family of macros has changed. The previous versions are available as VEX_FUNCTION_V1.
  • Wrapping code for the clogs library was added by @bmerry
    (the author of clogs).
  • vector/multivector iterators are now standard-conforming iterators.
  • Other minor improvements and bug fixes.

1.1.2

24 Dec 12:35
  • reduce_by_key() may take several tied keys (see e09d249).
  • It is possible to reduce OpenCL vector types (cl_float2, cl_double4, etc).
  • VEXCL_SHOW_KERNELS may be an environment variable as well as a preprocessor macro. This makes it possible to control kernel source output without recompiling the program.
  • Added compute capability filter for the CUDA backend (vex::Filter::CC(major, minor)).
  • Fixed compilation errors and warnings generated by Visual Studio.
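
The idea of reducing by several tied keys, mentioned for reduce_by_key() above, can be sketched on the host in standard C++ as follows. This is not VexCL's reduce_by_key() and does not reproduce its signature: the Reduced struct, the function name reduce_by_tied_keys, the use of exactly two int keys, and the sum reduction are assumptions, and the input is assumed to be ordered so that equal key pairs are adjacent.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical result type: one entry per run of equal (k1, k2) pairs.
struct Reduced {
    std::vector<std::pair<int, int>> keys;
    std::vector<double>              sums;
};

// Hypothetical sequential reference (not VexCL code): two key arrays are
// "tied" together, so a new segment starts whenever either key changes.
Reduced reduce_by_tied_keys(const std::vector<int>    &k1,
                            const std::vector<int>    &k2,
                            const std::vector<double> &vals)
{
    assert(k1.size() == k2.size() && k1.size() == vals.size());

    Reduced r;
    for (std::size_t i = 0; i < vals.size(); ++i) {
        const std::pair<int, int> key(k1[i], k2[i]);
        if (r.keys.empty() || r.keys.back() != key) {
            // First element of a new segment.
            r.keys.push_back(key);
            r.sums.push_back(vals[i]);
        } else {
            // Same key pair as the previous element: accumulate.
            r.sums.back() += vals[i];
        }
    }
    return r;
}
```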