Aviad/blake2s #576

Open

aviadingo wants to merge 15 commits into V2 from aviad/blake2s

aviadingo commented Aug 11, 2024

Describe the changes

Adds Blake2s CUDA support.

aviadingo added 14 commits

June 9, 2024 16:40


          init blake2s

dcf70e6


          added tests

75614f2


          fixed import

9c2d9ab


          added makefile

081218f


          Merge remote-tracking branch 'origin/main' into aviad/blake2s

1ff9942


          added namespace blake2s

256f8fa


          blake2s working for single hash using Hasher


          temp test tree code

076f82f


          added merkle tree test

4229b9d


          moved initialization to device

6ce4a6c


          added sequential test

d203c1e


          fixed deq test

050c70d


          batched test working

30301f2


          all tests pass

0678a2f

aviadingo requested review from ChickenLover, mickeyasa and LeonHibnik

August 11, 2024 10:12


          ran clang-format

340f962

ChickenLover reviewed

Contributor

ChickenLover left a comment

Good job with the PR. I left a bunch of style-related comments. In theory they can be ignored, since we are merging this into V2.

icicle/include/hash/blake2s/blake2s.cuh

+               */
+              #pragma once
+              typedef unsigned char BYTE;

Contributor

ChickenLover Sep 5, 2024

Please move these inside the blake2s namespace.
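A minimal sketch of what the namespace move could look like. Only the BYTE typedef appears in the diff above; the WORD definition (assumed here to be a 32-bit unsigned integer) and the load_le32 helper are hypothetical and exist only to show the scoped names in use.

```cpp
#include <cassert>
#include <cstdint>

namespace blake2s {
  // Scoped aliases: BYTE and WORD no longer leak into every file that includes the header.
  typedef unsigned char BYTE;
  typedef std::uint32_t WORD; // assumption: the PR's actual WORD definition is not shown

  // Hypothetical helper, for illustration only: read a little-endian 32-bit word.
  inline WORD load_le32(const BYTE* p)
  {
    assert(p != nullptr);
    return static_cast<WORD>(p[0]) | (static_cast<WORD>(p[1]) << 8) | (static_cast<WORD>(p[2]) << 16) |
           (static_cast<WORD>(p[3]) << 24);
  }
} // namespace blake2s
```

Callers outside the namespace then spell the types as blake2s::BYTE and blake2s::WORD, which avoids clashes with other headers that define their own BYTE.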

icicle/src/merkle-tree/merkle.cu

-                    THROW_ICICLE_ERR(
-                      IcicleError_t::InvalidArgument,
-                      "Hash max preimage length does not match merkle tree arity multiplied by digest elements");
+                  // if (compression.preimage_max_length < tree_config.arity * tree_config.digest_elements)

Contributor

ChickenLover Sep 5, 2024

You can just delete those at this point.

icicle/src/hash/blake2s/blake2s.cu

+                __device__ __forceinline__ void cuda_blake2s_init_state(cuda_blake2s_ctx_t* ctx)
+                {
+                  memcpy(ctx->state, ctx->chain, BLAKE2S_CHAIN_LENGTH);
+                  // ctx->state[8] = ctx->t0;

Contributor

ChickenLover Sep 6, 2024

Why are these commented out?

icicle/src/hash/blake2s/blake2s.cu

+                  return a;
+                }
+                __device__ uint32_t cuda_blake2s_ROTR32(uint32_t a, uint8_t b) { return (a >> b) | (a << (32 - b)); }

Contributor

ChickenLover Sep 6, 2024

Maybe worth adding __inline__ here.
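One way the inlining suggestion could look, as a sketch rather than the PR's final code. The BLAKE2S_INLINE macro, the rotr32 name, and the check_rotr32 function are hypothetical; the macro only exists so the same snippet compiles under nvcc and under a host-only compiler. The masking of the shift amount is an addition not present in the diff: it keeps a rotation by 0 well defined, which the unmasked expression (a << 32) is not.

```cpp
#include <cassert>
#include <cstdint>

// Expands to CUDA qualifiers under nvcc and to plain `inline` for a host-only compiler.
#ifdef __CUDACC__
#define BLAKE2S_INLINE __host__ __device__ __forceinline__
#else
#define BLAKE2S_INLINE inline
#endif

// Rotate a 32-bit word right by b bits.
BLAKE2S_INLINE std::uint32_t rotr32(std::uint32_t a, std::uint8_t b)
{
  b &= 31; // keep the shift amount in [0, 31]
  return (a >> b) | (a << ((32 - b) & 31));
}

// Host-side sanity checks for the rotation helper.
inline void check_rotr32()
{
  assert(rotr32(0x00000001u, 1) == 0x80000000u);
  assert(rotr32(0x12345678u, 8) == 0x78123456u);
  assert(rotr32(0x12345678u, 16) == 0x56781234u);
}
```

Blake2s itself only rotates by 16, 12, 8, and 7 bits, so the mask does not change behavior for the hash; it only guards against misuse.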

icicle/src/hash/blake2s/blake2s.cu

+                  cudaMalloc(&cuda_outdata, BLAKE2S_BLOCK_SIZE * n_batch);
+                  assert(keylen <= 32);
+                  // CUDA_BLAKE2S_CTX ctx;

Contributor

ChickenLover Sep 6, 2024

These commented-out lines should be removed.

icicle/src/hash/blake2s/blake2s.cu

+                  WORD block = (n_batch + thread - 1) / thread;
+                  kernel_blake2s_hash<<<block, thread>>>(cuda_indata, inlen, cuda_outdata, n_batch, BLAKE2S_BLOCK_SIZE);
+                  cudaMemcpy(out, cuda_outdata, BLAKE2S_BLOCK_SIZE * n_batch, cudaMemcpyDeviceToHost);
+                  cudaDeviceSynchronize();

Contributor

ChickenLover Sep 6, 2024

The current implementation does not support async execution, while all of our other primitives do. It may be worth adding HashConfig as an input and changing all the functions to their async alternatives.

icicle/src/hash/blake2s/blake2s.cu

+                  kernel_blake2s_hash<<<block, thread>>>(cuda_indata, inlen, cuda_outdata, n_batch, BLAKE2S_BLOCK_SIZE);
+                  cudaMemcpy(out, cuda_outdata, BLAKE2S_BLOCK_SIZE * n_batch, cudaMemcpyDeviceToHost);
+                  cudaDeviceSynchronize();
+                  cudaError_t error = cudaGetLastError();

Contributor

ChickenLover Sep 6, 2024

Please use our error-management functions (you can find an example in any of our primitives).
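The last two comments (async support and error handling) could be addressed together along the following lines. This is a sketch under stated assumptions, not the project's API: HashConfigSketch and its fields, the CHECK_CUDA macro, placeholder_kernel, and run_on_stream are all hypothetical stand-ins for the real HashConfig, the project's error-management helpers, and the Blake2s kernel. Only the CUDA runtime calls themselves are real (cudaMallocAsync and cudaFreeAsync need CUDA 11.2 or newer).

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the project's HashConfig; field names are assumptions.
struct HashConfigSketch {
  cudaStream_t stream; // stream on which all work is enqueued
  bool is_async;       // if true, return without waiting for the stream
};

// Hypothetical stand-in for the project's error-management helpers:
// return the first failing CUDA status to the caller.
#define CHECK_CUDA(call)                                                       \
  do {                                                                         \
    cudaError_t err_ = (call);                                                 \
    if (err_ != cudaSuccess) return err_;                                      \
  } while (0)

// Placeholder kernel: copies bytes instead of hashing, to keep the sketch short.
__global__ void placeholder_kernel(const uint8_t* in, uint8_t* out, size_t n)
{
  size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
  if (i < n) out[i] = in[i];
}

// Enqueue allocation, copies, kernel, and frees on cfg.stream.
// Note: buffers leak if a later call fails; a real version needs cleanup on error.
cudaError_t run_on_stream(const uint8_t* host_in, uint8_t* host_out, size_t n, const HashConfigSketch& cfg)
{
  if (n == 0) return cudaSuccess; // a zero-sized grid is an invalid launch

  uint8_t* d_in = nullptr;
  uint8_t* d_out = nullptr;
  CHECK_CUDA(cudaMallocAsync((void**)&d_in, n, cfg.stream));
  CHECK_CUDA(cudaMallocAsync((void**)&d_out, n, cfg.stream));
  CHECK_CUDA(cudaMemcpyAsync(d_in, host_in, n, cudaMemcpyHostToDevice, cfg.stream));

  const unsigned threads = 256;
  const unsigned blocks = (unsigned)((n + threads - 1) / threads);
  placeholder_kernel<<<blocks, threads, 0, cfg.stream>>>(d_in, d_out, n);
  CHECK_CUDA(cudaGetLastError()); // catches launch-configuration errors

  CHECK_CUDA(cudaMemcpyAsync(host_out, d_out, n, cudaMemcpyDeviceToHost, cfg.stream));
  CHECK_CUDA(cudaFreeAsync(d_in, cfg.stream));
  CHECK_CUDA(cudaFreeAsync(d_out, cfg.stream));

  // Block only when the caller did not ask for async behavior.
  if (!cfg.is_async) CHECK_CUDA(cudaStreamSynchronize(cfg.stream));
  return cudaSuccess;
}
```

When is_async is true, the caller has to keep host_in and host_out alive until it synchronizes the stream itself. Copies to and from pageable host memory may also still block the host, so pinned buffers are needed for the transfers to overlap with other work.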
