Skip to content

Commit

Permalink
[Feat] Vanilla zkEVM SHA-256 circuit (#160)
Browse files Browse the repository at this point in the history
* feat(zkevm-sha256): Initial commit from Brechtpd/zkevm-circuits@ef90cf0

Copied `sha256_bit` from Brecht's repo

* chore: rename crate zkevm-keccak to zkevm-hashes

* fix: add `input_len` back to `KeccakTable`

* chore: move keccak specific constants to `keccak_packed_multi/util`

* feat: SHA-256 circuit with 2-phase challenge passes MockProver

* feat(halo2-base): add `GateChip::pow_var` (#103)

* make ShaTable public

* make more sha stuff public

* Use halo2curves v0.4.0 and ff v0.13 (#107)

* wip: change import to ff v0.13

* feat: remove `GateInstructions::get_field_element`

halo2curves now has `bn256-table` which creates table of small field
elements at compile time, so we should just use `F::from` always. This
also improves readability.

* chore: fix syntax and imports after update

* chore: add asm feature

* chore: workspace.resolver = 2

* chore: update ethers-core

* chore: add jemallocator feature to zkevm-keccak crate

* test: add bigger test case to keccak prover

* feat: use `configure_with_params`

remove `thread_local!` usage

* chore: bump zkevm-keccak version to 0.1.1

* feat: add `GateThreadBuilder::from_stage` for convenience

* chore: fixes

* fix: removed `lookup_bits` from `GateThreadBuilder::config`

* fix: debug_assert_false should load witness for debugging

* chore: use unreachable to document that Circuit::configure is never used

* chore: fix comment

* feat(keccak): use configure_with_params

* chore: fix halo2-pse errors

* chore: doc comments and folder restructure

* chore(zkevm_hashes): Bump version to v0.2.0

* feat(wip): more folder restructuring

- Move `Sha256CircuitConfig` to `columns.rs`
- Move constants to `param.rs`
- Rename `witness_gen.rs` to `witness.rs`

* feat(sha256): removed RLC from circuit

* feat(sha256): add real prover test

* feat(sha256): more tests

* feat: add readme

* fix: compatibility with halo2-pse

* fix: remove unnecessary `is_final` in `length` update (#166)

* chore: remove use of `Box::leak` for string concat (#167)

* feat: move `q_enable` to `ShaTable` (#168)

* [fix] typo in comment: 32 bytes -> 32 bits (#185)

fix: typo in comment: 32 bytes -> 32 bits

* [feat] Add comment in readme about circuit input limit (#186)

feat: add note in readme about input size limit

* fix: more byte -> bit typos (#187)

* [chore] change gate annotation for better debugging (#188)

chore: change gate annotation for better debugging

* [feat] rename `d_64, h_64` to `d_68, h_68` (#189)

feat: rename `d_64, h_64` to `d_68, h_68`

* [feat] avoid double `meta.query` to same cells (#190)

* feat: avoid double `meta.query` to same cells

* chore: fix fmt (cargo fmt isn't working)

* [chore] use constant instead of hardcoded number (#191)

chore: use constant instead of hardcoded number

* nit: `Rotation(-0)` to `Rotation::cur()` (#192)

* feat: constrain all unused cells to be zero (#193)

For extra security, we constrain every unused cell in the circuit
explicitly to be zero. This also serves as a nice exposition of all the
unused cells in the circuit.

* [feat] reduce num columns (#194)

* feat: combine `word_value`, `output` (hi-lo) columns into one

Previously: Proving time for 14562 SHA256 blocks: 91.113416291s
test sha256::vanilla::tests::bit_sha256_prover::k_20 ... ok

Now: Proving time for 14562 SHA256 blocks: 88.943400583s
test sha256::vanilla::tests::bit_sha256_prover::k_20 ... ok

* feat: remove `is_enabled` from `ShaTable`

It seems extraneous since we have `is_final` and `q_squeeze` is fixed.

* chore: move `is_final` to `ShaTable` (#200)

since it is part of the overall input recovery data
  • Loading branch information
jonathanpwang authored Dec 4, 2023
1 parent d80d4f7 commit 5a9f4b1
Show file tree
Hide file tree
Showing 16 changed files with 1,538 additions and 8 deletions.
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@ jobs:
working-directory: "hashes/zkevm"
run: |
cargo test packed_multi_keccak_prover::k_14
cargo test bit_sha256_prover::k_10
cargo t test_vanilla_keccak_kat_vectors
lint:
Expand Down
5 changes: 3 additions & 2 deletions hashes/zkevm/Cargo.toml
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
[package]
name = "zkevm-hashes"
version = "0.1.4"
version = "0.2.0"
authors = ["Privacy Scaling Explorations Team", "Taiko Labs", "Intrinsic Technologies"]
license = "MIT"
license = "MIT OR Apache-2.0"
edition = "2021"
repository = "https://github.com/axiom-crypto/halo2-lib"
readme = "README.md"
Expand Down Expand Up @@ -34,6 +34,7 @@ rand_core = "0.6.4"
rand_xorshift = "0.3"
env_logger = "0.10"
test-case = "3.1.0"
sha2 = "0.10.7"

[features]
default = ["halo2-axiom", "display"]
Expand Down
4 changes: 4 additions & 0 deletions hashes/zkevm/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,3 +3,7 @@
## Keccak

See [readme](./src/keccak/README.md).

## SHA-256

See [readme](./src/sha256/README.md).
7 changes: 7 additions & 0 deletions hashes/zkevm/src/keccak/mod.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,10 @@
//! The zkEVM keccak circuit implementation, with some modifications.
//! Credit goes to https://github.com/privacy-scaling-explorations/zkevm-circuits/tree/main/zkevm-circuits/src/keccak_circuit
//!
//! This is a lookup table based implementation, where bytes are packed into big field elements as efficiently as possible.
//! The circuits can be configured to use different numbers of columns, by specifying the number of rows per internal
//! round of the keccak_f permutation.

/// Module for component circuits.
pub mod component;
/// Module for Keccak circuits in vanilla halo2.
Expand Down
5 changes: 1 addition & 4 deletions hashes/zkevm/src/lib.rs
Original file line number Diff line number Diff line change
@@ -1,9 +1,6 @@
//! The zkEVM keccak circuit implementation, with some minor modifications
//! Credit goes to <https://github.com/privacy-scaling-explorations/zkevm-circuits/tree/main/zkevm-circuits/src/keccak_circuit>

use halo2_base::halo2_proofs;

/// Keccak packed multi
pub mod keccak;
pub mod sha256;
/// Util
pub mod util;
72 changes: 72 additions & 0 deletions hashes/zkevm/src/sha256/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
# ZKEVM SHA-256

## Vanilla

SHA-256 circuit in vanilla halo2. This implementation is largely based on [Brechtpd](https://github.com/Brechtpd)'s [PR](https://github.com/privacy-scaling-explorations/zkevm-circuits/pull/756) to the PSE `zkevm-circuits`. His implementation of SHA-256 is in turn based on his implementation of Keccak using the "Bits" approach: one can read more about it [here](https://hackmd.io/NaTuIvmaQCybaOYgd-DG1Q?view#Bit-implementation).

The major differences is that this version directly represent raw inputs and SHA-256 digests as witnesses, while the original version only has RLCs (random linear combination) of raw inputs and outputs. Because this version doesn't need RLCs, it doesn't have the 2nd phase or use challenge APIs.

### Logical Input/Output

Logically the circuit takes a variable length array of variable length bytes as inputs and SHA-256 digests of these bytes as outputs.
While these logical inputs are variable, what is fixed in a given circuit is max number of _total number of SHA-256 input blocks_ that can be processed (see below). We refer to this as the capacity of the circuit.

`sha256::vanilla::witness::generate_witnesses_multi_sha256` generates the witnesses of the ciruit for a given input.

### Background Knowledge

- Given a variable length byte array, one first pads as follows (taken from [Wikipedia](https://en.wikipedia.org/wiki/SHA-2#Pseudocode)):

```
begin with the original message of length L bits
append a single '1' bit
append K '0' bits, where K is the minimum number >= 0 such that (L + 1 + K + 64) is a multiple of 512
append L as a 64-bit big-endian integer, making the total post-processed length a multiple of 512 bits
such that the bits in the message are: <original message of length L> 1 <K zeros> <L as 64 bit integer> , (the number of bits will be a multiple of 512)
```

- The SHA-256 algorithm processes padded input data in _blocks_ of 512 bits or 64 bytes.
- The hashing process comprises a series of `NUM_ROUNDS` (64) rounds.
- The algorithm can be organized so that the 64 bytes are divided into `NUM_WORDS_TO_ABSORB` (16) _words_ of 32 bits each, and one new word is ingested in each of the first `NUM_WORDS_TO_ABSORB` rounds.

### Circuit Overview

- The circuit operates on one 512 bit input block at a time.
- For each block, `SHA256_NUM_ROWS` (72) are used. This consists of `NUM_START_ROWS` (4) + `NUM_ROUNDS` (64) + `NUM_END_ROWS` (4) rows.
- As described above, the input is "absorbed" in 32 bit words, one in each row of rows `NUM_START_ROWS..NUM_START_ROWS + NUM_WORDS_TO_ABSORB`. These are the rows in which a selector `q_input` is turned on.
- We store inputs and outputs for external use in columns inside the `ShaTable` struct. These are:
- `is_enabled`: a boolean indicating if it is the last row of the block and also this is the last input block of a full input (i.e., this is the block with the finalized digest).
- `length`: the running length in bytes of input data "absorbed" so far, including the current block, excluding padding. This is only constrained when `q_input` is true. One recovers the length of the unpadded input by reading this value on the last "absorb" row in a block with `is_enabled` true.
- `word_value`: 32 bits of the input, as described above. We use the following slightly funny conversion: we consider the 4 byte chunk of the input, replace the padding with 0s, and then convert to a 32-bit integer by considering the 4 bytes _in little endian_. This choice was chosen for consistency with the Keccak circuit, but is arbitrary.
- Only constrained when `q_input` is true.
- `output` (2): the hash digest the SHA-256 algorithm on the input bytes (32 bytes). We represent this as two field elements in hi-lo form - we split 32 bytes into two 16 byte chunks, and convert them to `u128` as _big endian_.
- Only constrained when the last row of a block. Should only be considered meaningful when `is_enabled` is true.
- We convenient store the relevant cells for the above data, per input block, in the struct `AssignedSha256Block`.
- This circuit has a hard constraint that the input array has length up to `2^32 - 1` bits, whereas the official SHA-256 spec supports up to `2^64 - 1`. (In practice it is likely impossible to create a circuit that can handle `2^32 - 1` bit inputs.)
- Details are provided in inline comments.

### Example

To illustrate, let's consider `inputs = [[], [0x00, 0x01, ..., 0x37]]`. The corresponding table will look like (input idx is not a real column, provided for viewing convenience):

| row | input idx | word_value | length | is_enabled | hash_lo | hash_hi |
| --- | --------- | ------------ | ------ | ---------- | ------- | ------- |
| 0 | 0 | - | ... | false | |
| ... | 0 | ... | ... | ... | |
| 4 | 0 | `0` | 0 | false | |
| ... | 0 | `0` | 0 | false | |
| 71 | 0 | - | 0 | true | RESULT | RESULT |
| 72 | 1 | - | ... | ... | |
| ... | 1 | ... | ... | false | |
| 76 | 1 | `0x03020100` | 4 | false | |
| ... | 1 | ... | ... | false | |
| 91 | 1 | `0x0` | 56 | false | |
| 143 | 1 | - | - | false | | |
| 144 | 1 | - | ... | ... | |
| ... | 1 | ... | ... | false | |
| 148 | 1 | `0x0` | 56 | false | |
| ... | 1 | ... | ... | false | |
| 163 | 1 | `0x0` | 56 | false | |
| 215 | 1 | - | - | true | RESULT | RESULT |

Here the second input has length 56 (in bytes) and requires two blocks due to padding: `56 * 8 + 1 + 64 > 512`.
10 changes: 10 additions & 0 deletions hashes/zkevm/src/sha256/mod.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
//! Brecht's SHA-256 circuit implementation, which he modified from the Keccak bit implementation.
//! Note this circuit does **not** use lookup tables, only custom gates.
//! The number of columns are fixed (~130). Unlike keccak, it is not configurable.
//!
//! More details here: https://github.com/privacy-scaling-explorations/zkevm-circuits/pull/756
//!
//! Note: this circuit only supports SHA256 of a bit array of length up to 2^32 - 1, unlike the spec which supports up
//! to 2^64 - 1.

pub mod vanilla;
73 changes: 73 additions & 0 deletions hashes/zkevm/src/sha256/vanilla/columns.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
//! The columns of the Sha256 circuit
use std::marker::PhantomData;

use halo2_base::halo2_proofs::plonk::{Advice, Column, ConstraintSystem, Fixed};

use crate::util::eth_types::Field;

use super::param::*;

/// ShaTable, copied from KeccakTable. However note that `NUM_BYTES_PER_WORD` is different for SHA256
#[derive(Clone, Debug)]
pub struct ShaTable {
/// Selector always turned on except in blinding rows.
pub(super) q_enable: Column<Fixed>,
/// Single shared column containing different IO data depending on the `offset` within
/// a SHA256 input block ([SHA256_NUM_ROWS] = 72 rows): If offset is in
/// Encoded input:
/// - [NUM_START_ROWS]..[NUM_START_ROWS] + [NUM_WORDS_TO_ABSORB]: Raw SHA256 word([NUM_BYTES_PER_WORD] bytes) of inputs
/// SHA256 hash of input in hi-lo format:
/// - [SHA256_NUM_ROWS] - 2: output.hi()
/// - [SHA256_NUM_ROWS] - 1: output.lo()
pub io: Column<Advice>,
/// Length in bytes of the input processed so far. Does not include padding.
pub length: Column<Advice>,
/// Advice to represent if this input block is the last one for a variable length input.
/// The advice value should only be used in the last row of each [SHA256_NUM_ROWS] block.
pub(super) is_final: Column<Advice>,
}

impl ShaTable {
/// Construct a new ShaTable
pub fn construct<F: Field>(meta: &mut ConstraintSystem<F>) -> Self {
let q_enable = meta.fixed_column();
let io = meta.advice_column();
let length = meta.advice_column();
let hash_lo = meta.advice_column();
let hash_hi = meta.advice_column();
meta.enable_equality(io);
meta.enable_equality(length);
meta.enable_equality(hash_lo);
meta.enable_equality(hash_hi);
let is_final = meta.advice_column();
Self { q_enable, io, length, is_final }
}
}

/// Columns for the Sha256 circuit
#[derive(Clone, Debug)]
pub struct Sha256CircuitConfig<F> {
pub(super) q_first: Column<Fixed>,
pub(super) q_extend: Column<Fixed>,
pub(super) q_start: Column<Fixed>,
pub(super) q_compression: Column<Fixed>,
pub(super) q_end: Column<Fixed>,
// Bool. True on rows NUM_START_ROWS..NUM_START_ROWS + NUM_WORDS_TO_ABSORB per input block.
// These are the rounds when input might be absorbed.
// It "might" contain inputs because it's possible that a round only have paddings.
pub(super) q_input: Column<Fixed>,
// Bool. True on row NUM_START_ROWS + NUM_WORDS_TO_ABSORB - 1 for each input block.
// This is the last round when input is absorbed.
pub(super) q_input_last: Column<Fixed>,
pub(super) q_squeeze: Column<Fixed>,
pub(super) word_w: [Column<Advice>; NUM_BITS_PER_WORD_W],
pub(super) word_a: [Column<Advice>; NUM_BITS_PER_WORD_EXT],
pub(super) word_e: [Column<Advice>; NUM_BITS_PER_WORD_EXT],
pub(super) is_paddings: [Column<Advice>; ABSORB_WIDTH_PER_ROW_BYTES],
pub(super) round_cst: Column<Fixed>,
pub(super) h_a: Column<Fixed>,
pub(super) h_e: Column<Fixed>,
/// The columns for other circuits to lookup hash results
pub hash_table: ShaTable,
pub(super) _marker: PhantomData<F>,
}
Loading

0 comments on commit 5a9f4b1

Please sign in to comment.