feature request: batch opening of 4 ML polys into one in spartan-parallel #63

kunxian-xia opened this issue Dec 17, 2024 · 6 comments

@kunxian-xia

Goal

The goal is to reduce the size of the proof generated by spartan-parallel.

Background

Currently we need to open four different sets of multilinear polynomials at three different points in spartan-parallel (details). This requires us to generate 4 polynomial opening proofs. However, a multilinear polynomial opening proof has size $O(n)$ (where $n$ is the number of variables), so the opening proof of an ML poly with 16 variables is not significantly shorter than that of one with 25 variables.

  1. This implies that opening them individually incurs a large overhead.
  2. For comparison, FRI-based proof systems such as plonky3 batch the openings of hundreds of polynomials into one.

See more discussion on Slack.

Proposal (TODO) - Batching

@yczhangsjtu commented Dec 18, 2024

To batch-open multiple multilinear polynomials with different numbers of variables at different points, we run a sum-check instance that reduces the openings to a single point. Formally, suppose we are given

  1. commitments to $m$ multilinear polynomials $f_i(X_0, \cdots, X_{n_i-1})$ where $i$ ranges from $0$ to $m-1$;
  2. $m$ points $\vec{z_i}$ where $i$ ranges from $0$ to $m-1$ and the length of the $i$-th point $\vec{z_i}$ is $n_i$.

We want to prove that $f_i(\vec{z_i}) = \sum_{\vec{b}} \textrm{eq}(\vec{z_i}, \vec{b}) \cdot f_i(\vec{b}) = y_i$ for every $i$ from $0$ to $m-1$. Instead of producing $m$ opening proofs, we want to batch them together into one opening proof.

To achieve this, we run a sum-check for the claim $\sum_{i=0}^{m-1}\beta_i y_i = \sum_{\vec{b} \in B_n} g(\vec{b})$, where $n=\max_i n_i$, $B_n = \{0,1\}^n$, the $\beta_i$ are random challenge scalars sampled by the verifier, and

$$g(X_0,\cdots,X_{n-1})=\sum_{i=0}^{m-1} \beta_i \cdot \textrm{eq}(X_0,\cdots,X_{n_i-1}; \vec{z_i})\cdot f_i(X_0,\cdots,X_{n_i-1})$$

(Strictly speaking, when the $i$-th term of $g$ is summed over the $n-n_i$ variables it does not depend on, it picks up an extra factor of $2^{n-n_i}$; this factor is public and can be absorbed into $\beta_i$ or into the claimed value.) At the end of the sum-check, we need to open each $f_i$ at $r$, where $r$ is the vector of random challenges generated during the sum-check. Note that different $f_i$ are evaluated at different prefixes of $r$.
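To make the claim concrete, here is a minimal sanity-check sketch (the toy modulus, sizes, values, and all names are mine, not the spartan-parallel/Ceno API). It evaluates $g$ over $\{0,1\}^n$ for two small polynomials and checks that $\sum_{\vec b\in B_n} g(\vec b) = \sum_i 2^{\,n-n_i}\beta_i y_i$:

```rust
// Toy sanity check of the batched sum-check claim above.
const P: u64 = 0xffff_ffff_0000_0001; // Goldilocks prime, used only as a toy modulus

fn add(a: u64, b: u64) -> u64 {
    ((a as u128 + b as u128) % P as u128) as u64
}
fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

/// eq(x, z) = prod_j (x_j z_j + (1 - x_j)(1 - z_j)) for a Boolean x and arbitrary z.
fn eq_boolean(x: &[u64], z: &[u64]) -> u64 {
    x.iter().zip(z).fold(1, |acc, (&xj, &zj)| {
        mul(acc, if xj == 1 { zj } else { (1 + P - zj) % P })
    })
}

/// Decompose `idx` into its low `len` bits, bit j giving the value of X_j.
fn bits(idx: usize, len: usize) -> Vec<u64> {
    (0..len).map(|j| ((idx >> j) & 1) as u64).collect()
}

fn main() {
    // Two multilinear polynomials, given by their evaluations on {0,1}^{n_i}.
    let f: [Vec<u64>; 2] = [vec![3, 1, 4, 1], vec![5, 9]]; // n_0 = 2, n_1 = 1
    let num_vars = [2usize, 1usize];
    let z: [Vec<u64>; 2] = [vec![7, 11], vec![13]]; // opening points z_i
    let beta = [17u64, 19u64]; // verifier's random batching scalars
    let n = 2; // n = max_i n_i

    // y_i = f_i(z_i) = sum_b eq(z_i, b) f_i(b)
    let y: Vec<u64> = (0..2usize)
        .map(|i| {
            (0..1usize << num_vars[i]).fold(0, |acc, idx| {
                add(acc, mul(eq_boolean(&bits(idx, num_vars[i]), &z[i]), f[i][idx]))
            })
        })
        .collect();

    // Left side: sum_i 2^(n - n_i) * beta_i * y_i (the 2^(n - n_i) factor comes
    // from summing the i-th term of g over its unused variables).
    let lhs = (0..2usize).fold(0, |acc, i| {
        add(acc, mul(1u64 << (n - num_vars[i]), mul(beta[i], y[i])))
    });

    // Right side: sum of g over the full hypercube {0,1}^n.
    let rhs = (0..1usize << n).fold(0, |acc, idx| {
        let g_b = (0..2usize).fold(0, |g, i| {
            let x = bits(idx, num_vars[i]);
            add(g, mul(beta[i], mul(eq_boolean(&x, &z[i]), f[i][idx & ((1 << num_vars[i]) - 1)])))
        });
        add(acc, g_b)
    });

    assert_eq!(lhs, rhs);
    println!("batched claim holds: {lhs} == {rhs}");
}
```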

The above is the big idea. Now we need to solve two problems:

  1. A sum-check that involves polynomials with different sizes.
  2. Do existing polynomial commitment schemes (particularly WHIR) support batch opening of polynomials at different prefixes of the same point?

We already have an implementation of sum-check for (1) in the Ceno repo. So the only problem left is (2).

@yczhangsjtu

Regarding problem (2), there are two potential approaches:

  1. Exploit the recursive nature of WHIR. In each iteration of WHIR the codeword gets smaller, so we start from the largest polynomials and begin iterating. As the codeword size decreases, the other polynomials gradually join in (via random linear combination); by the final round, all the polynomials have joined (see the toy schedule sketched after this list).
  2. Merge all the polynomials together at the very first round. Then we can continue the iteration with a single codeword.
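For a concrete feel of how the smaller polynomials join in under approach (1), here is a runnable toy schedule. It only tracks sizes (in $\log_2$); the fold factor $k=2$, the rate $1/2$, and the variable counts are illustrative assumptions, not values taken from spartan-parallel or WHIR.

```rust
// Toy schedule for approach (1): track only codeword sizes (in log2) to see at
// which WHIR round each smaller polynomial joins the running instance.
fn main() {
    let k = 2usize;            // fold 2 variables per WHIR round (assumption)
    let log_inv_rate = 1usize; // rate 1/2: |codeword| = 2^(num_vars + 1) (assumption)

    let mut sizes: Vec<usize> = vec![25, 20, 16, 16]; // illustrative variable counts
    sizes.sort_unstable_by(|a, b| b.cmp(a));

    // The running instance starts from the largest polynomial.
    let mut running_vars = sizes[0];
    let mut log_codeword = running_vars + log_inv_rate;
    let mut pending = sizes[1..].to_vec();

    let mut round = 0;
    while running_vars >= k {
        // One WHIR round: fold k variables, the evaluation domain halves.
        running_vars -= k;
        log_codeword -= 1;
        round += 1;

        // Merge (by random linear combination) every pending polynomial whose
        // freshly committed codeword now has the same size as the running one.
        pending.retain(|&n| {
            if n + log_inv_rate == log_codeword {
                println!("round {round}: merge the {n}-variable polynomial (codeword size 2^{log_codeword})");
                false
            } else {
                true
            }
        });
    }
    println!("done after {round} rounds; {} polynomial(s) never matched", pending.len());
}
```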

@yczhangsjtu commented Dec 20, 2024

The challenge in approach (1): in WHIR, each iteration shrinks the codeword size by a factor of 2, but the polynomial size by a factor of $2^k$ (i.e., it removes $k$ variables). So do we merge two codewords with the same size but with different polynomial sizes, or with the same polynomial sizes but different codeword sizes?

The workable solution is to keep the codeword size the same, because when we add the codewords, the underlying polynomials are also added correctly. The problem is that whenever a new polynomial is merged in, the code rate is pushed back to the original rate, and the number of queries goes back to the original as well. So the proof size would be larger than the originally estimated proof size of WHIR.

Anyway, this challenge is solvable; the only issue is efficiency.
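To put rough numbers on this (illustrative only, assuming the query count scales like $\lambda/\log_2(1/\rho)$ for rate $\rho$ and security level $\lambda$): with $\lambda = 100$ and initial rate $\rho = 1/2$ we need about $100$ queries; after one round with $k=2$ the rate improves to $1/4$, so about $50$ queries suffice; but merging in a fresh polynomial committed at rate $1/2$ pushes the rate, and hence the query count, back to the starting values.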

@yczhangsjtu

The challenge in approach (2): suppose we have two polynomials $f(X)$ and $g(X)$ with $20$ and $10$ variables respectively. Then the first round of WHIR should produce a new polynomial commitment $f'(X)$ with $18$ variables (assuming $k=2$). The verifier wants to check the consistency between $f'$ and $f, g$ at a random point. However, because the commitment to $g(X)$ is a codeword $g(L)$ over a much smaller domain $L$ than the domain on which $f'$ is evaluated, it is not clear whether we can check the consistency by opening only a few points of $g(L)$.
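To make the size gap concrete (with the same $k=2$ and an illustrative rate of $1/2$): $f$ with $20$ variables is committed on a domain of size $2^{21}$ and the folded $f'$ with $18$ variables lives on a domain of size $2^{20}$, while $g$ with $10$ variables is committed on a domain of only $2^{11}$ points, so the verifier's consistency queries into $f'$ have no obvious point-by-point counterpart in $g(L)$.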

Approach (2) is not workable until this challenge is resolved. However, if this approach does work, the protocol is conceptually simpler than approach (1): since it only affects the first round and the rest stays the same as WHIR, the change to the current code would also be minimal.

@yczhangsjtu

Summary of approach (1)

Modify the non-batched WHIR

First, we need to modify the original WHIR to run two instances in parallel. Originally, a WHIR instance proves that a given codeword commits to some polynomial $f$ whose inner product with a weight function $w$ is a target $\sigma$. We split this into two claims: the inner product of $f$ with $w_0$ is $\sigma_0$, and the inner product of $f$ with $w_1$ is $\sigma_1$.

At the beginning, the second instance is dummy, i.e., $w_1=0$ and $\sigma_1=0$. After each round, we let the new $w_0'$ be the partially evaluated version of $w_0$, i.e., $w_0'=w_0(\alpha,\cdot)$, and add the additional constraints introduced by that round to the second weight function $w_1$.
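Spelling out the notation (this is just my reading of the above), with $\langle f, w\rangle := \sum_{\vec b\in\{0,1\}^n} w(\vec b)\, f(\vec b)$:

$$\langle f, w\rangle = \sigma \quad\text{is split into}\quad \langle f, w_0\rangle = \sigma_0 \ \text{ and }\ \langle f, w_1\rangle = \sigma_1,$$

and after a round with folding challenge $\alpha$ the first claim becomes $\langle f(\alpha,\cdot),\, w_0(\alpha,\cdot)\rangle = \sigma_0'$, while the constraints introduced by that round are accumulated into $(w_1, \sigma_1)$.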

Batched version of WHIR

We are now ready to describe the batched version of WHIR, for batching polynomials with different sizes. The original weight function is $w_0=eq(z,X)$, and $w_1=0$.

  1. Pick the largest polynomials in the set to batch.
  2. Execute the (simply batched) WHIR round protocol until the running codeword size equals the largest codeword among the remaining polynomials.
  3. At this point the running instance proves that the inner product of the committed polynomial $f$ with the weight function $w_0$ is a target value $\sigma_0$, and that its inner product with $w_1$ is $\sigma_1$.
  4. Let $f_1,f_2,\cdots,f_k$ be the remaining polynomials whose codeword size is the same as that of the running $f$.
  5. We then proceed with two WHIR instances running in parallel (formalized below):
    • The first instance is the random linear combination of $f,f_1,\cdots,f_k$, with weight function $w_0$; its target value is the corresponding linear combination of $\sigma_0$ and the claimed evaluations of $f_1,\cdots,f_k$.
    • The second instance is $f$ with weight function $w_1$ and target value $\sigma_1$.
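In formulas (my reading of step 5, with $\gamma_1,\cdots,\gamma_k$ fresh random scalars sampled by the verifier), the two parallel instances are

$$\Big\langle\, f + \sum_{j=1}^{k}\gamma_j f_j,\; w_0 \Big\rangle = \sigma_0 + \sum_{j=1}^{k}\gamma_j y_j \qquad\text{and}\qquad \langle f,\; w_1\rangle = \sigma_1,$$

where $y_j$ denotes the claimed evaluation of $f_j$. For the first combination to be sound, the claim about each $f_j$ must be an inner product with the same weight $w_0$; as the next comment points out, this is exactly where the approach runs into trouble.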

@yczhangsjtu

> The challenge in approach (1): in WHIR, each iteration shrinks the codeword size by a factor of 2, but the polynomial size by a factor of $2^k$ (i.e., it removes $k$ variables). So do we merge two codewords with the same size but with different polynomial sizes, or with the same polynomial sizes but different codeword sizes?
>
> The workable solution is to keep the codeword size the same, because when we add the codewords, the underlying polynomials are also added correctly. The problem is that whenever a new polynomial is merged in, the code rate is pushed back to the original rate, and the number of queries goes back to the original as well. So the proof size would be larger than the originally estimated proof size of WHIR.
>
> Anyway, this challenge is solvable; the only issue is efficiency.

There is a more severe challenge than this, rendering this approach unusable.

In short, after one round (or several rounds) of reduction, the weight function of the running polynomial is no longer the same as that of the remaining polynomials yet to be batched. The situation therefore becomes: we are trying to batch-prove that $f_1$ paired with $w_1$ produces $\sigma_1$ and that $f_2$ paired with $w_2$ produces $\sigma_2$; we can't batch $f_1$ and $f_2$ directly because $w_1$ and $w_2$ are different.

The above "Summary of approach (1)" works but is only slightly better than the naive approach: run a separate batched WHIR for each different polynomial size.
