Zero-Knowledge Proofs

Introduction to SNARK

Interactive proofs are complete and (statistically) sound, while interactive arguments are only sound against computationally bounded provers (computational soundness).

Definition (SNARK).
A succinct (preprocessing) non-interactive argument of knowledge (SNARK) is a triple of algorithms $(S, P, V)$:
  • Setup $S(C; r) = (pp, vp)$: generates public parameters $pp$ and $vp$ for the prover and verifier respectively, given a circuit $C$.
  • Prove $P(pp, x, w) = \pi$: produces a short proof $\pi$, where $\mathsf{len}(\pi) = \mathsf{sublinear}(|w|)$.
  • Verify $V(vp, x, \pi)$: fast to verify, with verification time $\mathsf{time}(V) = O_{\lambda}(|x|, \mathsf{sublinear}(|C|))$,

that satisfy:

  • Completeness: for all $x, w$ such that $C(x, w) = 0$, $\Pr\left[\pi \leftarrow P(pp, x, w): V(vp, x, \pi) = 1\right] = 1$.
  • Knowledge Soundness: if $V$ accepts, then $P$ “knows” a witness $w$ such that $C(x, w) = 0$.

If the algorithms satisfy:

  • Zero-Knowledge: the tuple $(C, pp, vp, x, \pi)$ “reveals nothing new” about the witness $w$.

then the scheme is called a zk-SNARK.

Strong SNARK: all sublinear terms are $\log(|C|)$.

So in a SNARK the prover cannot simply send $w$: $w$ might be long ($|w| \leq |C|$), and verifying $C(x, w) = 0$ directly may be expensive.

Observe that $S$ is randomized in the setup phase. If $r$ is revealed to $P$, the prover may be able to prove false statements: although $vp$ is visible to $P$, the setup can embed trapdoor structure that stays hidden from $P$ as long as the randomness $r$ is secret.

The pre-processing has different types:

  • transparent setup: $S(C)$ uses no random bits.
  • trusted but universal setup: $S = (S_{\text{init}}, S_{\text{index}})$, where $S_{\text{init}}(r) = gp$ generates global parameters with secret $r$, and $S_{\text{index}}(gp, C) = (pp, vp)$ is deterministic.
  • trusted setup per circuit: $S(C; r)$ uses fresh random bits for every circuit $C$.

Argument of Knowledge

Here we use trusted but universal setup.

Definition (Adaptive Knowledge Soundness). A preprocessing NARK $(S, P, V)$ is (adaptively) knowledge sound for a circuit $C$ if for every polynomial-time adversary $A = (A_0, A_1)$ such that:
gpSinit(1λ;r),(C,x,st)A0(gp),(pp,vp)Sindex(C),πA1(pp,x,st)\begin{aligned} & gp \leftarrow S_{\text{init}}(1^\lambda;r), \\ & (C, x, \text{st}) \leftarrow A_0(gp), \\ & (pp, vp) \leftarrow S_{\text{index}}(C), \\ & \pi \leftarrow A_1(pp, x, \text{st}) \end{aligned}

and

\Pr[V(vp, x, \pi) = \text{accept}] > 1/10^6 \quad (\text{non-negligible w.r.t. } \lambda)

there exists an efficient extractor $E$ (that uses $A$) such that:

gpSinit(1λ;r),(C,x,st)A0(gp),wEA(gp,C,x)\begin{aligned} & gp \leftarrow S_{\text{init}}(1^\lambda;r), \\ & (C, x, \text{st}) \leftarrow A_0(gp), \\ & w \leftarrow E^A(gp, C, x) \end{aligned}

and

\Pr[C(x, w) = 0] > 1/10^6 - \epsilon \quad (\text{for a negligible } \epsilon \text{ w.r.t. } \lambda).

The adversary chooses $(C, x)$ after seeing $gp$ (adaptive), then produces a proof $\pi$. If $A$ produces an accepting proof with non-negligible probability, then $E$ can extract a valid witness $w$ with essentially the same probability (up to a negligible $\epsilon$).

After this introduction, we proceed to constructing a SNARK from two building blocks: a functional commitment scheme (a cryptographic object) and an interactive oracle proof (an information-theoretic object).

Building Block 1: functional commitment scheme

Definition (Commitment Scheme). A commitment scheme consists of
  • $\mathsf{setup}(1^\lambda) \to gp$: outputs public parameters $gp$
  • $\mathsf{commit}(gp, f, r) \to \mathsf{com}$: commitment to $f \in \mathcal{F}$ with randomly chosen $r \in \mathcal{R}$
  • $\mathsf{open}(\mathsf{com}, f, r)$: outputs $1$ if $\mathsf{com} = \mathsf{commit}(gp, f, r)$, and $0$ otherwise

satisfying hiding and binding properties:

  • Binding: for all PPT adversaries $\mathcal{A}$
Pr[gpsetup(1λ)(comf,f0,r0,f1,r1)A(gp):(f0,r0)(f1,r1)  comf=Commit(gp,f0,r0)  comf=Commit(gp,f1,r1)]negl(λ)\Pr\left[ \begin{array}{l} gp \leftarrow \mathsf{setup}(1^\lambda) \\ (\mathsf{com}_f, f_0, r_0, f_1, r_1) \leftarrow \mathcal{A}(gp) \end{array} : \begin{array}{l} (f_0, r_0) \neq (f_1, r_1) \\ \wedge \; \mathsf{com}_f = \mathsf{Commit}(gp, f_0, r_0) \\ \wedge \; \mathsf{com}_f = \mathsf{Commit}(gp, f_1, r_1) \end{array} \right] \le \mathsf{negl}(\lambda)
  • Hiding: for all PPT adversaries $\mathcal{A} = (\mathcal{A}_1, \mathcal{A}_2)$:
Pr[gpSetup(1λ)(f0,f1,st)A1(gp)b{0,1},  rRcomfCommit(gp,fb,r)bA2(st,comf):b=b]12negl(λ)\left| \Pr\left[ \begin{array}{l} gp \leftarrow \mathsf{Setup}(1^\lambda) \\ (f_0, f_1, st) \leftarrow \mathcal{A}_1(gp) \\ b \leftarrow \{0,1\}, \; r \leftarrow \mathcal{R} \\ com_f \leftarrow \mathsf{Commit}(gp, f_b, r) \\ b' \leftarrow \mathcal{A}_2(st, com_f) \end{array} : b' = b \right] - \frac{1}{2} \right| \le \mathsf{negl}(\lambda)

Hiding can equivalently be defined via computational indistinguishability.

Note: the $r$ in the open algorithm is the same as that in the commit algorithm.

After committing, the sender should also send $(x, r)$ for the receiver to verify. For example, suppose I want to play rock-paper-scissors with a friend remotely: first we determine our choices; then we exchange commitments; finally we exchange the choices together with the random bits used, and each of us verifies the other's commitment.
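The commit-and-reveal pattern can be sketched with a simple hash-based commitment (a stand-in for the formal scheme above, assuming the hash function behaves like a random oracle; the names `commit` and `open_commit` are illustrative):

```python
import hashlib
import secrets

def commit(msg: bytes) -> tuple[bytes, bytes]:
    """Commit to msg; returns (commitment, opening randomness)."""
    r = secrets.token_bytes(32)             # hiding randomness
    com = hashlib.sha256(r + msg).digest()  # binding under collision resistance
    return com, r

def open_commit(com: bytes, msg: bytes, r: bytes) -> bool:
    """Verify that (msg, r) opens com."""
    return hashlib.sha256(r + msg).digest() == com

# Rock-paper-scissors: both players exchange commitments first,
# then reveal (choice, r) and check each other's commitment.
com_a, r_a = commit(b"rock")
assert open_commit(com_a, b"rock", r_a)       # honest opening accepted
assert not open_commit(com_a, b"paper", r_a)  # equivocation rejected
```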

To hide the message during the interaction (and also to make the proof short instead of sending $(x, r)$), we define the functional commitment scheme.

Definition (Functional Commitment Scheme). A functional commitment scheme for $\mathcal{F} = \{f: X \to Y\}$:
  • $\mathsf{setup}(1^\lambda) \to gp$: outputs public parameters $gp$
  • $\mathsf{commit}(gp, f, r) \to \mathsf{com}_f$: commitment to $f \in \mathcal{F}$ with $r \in \mathcal{R}$; this is a binding (and, for zk-SNARKs, hiding) commitment scheme for $\mathcal{F}$
  • $\mathsf{eval}(P, V)$: for a given $\mathsf{com}_f$ and all $x \in X$:
\mathsf{open}(gp, f, x, r) \to \text{short proof } \pi\text{ and value }y\in Y
\mathsf{verify}(gp, \mathsf{com}_f, x, y, \pi) \to \text{accept/reject}

satisfying

  • Completeness: if $f \in \mathcal{F}$ and $x \in X$ with $f(x) = y$, then $\Pr[V(gp, \mathsf{com}_f, x, y, \pi) = \mathsf{accept}] = 1$.
  • Knowledge Soundness: $(\mathsf{setup} \;\|\; \mathsf{commit}, \mathsf{open}, \mathsf{verify})$ is knowledge sound.

The interaction within $\mathsf{eval}(P, V)$ is actually a SNARK for the relation (circuit)

\{(gp,\mathsf{com}_f,x,y): \exists (f,r),\ f(x)=y,\ f\in\mathcal{F},\ \mathsf{com}_f=\mathsf{commit}(gp, f, r)\}

whose witness is essentially $(f, r)$. The proof $\pi$ proves that $f \in \mathcal{F}$, $f(x) = y$, and $\mathsf{com}_f$ is a commitment to $f$.

The definition above essentially describes an interactive process: the verifier “queries” the oracle $f$ at $x$, paying an additional verification cost.

\begin{align*} &\text{Prover}&&&\text{Verifier}\\ &r\xleftarrow{\$}R,\mathsf{com}_f\leftarrow\mathsf{commit}(gp, f, r)&\xrightarrow{\mathsf{com}_f}&&\\ &&\xleftarrow{x}&\;&x\xleftarrow{\$} X\\ &y\leftarrow f(x),\pi\leftarrow\mathsf{open}(gp, f, x, y, r)&\xrightarrow{y,\pi}&\;&\mathsf{verify}(gp, \mathsf{com}_f, x, y, \pi) \to \text{accept/reject} \end{align*}

With the scheme above we commit to a function. The class $\mathcal{F}$ can be instantiated in many ways:

  • Polynomial commitments: $\mathcal{F} = \mathbb{F}^{\leq d}[X]$
  • Multilinear commitments: $\mathcal{F} = \mathbb{F}^{\leq 1}[X_1, \ldots, X_n]$, where the degree bound applies to each variable separately ($\leq 1$ in every variable means multilinear)
  • Vector commitments: $\mathcal{F} = \{f_u: [n] \to \mathbb{F},\ f_u(i) = u_i,\ u\in\mathbb{F}^n\}$
  • Inner product commitments: $\mathcal{F} = \{f_{u}: \mathbb{F}^n \to \mathbb{F},\ f_{u}(v) = \langle u,v\rangle,\ u\in\mathbb{F}^n\}$

Building Block 2: $\mathcal{F}$-Interactive Oracle Proof

An IOP is a proof system in which the prover proves it knows $w$ such that $C(x, w) = 0$. The verifier accesses oracles of functions $f \in \mathcal{F}$; to construct a SNARK, these oracles are later replaced by functional commitments, but for now we work with oracles.

Definition ($\mathcal{F}$-IOP): An $\mathcal{F}$-IOP is a proof system that proves $\exists w,\; C(x, w) = 0$ and consists of
  • $\mathsf{setup}(1^\lambda) \to pp, vp = (\mathsf{Oracle}_{f_{-s}}, \dots, \mathsf{Oracle}_{f_0})$
  • At round $i \in [t]$, the prover $P$ sends $\mathsf{Oracle}_{f_i}$, where $f_i$ is based on the interaction history from the view of $P$; the verifier replies with $r_i \xleftarrow{\$} R$ unless $i = t$.
  • $\mathsf{verify}^{\mathsf{Oracle}_{f_{-s}}, \dots, \mathsf{Oracle}_{f_t}}(vp, x, r_1, \dots, r_{t-1})$ uses $(f_i)_{i=-s}^t$ as oracles and outputs the result.

which satisfies

  • Completeness: if $\exists w,\; C(x, w) = 0$, then the verifier always accepts.
  • Knowledge Soundness: if we use $\mathsf{com}_f$'s in place of $\mathsf{Oracle}_f$'s, the extractor can use $(f_i)_{i=-s}^t$ to compute $w$, because the functional commitment scheme is knowledge sound (the functional commitment scheme is itself a SNARK).
  • (Optional) Zero-Knowledge

Polynomial IOP for Circuit SAT

In this lecture we are going to construct a poly-IOP for circuit satisfiability. Let $C$ be an arithmetic circuit of size $S$. The prover wants to prove that it knows a solution $w$ to $C(w) = y$.

First, we label every gate of $C$ with a bit string of length $\log S$. The computation $C(w) = y$ can then be expressed as a function $T: \{0,1\}^{\log S} \to \mathbb{F}$ that maps the label of each gate to the output value of that gate (notice that $T(\text{root}) = y$). Let $h: \mathbb{F}^{\log S} \to \mathbb{F}$ be the unique multilinear extension of $T$, satisfying $h(x) = T(x)$ for all $x \in \{0,1\}^{\log S}$ (existence and uniqueness follow from interpolation over the Boolean hypercube).
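Concretely, the multilinear extension can be evaluated directly from its defining formula $h(x)=\sum_{b\in\{0,1\}^m} T(b)\prod_i\big(x_ib_i+(1-x_i)(1-b_i)\big)$. A minimal sketch over a toy field (the prime $97$ and the table $T$ are made up for the example):

```python
from itertools import product

P = 97  # a small prime field F_p, for illustration only

def multilinear_extension(T, m, x):
    """Evaluate the unique multilinear extension h of T: {0,1}^m -> F_p at x in F_p^m,
    using h(x) = sum_b T(b) * prod_i (x_i*b_i + (1-x_i)*(1-b_i))."""
    total = 0
    for b in product((0, 1), repeat=m):
        eq = 1  # the multilinear "equality" polynomial eq(x, b)
        for xi, bi in zip(x, b):
            eq = eq * (xi * bi + (1 - xi) * (1 - bi)) % P
        total = (total + T[b] * eq) % P
    return total

# T maps 2-bit gate labels to values; h agrees with T on {0,1}^2.
T = {(0, 0): 3, (0, 1): 5, (1, 0): 7, (1, 1): 11}
assert all(multilinear_extension(T, 2, b) == T[b] for b in T)
```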

Why do we extend the function $T$ to a multilinear polynomial $h$? The necessity will become clear in the sum-check protocol.

Here $V$ needs assurance that $h$ and $T$ agree on $\{0,1\}^{\log S}$. However, $V$ only has a commitment to $h$, so $V$ verifies this indirectly, as shown below.

We use labels to denote gates, and define the polynomial $g_h: \mathbb{F}^{3\log S} \to \mathbb{F}$ as follows:

g_h(a,b,c)=\begin{cases} h(a)+h(b)-h(c), & \text{if }c\text{ is an add gate with inputs }a,b, \\ h(a)h(b)-h(c), & \text{if }c\text{ is a mult gate with inputs }a,b,\\ 0,& \text{otherwise}. \end{cases}

which satisfies:

T\text{ is a correct assignment }\iff \forall (a,b,c)\in\{0,1\}^{3\log S},\;g_h(a,b,c)=0\;(\text{in }\mathbb{F}).

We modify $g_h$ slightly: we embed the result $g_h(a,b,c)$ into $\mathbb{Z}$. The condition above is then equivalent to $\sum_{x\in\{0,1\}^{3\log S}} \tilde{g}_h(x) = 0$, where $\tilde{g}_h: \mathbb{F}^{3\log S} \to \mathbb{Z}$ takes the same values as $g_h$ on all inputs. Then $P$ and $V$ interact to convince $V$ that the sum is $0$. If the sum is $0$, $V$ believes that $P$ knows the correct $T$.

Here comes the usage of the sum-check protocol.

Sum-Check Protocol

The goal of the protocol is to check that the claimed value $C$ provided by the prover satisfies:

C=\sum_{x\in\{0,1\}^n}g(x)

where $g$ is an $n$-variate polynomial over the field $\mathbb{F}$ and the sum is computed in $\mathbb{Z}$ (i.e., added up without taking the modulus).

In the setup phase, $P$ sends the commitment of $g$ to $V$. Then they interact for $n$ rounds, and at the end $V$ checks the final claim by querying the oracle of $g$ once, at a random point.

\begin{align*} &\text{Prover}&&&\text{Verifier}\\ \text{start}:&\text{ ``I know }s_0=\sum_{x\in\{0,1\}^n}g(x)\text{'' }&\xrightarrow{s_0}&&\\ \text{round 1}:&\text{``Please verify the last step to compute }s_0\text{''}&\xrightarrow{s_1(X_1)}&&\text{ verify }s_1(0)+s_1(1)=s_0\text{ and the next goal is to verify }s_1(X_1)=\sum_{x\in\{0,1\}^{n-1}}g(X_1,x)\\ &&\xleftarrow{r_1}&&\text{``If indeed, then for most }r\in\mathbb{F},s_1(r)=\sum_{x\in\{0,1\}^{n-1}}g(r,x)\text{. So I send a random $r_1$.''}\\ \text{round 2}:&\text{``I will prove }s_1(r_1)=\sum_{x\in\{0,1\}^{n-1}}g(r_1,x)\text{, please verify...'' }&\xrightarrow{s_2(X_2)}&\;&...\\ &&...&&\\ \text{round }n:&\text{ ``I will prove }s_{n-1}(r_{n-1})=\sum_{x\in\{0,1\}}g(r_1,\dots,r_{n-1},x)\text{, please verify...'' }&\xrightarrow{s_{n}(X_{n})}&\;&\text{ verify }s_n(0)+s_n(1)=s_{n-1}(r_{n-1})\text{ and the next goal is to verify }s_{n}(X_n)=g(r_1,\dots,r_{n-1},X_{n})\\ &&&\;&r_n\xleftarrow{\$}\mathbb{F},\text{ obtaining }s_n(r_n).\text{ Then query the oracle of }g\text{ at }(r_i)_{i=1}^n.\text{ Accept iff }g(r_1,\dots,r_n)=s_n(r_n). \end{align*}

The probability that a cheating prover can make the verifier accept a false claim is at most $\frac{n\deg(g)}{|\mathbb{F}|}$ (by the Schwartz-Zippel lemma; here $\deg$ is the total degree, not the degree in any single variable). So we can make the soundness error negligible by choosing a large enough field.

Note that every polynomial from $P$ is sent as its coefficients. Let $d = \deg(g)$, and suppose it takes $T_g$ time to evaluate $g$ at a point; then

T_V=O(nd+T_g),\quad T_P=O(2^n d T_g)

and the proof length is $O(nd)$.

For a dense polynomial $g$, the time to evaluate its multilinear extension $\tilde g$ at a point $x$ is at most $O(2^n)$. Thus the prover above runs in quadratic time. [Libra] puts forward a linear-time sum-check prover.
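The rounds above can be sketched end to end for a small hard-coded polynomial (the field, the example $g$, the per-round degree bound, and the naive $O(2^n)$ prover are all illustrative choices, not part of the lecture's protocol spec):

```python
import random
from itertools import product

P = 2**31 - 1  # prime field, for illustration

def g(x):
    """Example 3-variate polynomial: g(x) = x1*x2 + 2*x2*x3 + x1 (mod P)."""
    x1, x2, x3 = x
    return (x1 * x2 + 2 * x2 * x3 + x1) % P

N = 3  # number of variables
D = 2  # per-round degree bound on s_i (g has degree <= 1 in each variable, so 2 is safe)

def partial_sum(prefix):
    """Sum of g(prefix, b) over all boolean suffixes b."""
    k = N - len(prefix)
    return sum(g(tuple(prefix) + b) for b in product((0, 1), repeat=k)) % P

def lagrange_eval(evals, r):
    """Evaluate the polynomial with evals[t] = s(t), t = 0..D, at point r (mod P)."""
    total = 0
    for t, y in enumerate(evals):
        num, den = 1, 1
        for u in range(len(evals)):
            if u != t:
                num = num * (r - u) % P
                den = den * (t - u) % P
        total = (total + y * num * pow(den, -1, P)) % P
    return total

def sumcheck():
    claim = partial_sum(())  # s_0: the claimed sum
    prefix = []
    for _ in range(N):
        # Prover: send s_i as evaluations at 0..D (enough to interpolate it).
        s_i = [partial_sum(prefix + [t]) for t in range(D + 1)]
        # Verifier: check s_i(0) + s_i(1) == previous claim.
        assert (s_i[0] + s_i[1]) % P == claim
        r = random.randrange(P)        # verifier's random challenge
        claim = lagrange_eval(s_i, r)  # new claim: s_i(r)
        prefix.append(r)
    # Final check: one oracle query to g at the random point.
    assert g(tuple(prefix)) == claim
    return True

assert sumcheck()
```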

This sum-check protocol is also used in the proof that $\mathbf{IP}=\mathbf{PSPACE}$. Define the decisional counting problem of 3CNF as:

(\#\mathsf{SAT})_D:=\left\{\langle\varphi,k\rangle:k=\sum_{b_1,\dots,b_n\in\{0,1\}}\varphi(b_1,\dots,b_n)\right\}

so by translating the boolean formula into a multilinear polynomial, the sum-check protocol gives $(\#\mathsf{SAT})_D\in\mathbf{IP}$.

Now suppose we have a $\mathsf{TQBF}$ formula $\psi=\forall x_1\exists x_2\dots Qx_n.\,\varphi(x_1,x_2,\dots,x_n)$. If we directly turn $\forall$ into multiplication and $\exists$ into addition, the degree skyrockets. Define, for $p\in\mathbb{F}[X_1,\dots,X_n]$,

\begin{align*} &\mathsf{L}_ip:=(1-X_i)p^{i\leftarrow0}+X_ip^{i\leftarrow1}\in\mathbb{F}[X_1,\dots,X_n],\\ &\mathsf{A}_ip:=p^{i\leftarrow0}p^{i\leftarrow1}\in\mathbb{F}[X_1,\dots,X_{i-1},X_{i+1},\dots,X_n],\\ &\mathsf{E}_ip:=1-(1-p^{i\leftarrow0})(1-p^{i\leftarrow1})\in\mathbb{F}[X_1,\dots,X_{i-1},X_{i+1},\dots,X_n], \end{align*}

where $p^{i\leftarrow b}$ denotes $p$ with $X_i$ substituted by $b$; all of these can be viewed as polynomials in $\mathbb{F}[X_1,\dots,X_n]$.

In analogy to the sum-check protocol, we can also define $\mathsf{X}_ip:=p^{i\leftarrow0}+p^{i\leftarrow1}\in\mathbb{F}[X_1,\dots,X_{i-1},X_{i+1},\dots,X_n]$.

The idea is to re-linearize the polynomial after each quantifier is applied. The formula is then described by the equation

A1L1E2L1L2A3L1L2L3QnL1L2Lnpφ=1\mathsf{A}_1\mathsf{L}_1\mathsf{E}_2\mathsf{L}_1\mathsf{L}_2\mathsf{A}_3\mathsf{L}_1\mathsf{L}_2\mathsf{L}_3\dots\mathsf{Q}_n\mathsf{L}_1\mathsf{L}_2\dots\mathsf{L}_np_{\varphi}=1

then the prover and verifier go through an $O(n^2)$-round interaction to run the protocol. Thus $\mathsf{TQBF}\in\mathbf{IP}$ and $\mathbf{IP}=\mathbf{PSPACE}$.

PLONK IOP

Based on lecture 4, we can encode a circuit problem into a polynomial. The sum-check protocol uses multivariate polynomials with a large proof size (linear in $n \sim \log |C|$, even if we send commitments to polynomials instead of coefficients). PLONK also targets circuit SAT but with a different encoding: it uses univariate polynomials and has a proof size independent of $n$.

Small gadgets to build PLONK IOP

Let $\Omega=\langle\omega\rangle\subset\mathbb{F}_p^*$ be a multiplicative subgroup of size $k\mid\varphi(p)$. We want protocols proving that a committed polynomial $f$ has certain properties on $\Omega$. For given $f,g\in\mathbb{F}_p[X]$, the properties include:

  • Zeroness: $f(x)=0$ for all $x\in\Omega$
  • Sum/Product over $\Omega$: $\sum_{x\in\Omega}\left(f(x)-g(x)\right)=0$ or $\prod_{x\in\Omega}\frac{f(x)}{g(x)}=1$
  • Permutation: $(f(\omega^i))_{i=0}^{k-1}$ is a permutation of $(g(\omega^i))_{i=0}^{k-1}$. Warning: it is not enough to check $\prod_{x\in\Omega}\frac{f(x)}{g(x)}=1$! The soundness relies on a random challenge from the verifier.
  • Prescribed permutation: $f(y)=g(W(y))$ for some known permutation $W:\Omega\to\Omega$

Encoding a circuit into a polynomial

We again take $C(x, w)$ as the circuit satisfiability problem. Let $d=3|C|+|x|+|w|$, and label each gate with an integer. We take a $d$-th root of unity $\omega$, satisfying $\omega^d=1$.

The setup phase outputs a selector polynomial $S$ and a permutation $W$. The prover wants to prove that it knows $w$ such that $C(x, w)=0$, so it interpolates a polynomial $T\in\mathbb{F}_p^{\leq d}[X]$ such that

  • $T(\omega^{-j})$ = the value of the $j$-th input
  • $T(\omega^{3i}),T(\omega^{3i+1}),T(\omega^{3i+2})$ are the left-input/right-input/output of the $i$-th gate, for $i=0,1,\dots,|C|-1$

using FFT in time $O(d\log d)$.

Then the prover proves the following:

  • The inputs are correct: $T(\omega^{-j})=x_j$ for $j=0,1,\dots,|x|-1$
  • The gate operations are correct: use a public selector $S$ with $S(\omega^{3i})=1$ iff the $i$-th gate is multiplicative; then prove, for all $y\in\{\omega^{3i}:i<|C|\}$,
S(y)\cdot T(y)\cdot T(y\omega)+(1-S(y))\cdot(T(y)+T(y\omega))-T(y\omega^{2})=0.
  • The wiring is correct: use a public $W$ to rotate the wires that share the same value; then prove $T(y)=T(W(y))$ for all $y\in\{\omega^{i}:i<d\}$
  • The output is $0$: $T(\omega^{3|C|-1})=0$

using the gadgets mentioned above.
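The gate equation can be checked numerically on a hypothetical two-gate trace (the field $\mathbb{F}_{17}$, the gate values, and the dictionary encoding are made up for the example; keys are exponents of $\omega$, and in the real protocol $T$ and $S$ are polynomials and the equation is proven with the zeroness gadget):

```python
# F_17 has an element of order 8: 2 (since 2^8 = 256 = 1 mod 17).
P, OMEGA, D = 17, 2, 8
assert pow(OMEGA, D, P) == 1 and pow(OMEGA, D // 2, P) != 1  # order exactly 8

# Trace T on powers of omega: gate i occupies slots 3i, 3i+1, 3i+2.
# Gate 0: add gate, 3 + 4 = 7.  Gate 1: mult gate, 7 * 2 = 14.
T = {0: 3, 1: 4, 2: 7,   # left, right, output of gate 0
     3: 7, 4: 2, 5: 14}  # left, right, output of gate 1
S = {0: 0, 3: 1}         # selector: S = 1 on mult gates, 0 on add gates

for i in (0, 1):
    y = 3 * i
    # S(y)*T(y)*T(y*omega) + (1-S(y))*(T(y)+T(y*omega)) - T(y*omega^2) = 0
    lhs = (S[y] * T[y] * T[y + 1]
           + (1 - S[y]) * (T[y] + T[y + 1])
           - T[y + 2]) % P
    assert lhs == 0  # the gate equation holds at every y = omega^{3i}
```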

Polynomial commitment based on discrete-log and pairing

KZG: pairing

Let $G$ be a group of prime order $p$ with generator $g$, equipped with a bilinear map $e:G\times G\to G_T$, where $G\cong G_T\cong C_p$ are cyclic groups of order $p$. We use $\mathcal{F}=\mathbb{F}_p^{\leq d}[X]$, and the public parameters $gp$ are generated as

(g,gτ,gτ2,,gτd)setup(λ,F;τ)(g,g^\tau,g^{\tau^2},\dots,g^{\tau^d})\leftarrow\mathsf{setup}(\lambda,\mathcal{F};\tau)

where $\tau$ is a random element of $\mathbb{F}_p$; after computing $gp$, the random $\tau$ is discarded. To commit to a polynomial $f(X)=\sum_{i=0}^d f_iX^i$, we compute

i=0d(gτi)ficommit(gp,f).\prod_{i=0}^d (g^{\tau^i})^{f_i}\leftarrow\mathsf{commit}(gp, f).

In the evaluation phase, on input $u$, the prover sends $v$ and a proof $\pi$ with which the verifier checks $f(u)=v$. The idea: if $f(u)=v$, then $f(X)-v$ is divisible by $X-u$. Let $q(X)=(f(X)-v)/(X-u)$; the proof is $\pi=g^{q(\tau)}$, and $(v,\pi)\leftarrow\mathsf{open}(gp,f,u)$. The verifier checks whether $e\left(g^{\tau-u},\pi\right)=e\left(\frac{\mathsf{com}_f}{g^{v}}, g\right)$. The verifier can form $g^{\tau-u}$ because it knows $g^\tau$ from the global parameters and can compute $g^u$ from $u$.
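The algebraic core of open/verify is the identity $f(X)-v=q(X)(X-u)$, which the pairing check tests at the hidden point $X=\tau$. The quotient $q$ can be computed by synthetic division, as in this sketch (the field and example polynomial are illustrative; real KZG then carries out everything "in the exponent"):

```python
P = 2**31 - 1  # prime field, for illustration

def poly_eval(f, x):
    """Evaluate f (coefficient list, lowest degree first) at x mod P (Horner)."""
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % P
    return acc

def kzg_quotient(f, u):
    """Return v = f(u) and q with f(X) - v = q(X)*(X - u), by synthetic division."""
    q = [0] * (len(f) - 1)
    acc = 0
    for i in range(len(f) - 1, 0, -1):
        acc = (acc * u + f[i]) % P
        q[i - 1] = acc
    v = (acc * u + f[0]) % P  # remainder = f(u)
    return v, q

f = [5, 0, 3, 1]  # f(X) = X^3 + 3X^2 + 5
u = 7
v, q = kzg_quotient(f, u)
assert v == poly_eval(f, u)
# The pairing check succeeds exactly when this polynomial identity holds at X = tau;
# here we spot-check it at a few points instead.
for x in (2, 11, 123):
    assert (poly_eval(f, x) - v) % P == poly_eval(q, x) * (x - u) % P
```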

Time: commit takes $O(d)$ group exponentiations; open (including computing $q$) takes $O(d)$ group exponentiations and $O(d)$ group multiplications; verify takes $O(1)$ pairings. Size: proof and commitment are both $O(1)$ group elements.

The completeness is trivial. For soundness: if the prover outputs another $(v',\pi')$ such that $f(u)\not=v'$ and $e\left(g^{\tau-u},\pi'\right)=e\left(\frac{\mathsf{com}_f}{g^{v'}}, g\right)$, then we have (denoting $\delta=f(u)-v'$):

e\left(\frac{\mathsf{com}_f}{g^{v'}}, g\right)=e\left(g^{f(\tau)-f(u)+\delta}, g\right)=e\left(g^{q(\tau)+\frac{\delta}{\tau-u}}, g^{\tau-u}\right)=e\left(g^{\tau-u},\pi'\right)

so as a result

e(g,g)δτu=e(g,π)e(g,g)q(τ)e\left(g, g\right)^{\frac{\delta}{\tau-u}}=\frac{e\left(g,\pi'\right)}{e\left(g, g\right)^{q(\tau)}}

where the operations in the exponent are in $\mathbb{F}_p$ and correspond to multiplications in the groups $G$ and $G_T$. This contradicts the $q$-Strong Bilinear Diffie-Hellman assumption, which says that it is hard to compute $e(g,g)^{\frac{1}{\tau-u}}$ given $(p,G,g,G_T,e)$ and $gp=(g,g^\tau,\dots,g^{\tau^d})$.

Knowledge Soundness

Plain KZG is not knowledge sound without assumptions such as the Knowledge of Exponent (KoE) assumption or the Generic Group Model (GGM). We will discuss these assumptions in the PCP-based SNARK.

Zero-Knowledge-ness

Plain KZG is not zero-knowledge, since the commit algorithm is deterministic. For example, one can use $g^{f(\tau)}$ to test whether $f$ is the zero polynomial. We present the zero-knowledge variant of the protocol in the diagram below.

\begin{align*} &\text{Setup: sample }\tau,\eta\xleftarrow{\$}\mathbb{F}_p\text{ and compute }gp\leftarrow(g,g^{\tau},\dots,g^{\tau^d},g^\eta);\text{ then delete }\tau,\eta.\\ &\text{Prover}&&&\text{Verifier}\\ &r\xleftarrow{\$}\mathbb{F}_p,\mathsf{com}_f=g^{f(\tau)+r\eta}\leftarrow\mathsf{commit}(gp, f; r)&\xrightarrow{\mathsf{com}_f}&&\\ &r'\xleftarrow{\$}\mathbb{F}_p,v\leftarrow f(u),\pi=\left(g^{q(\tau)+r'\eta},g^{r-r'(\tau-u)}\right)\leftarrow\mathsf{open}(gp, f, u, r;r')&\xleftarrow{u\in\mathbb{F}_p}&\;&\\ &&\xrightarrow{v,\pi}&\;&e\left(\pi_1,g^{\tau-u}\right)e\left(\pi_2,g^{\eta}\right) \stackrel{?}{=} e\left(\frac{\mathsf{com}_f}{g^{v}}, g\right) \end{align*}

The polynomial is hidden by the randomizers $r$ and $r'$.

Variants of KZG

  • For multivariate polynomials: $f(x_1,\dots,x_k)-f(u_1,\dots,u_k)=\sum_{i\in[k]}(x_i-u_i)q_i(x_1,\dots,x_k)$. The prover computes and sends $\pi_i=g^{q_i(\tau_1,\dots,\tau_k)}$. Prover time is $O(km)$ group exponentiations, where $m\leq 2^k$ is the number of terms of $f$. [Chopin] claims to reduce the prover time to $m+O(\sqrt{m})$ MSMs (computations of $\prod_j g_j^{a_j}$).
  • For a batch query on $u_1,\dots,u_m$: $f(x)-h(x)=q(x)\prod_{i\in[m]}(x-u_i)$, where $h(x)$ is the interpolation of the points $(u_i,f(u_i))$.
  • For multi-party setup, each party holds a private number, and the public parameters are (for example, with 2 parties) $gp=\left(g,g^{st},\dots,g^{(st)^d}\right)$. The parameters can be generated party after party, since $g^{(st)^i}=\left(g^{s^i}\right)^{t^i}$.

It is worth noting that PLONK uses univariate KZG with the PLONK IOP, while vSQL uses multivariate KZG with the sum-check protocol.

Bulletproofs: discrete-log

KZG requires a trusted setup. Here we present Bulletproofs, which has a transparent setup. The goal of the prover is to convince the verifier that $f(u)=v$, where $f=\sum_{i=0}^{d}f_iX^i\in\mathbb{F}_p[X]$. The setup phase just samples a random $gp=(gp_{0,0},\dots,gp_{0,d})\in G^{d+1}$, and we assume $d+1=2^k$ is a power of $2$. Then the prover and verifier go through the interaction shown below:

\begin{align*} &\text{Prover} & & &\text{Verifier}\\ &\mathsf{com}_f=\prod_{i=0}^{2^k-1}gp_{0,i}^{f_i}\leftarrow\mathsf{commit}(gp, f) &\xrightarrow{\mathsf{com}_f,v} & &\\ \text{round }1:\; &f=f_0\rightarrow f_L+u^{2^{k-1}}f_R,L=\prod_{i=0}^{2^{k-1}-1}gp_{0,i+2^{k-1}}^{f_{0,i}},R=\prod_{i=2^{k-1}}^{2^k-1}gp_{0,i-2^{k-1}}^{f_{0,i}},v_L=f_L(u),v_R=f_R(u) &\xrightarrow{L,R,v_L,v_R} & &\;\text{``I verify } f_0(u)=v \text{ '': }v\stackrel{?}{=}v_L+v_Ru^{2^{k-1}}\\ &f_1\leftarrow rf_L+f_R &\xleftarrow{gp_1,r} & &gp_1\leftarrow (gp_{0,i}^{r^{-1}}gp_{0,i+2^{k-1}})_{i\in[0,2^{k-1})},r\xleftarrow{\$}\mathbb{F}_p,v_1\leftarrow rv_L+v_R,\mathsf{com}_{f_1}\leftarrow L^rR^{r^{-1}}\mathsf{com}_f\\ \text{round }2:\; &f_1\rightarrow f_L+u^{2^{k-2}}f_R,L=\prod_{i=0}^{2^{k-2}-1}gp_{1,i+2^{k-2}}^{f_{1,i}},R=\prod_{i=2^{k-2}}^{2^{k-1}-1}gp_{1,i-2^{k-2}}^{f_{1,i}},v_L=f_L(u),v_R=f_R(u) &\xrightarrow{L,R,v_L,v_R} & &\text{``I verify } f_1(u)=v_1 \text{'': }v_1\stackrel{?}{=}v_L+v_Ru^{2^{k-2}}\\ &... &... & &...\\ \text{round }k:\; &f_{k-1}\rightarrow f_L+uf_R,L=gp_{k-1,1}^{f_{k-1,0}},R=gp_{k-1,0}^{f_{k-1,1}},v_L=f_L(u),v_R=f_R(u) &\xrightarrow{L,R,v_L,v_R} & &\;\text{``I verify } f_{k-1}(u)=v_{k-1} \text{ '': }v_{k-1}\stackrel{?}{=}v_L+v_Ru\\ &f_k\in\mathbb{F}_p\leftarrow rf_L+f_R &\xleftarrow{gp_k,r} & &gp_k\leftarrow (gp_{k-1,0}^{r^{-1}}gp_{k-1,1}),r\xleftarrow{\$}\mathbb{F}_p,v_k\leftarrow rv_L+v_R,\mathsf{com}_{f_k}\leftarrow L^rR^{r^{-1}}\mathsf{com}_{f_{k-1}}\\ \text{final }:\; & & & &\text{``I verify } f_{k}=v_{k} \text{ as a constant polynomial '': }\mathsf{com}_{f_k}\stackrel{?}{=}gp_{k}^{v_k}\\ \end{align*}

The $r$ at every round is sampled randomly and independently. Then we apply Fiat-Shamir to the protocol to make it non-interactive. The $gp_i$'s should be recomputed by the verifier to prevent backdoors in the parameters.
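The coefficient-folding arithmetic of the protocol (ignoring the group commitments and checking only the field-element identities the verifier relies on) can be sketched as follows; the polynomial and evaluation point are arbitrary:

```python
import random

P = 2**31 - 1  # prime field, for illustration

def poly_eval(f, x):
    """Evaluate f (coefficient list, lowest degree first) at x mod P."""
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % P
    return acc

def fold_argument(f, u):
    """Run the split-and-fold recursion: each round splits f into low/high halves
    f_L, f_R with f(u) = f_L(u) + u^m * f_R(u), then both sides move to
    f' = r*f_L + f_R and v' = r*v_L + v_R until f is a constant."""
    v = poly_eval(f, u)
    while len(f) > 1:
        m = len(f) // 2
        fL, fR = f[:m], f[m:]
        vL, vR = poly_eval(fL, u), poly_eval(fR, u)
        assert v == (vL + vR * pow(u, m, P)) % P   # the verifier's per-round check
        r = random.randrange(1, P)                 # verifier's challenge
        f = [(r * a + b) % P for a, b in zip(fL, fR)]
        v = (r * vL + vR) % P
    assert poly_eval(f, u) == v                    # final constant-polynomial check
    return True

assert fold_argument([random.randrange(P) for _ in range(8)], u=5)
```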

Time analysis

  • Setup: $O(d)$ sampling
  • Commit: $O(d)$ group exponentiations
  • Prover: $O(d)$ group exponentiations
  • Verifier: $O(d)$ group exponentiations

The proof size is $O(\log d)$ group elements, and the commitment size is $O(1)$ group elements (this counts only the first commitment $\mathsf{com}_f$; the other commitments are computed by the verifier).

Soundness analysis.

Statistical soundness: assume the prover does not know the correct $v$. The final $f_k$ is a multilinear polynomial in all the random $r$'s, say $f_k=g_f(r_1,\dots,r_{k})$, where the function $g_f$ is deterministic. Note that every $f$ corresponds to a unique $g_f$.

So, conditioned on the prover not knowing the correct $v$ but sending some $v^*\not=v$ instead, we can write $v^*=f^*(u)$ for some $f^*\not=f$. Thus by the Schwartz-Zippel lemma, we have

Prr1,,rkFp[fkvk=g(r1,,rk)g(r1,,rk)=0|vv]kp\mathbf{Pr}_{r_1,\dots,r_k\sim \mathbb{F}_p}\left[f^*_k-v_k=g^*(r_1,\dots,r_k)-g(r_1,\dots,r_k)=0\middle|v^*\not=v\right]\leq\frac{k}{p}

Since $v_k\in\mathbb{F}_p$, we have $f^*_k=v_k$ if and only if $\mathsf{com}_{f_k}=gp_{k}^{v_k}$, so the soundness error is at most $\frac{k}{p}=\frac{\log d}{p}$.

Polynomial commitment based on linear codes

The motivations to develop code-based commitment scheme are:

  • post-quantum secure
  • no group exponentiations (only hash, addition and multiplication)
  • small global parameters

but the new commitment scheme has a larger proof size; additionally, lacking algebraic structure, it is not homomorphic and is harder to aggregate.

The goal of the prover is to prove $f(u)=v$, where $f\in\mathbb{F}_p[X]$ is a polynomial of degree $d-1$. The value $f(u)$ equals a quadratic form

f(u)=[1uud1][f0,0f0,d1f1,0f1,d1fd1,0fd1,d1][1udu(d1)d]f(u)= \begin{bmatrix}1&u&\dots&u^{\sqrt{d}-1}\end{bmatrix} \begin{bmatrix}f_{0,0}&\cdots&f_{0,\sqrt{d}-1}\\f_{1,0}&\cdots&f_{1,\sqrt{d}-1}\\\vdots&\ddots&\vdots\\f_{\sqrt{d}-1,0}&\cdots&f_{\sqrt{d}-1,\sqrt{d}-1}\end{bmatrix} \begin{bmatrix}1\\ u^{\sqrt{d}}\\ \vdots\\ u^{(\sqrt{d}-1)\sqrt{d}}\end{bmatrix}

because $f(u)=\mathbf{u}^T\mathbf{F}\mathbf{u}'=\sum_{i=0}^{\sqrt{d}-1}\sum_{j=0}^{\sqrt{d}-1} f_{i,j}u^{i+j\sqrt{d}}$.
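This $\sqrt{d}\times\sqrt{d}$ evaluation identity can be sketched directly (the field and coefficients are illustrative):

```python
P = 2**31 - 1  # prime field, for illustration

def eval_via_tensor(F, u):
    """Evaluate f(u) from its sqrt(d) x sqrt(d) coefficient matrix F, using
    f(u) = row(u)^T * F * col(u), where row_i = u^i and col_j = u^{j*sqrt(d)}
    (entry F[i][j] is the coefficient of u^{i + j*sqrt(d)})."""
    s = len(F)                                   # sqrt(d)
    row = [pow(u, i, P) for i in range(s)]       # (1, u, ..., u^{s-1})
    col = [pow(u, j * s, P) for j in range(s)]   # (1, u^s, ..., u^{(s-1)s})
    return sum(row[i] * F[i][j] * col[j] for i in range(s) for j in range(s)) % P

# Cross-check against a direct evaluation of the flattened polynomial.
coeffs = list(range(1, 17))                      # f(X) = 1 + 2X + ... + 16X^15, d = 16
s = 4
F = [[coeffs[i + j * s] for j in range(s)] for i in range(s)]
u = 3
direct = sum(c * pow(u, t, P) for t, c in enumerate(coeffs)) % P
assert eval_via_tensor(F, u) == direct
```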

The setup phase samples a random hash function. In the commit phase, the prover encodes each row of $\mathbf{F}$ with an $[n,\sqrt{d}]$ linear code (the encoding is a public algorithm, e.g., a Reed-Solomon code), represented by a matrix $\mathbf{C}\in\mathbb{F}_p^{\sqrt{d}\times n}$. The result is a $\sqrt{d}\times n$ matrix:

P=[f0Cf1Cfd1C].\mathbf{P}=\begin{bmatrix}\mathbf{f}_0\mathbf{C}\\\mathbf{f}_1\mathbf{C}\\\vdots\\\mathbf{f}_{\sqrt{d}-1}\mathbf{C}\end{bmatrix}.

After encoding, we use a Merkle tree to generate the commitment, with the hash function from the global parameters. The leaf nodes are the $n$ columns of the matrix $\mathbf{P}$. The root hash is the commitment to the function.
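A minimal Merkle-tree commitment with openings, as used here for the columns of $\mathbf{P}$ (SHA-256 stands in for the sampled hash function; the column contents are dummy byte strings):

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_commit(leaves):
    """Commit to a list of byte-string leaves (padded to a power of two);
    returns all tree layers, with layers[-1][0] as the root commitment."""
    layer = [H(x) for x in leaves]
    while len(layer) & (len(layer) - 1):
        layer.append(H(b""))                     # pad to a power of two
    layers = [layer]
    while len(layer) > 1:
        layer = [H(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
        layers.append(layer)
    return layers

def merkle_open(layers, i):
    """Authentication path for leaf i: one sibling hash per level."""
    path = []
    for layer in layers[:-1]:
        path.append(layer[i ^ 1])                # sibling at this level
        i //= 2
    return path

def merkle_verify(root, i, leaf, path):
    """Recompute the root from leaf i and its authentication path."""
    h = H(leaf)
    for sib in path:
        h = H(h + sib) if i % 2 == 0 else H(sib + h)
        i //= 2
    return h == root

cols = [bytes([j]) * 4 for j in range(8)]        # stand-ins for the columns of P
layers = merkle_commit(cols)
root = layers[-1][0]
path = merkle_open(layers, 5)
assert merkle_verify(root, 5, cols[5], path)
assert not merkle_verify(root, 5, cols[4], path)
```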

Then the prover and verifier go through the following interaction to verify the claimed value $x=f(u)$:

\begin{align*} &\text{Prover}&&&\text{Verifier}\\ & &\xleftarrow{\mathbf{r}} & &\mathbf{r}\xleftarrow{\$}\mathbb{F}_p^{\sqrt{d}}\\ &\mathbf{w}\leftarrow\mathbf{r}^T\mathbf{F}\;(\text{originally we send encoded version }\mathbf{r}^T\mathbf{P}\text{, this is for optimization}) &\xrightarrow{\mathbf{w}} &\; &\mathbf{w}\leftarrow\mathbf{w}\mathbf{C}\text{ is indeed a codeword}\\ & &\xleftarrow{s} &\; &s\xleftarrow{\$}\mathbb{F}_p\\ &\text{Generate Merkle proof }\pi\text{ that the }s\text{-th column of }\mathbf{P}\text{ equals to }\mathbf{v}\in\mathbb{F}_p^{\sqrt{d}} &\xrightarrow{\pi,\mathbf{v}} &\; &\text{verify the proof }\pi\text{, and verify }\mathbf{r}^T\mathbf{v}=\mathbf{w}_s\text{, then repeat sampling }s\text{ multiple times}\\ &\mathbf{w}'\leftarrow\mathbf{u}^T\mathbf{F} &\xrightarrow{\mathbf{w}'} &\; &\mathbf{w}'\leftarrow\mathbf{w}'\mathbf{C}\text{ is indeed a codeword}\\ & &\xleftarrow{s} &\; &s\xleftarrow{\$}\mathbb{F}_p\\ &\text{Generate Merkle proof }\pi\text{ that the }s\text{-th column of }\mathbf{P}\text{ equals to }\mathbf{v}\in\mathbb{F}_p^{\sqrt{d}} &\xrightarrow{\pi,\mathbf{v}} &\; &\text{verify the proof }\pi\text{, and verify }\mathbf{r}^T\mathbf{v}=\mathbf{w}'_s\text{, then repeat sampling }s\text{ multiple times}\\ & & &\; &\mathbf{w}'\mathbf{u}'\stackrel{?}{=}x \end{align*}

Analysis of the protocol is as follows:

  • Keygen: O(1)O(1), transparent
  • Commit:
    • Encoding: $O(d \log d)$ field multiplications using a Reed-Solomon code, or $O(d)$ using a linear-time encodable code
    • Merkle tree: O(d)O(d) hashes, O(1)O(1) commitment size
  • Prover time: O(d)O(d) field multiplications
  • Proof size: O(d)O(\sqrt{d})
  • Verifier time: O(d)O(\sqrt{d})
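To make the column check above concrete, here is a toy sketch in Python. All parameters — the prime $p=41$, the $2\times 2$ matrix $\mathbf{F}$, the evaluation points, and the "random" challenges — are made up for illustration, and the Merkle tree is omitted:

```python
# Toy version of the column check: P is the row-wise Reed-Solomon encoding of
# F, the verifier sends r, the prover answers w = r^T F, and consistency is
# checked against one opened column of P.
p = 41                       # small prime field F_p
xs = [1, 2, 3, 4]            # evaluation points of the Reed-Solomon code (n = 4)

def rs_encode(row):
    # Encode a length-2 message (a0, a1) as evaluations of a0 + a1*X.
    return [(row[0] + row[1] * x) % p for x in xs]

F = [[3, 7], [5, 11]]                  # sqrt(d) x sqrt(d) coefficient matrix
P = [rs_encode(row) for row in F]      # committed matrix (its columns are leaves)

r = [17, 29]                           # verifier's random vector
w = [(r[0] * F[0][j] + r[1] * F[1][j]) % p for j in range(2)]  # w = r^T F
w_hat = rs_encode(w)                   # verifier re-encodes w itself

s = 2                                  # sampled column index
v = [P[0][s], P[1][s]]                 # prover opens column s of P
assert (r[0] * v[0] + r[1] * v[1]) % p == w_hat[s]  # r^T v == (wC)_s
```

By linearity of the code, $\mathbf{r}^T\mathbf{P}$ is exactly the encoding of $\mathbf{r}^T\mathbf{F}$, which is why the per-column check works.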

Fast Reed-Solomon IOPP

Intro: Merkle trees for univariate poly-commitment

An intuitive attempt is to commit to the vector of evaluations of a given polynomial $f\in\mathbb{F}_p^{\leq d}[X]$. The leaves of the tree are $\{f(x):x\in\mathbb{F}_p\}$, and the tree hashes every pair of neighboring nodes to form the next layer. When $V$ queries $f(r)$, the prover sends the hash values along the authentication path up to the root.

The Merkle tree commitment requires $O(pd)$ field multiplications (evaluating $f$ everywhere with Horner's method, also known as Qin Jiushao's algorithm) to commit, and the proof size is $O(\log p)$ hash values. The verifier computes $O(\log p)$ hashes. The setup only fixes a hash function, so it is transparent.
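A minimal sketch of such a Merkle commitment, with hypothetical helper names; for simplicity the evaluation domain is the $16$ points $0,\dots,15$ rather than all of $\mathbb{F}_p$, so that the number of leaves is a power of two:

```python
# Commit to the evaluation vector of f with a Merkle tree, then open and
# verify one evaluation. H, merkle_* are illustrative helper names.
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_layers(leaves):
    layers = [[H(str(v).encode()) for v in leaves]]
    while len(layers[-1]) > 1:
        cur = layers[-1]
        layers.append([H(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return layers                      # layers[-1][0] is the root (commitment)

def merkle_open(layers, idx):
    path = []                          # sibling hash at every level
    for layer in layers[:-1]:
        path.append(layer[idx ^ 1])
        idx //= 2
    return path

def merkle_verify(root, idx, leaf_value, path):
    h = H(str(leaf_value).encode())
    for sib in path:
        h = H(h + sib) if idx % 2 == 0 else H(sib + h)
        idx //= 2
    return h == root

p = 17
f = lambda x: (3 * x**2 + 5 * x + 7) % p        # some f in F_p[X]
evals = [f(x) for x in range(16)]               # 16 leaves (a power of two)
layers = merkle_layers(evals)
root = layers[-1][0]                            # the commitment

idx = 5                                         # verifier queries f(5)
path = merkle_open(layers, idx)
assert merkle_verify(root, idx, evals[idx], path)
```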

There are 2 problems:

  • The field may be very large, and the commitment time is linear in $p$ times the max degree.
  • The verifier cannot know if ff has degree at most dd.

Fixing Problem 1

FRI commitment uses a multiplicative subgroup of $\mathbb{F}_p$: $\Omega=\{\omega^i:i=0,1,\dots,n-1\}$ with $\omega^n=1$; for example, $\{1,3,9,27,-1,-3,-9,-27\}$ in $\mathbb{F}_{41}$. The leaves of the Merkle tree will be the values $f(\omega^i)$.

We call $\rho=\frac{d}{n}$ the rate; its inverse $\rho^{-1}=\frac{n}{d}$ is the FRI blowup factor.

Fixing Problem 2

\begin{align*} &\text{Prover} & & &\text{Verifier}\\ &f_0=f,\ \deg f_0\leq k-1,\ \Omega_0=\{1,\omega,\dots,\omega^{n-1}\},\ n=\rho^{-1}k &\xrightarrow{\mathsf{com}_0=\mathsf{MerkleCommit}(f_0|_{\Omega_0})} & &\\ \text{round }1:\ &f_0(X)=f_{0,e}(X^2)+Xf_{0,o}(X^2) &\xleftarrow{r_1} & &r_1\xleftarrow{\$}\mathbb{F}_p\\ &f_1(Z)=f_{0,e}(Z)+r_1f_{0,o}(Z),\ \deg f_1\leq\tfrac{k}{2}-1,\ \Omega_1=\{x^2:x\in\Omega_0\} &\xrightarrow{\mathsf{com}_1=\mathsf{MerkleCommit}(f_1|_{\Omega_1})} & &\\ \text{round }2:\ &f_1(X)=f_{1,e}(X^2)+Xf_{1,o}(X^2) &\xleftarrow{r_2} & &r_2\xleftarrow{\$}\mathbb{F}_p\\ &f_2(Z)=f_{1,e}(Z)+r_2f_{1,o}(Z),\ \deg f_2\leq\tfrac{k}{4}-1,\ \Omega_2=\{x^2:x\in\Omega_1\} &\xrightarrow{\mathsf{com}_2=\mathsf{MerkleCommit}(f_2|_{\Omega_2})} & &\\ &\dots & & &\dots\\ \text{round }t=\log_2k:\ &f_{t-1}(X)=f_{t-1,e}(X^2)+Xf_{t-1,o}(X^2) &\xleftarrow{r_t} & &r_t\xleftarrow{\$}\mathbb{F}_p\\ &f_t(Z)=f_{t-1,e}(Z)+r_tf_{t-1,o}(Z),\ \deg f_t=0,\ \Omega_t=\{x^{2^t}:x\in\Omega_0\} &\xrightarrow{\mathsf{com}_t=f_t|_{\Omega_t}} & &\text{verify that }f_t|_{\Omega_t}\text{ is a constant vector}\\ \text{query}:\ &\text{for each }j_q\text{, open the paths from layer }t=\log_2k\text{ down to layer }0: &\xleftarrow{\{j_q\}_{q=1}^s} & &\text{sample }j_1,\dots,j_s\xleftarrow{\$}[0,|\Omega_0|-1]\\ &\pi_{t,j_q}\leftarrow\mathsf{MerkleOpen}(\mathsf{com}_t,j_q)\\ &\pi_{t-1,j_q^+},\pi_{t-1,j_q^-}\leftarrow\mathsf{MerkleOpen}(\mathsf{com}_{t-1},\text{the two corresponding indices})\\ &\dots &\xrightarrow{\text{all opened values and paths}} & &\text{verify the consistency of every path; moreover,}\\ &&&&\text{at layer }t:\ f_t(\omega_t^{j_q})\text{ is the constant}\\ &&&&\text{at layer }i:\ \text{check }f_{i+1}(z)\stackrel{?}{=}\frac{r_{i+1}+x}{2x}f_i(x)+\frac{r_{i+1}-x}{-2x}f_i(-x)\\ &&&&\text{where }z=x^2,\ x=\omega_i^{j_q},\text{ and }\omega_i\text{ is the generator of layer }i \end{align*}

Remark.

  • It is easy to compute $f_{i+1}|_{\Omega_{i+1}}$ from $f_i|_{\Omega_i}$, because $f_{i+1}(x^2)=f_{i,e}(x^2)+r_{i+1}f_{i,o}(x^2)=\frac{f_i(x)+f_i(-x)}{2}+r_{i+1}\frac{f_i(x)-f_i(-x)}{2x}$. The verifier's consistency check relies on this same equation.
  • To analyze the soundness, we introduce the relative Hamming distance $\mathsf{Ham}_{\Omega}(f,g)=\frac{|\{x\in\Omega:f(x)\neq g(x)\}|}{|\Omega|}$ and $\delta=\min_{h\in\mathbb{F}_p^{\leq k-1}[X]}\left(\mathsf{Ham}_{\Omega}(f,h)\right)$, the distance on $\Omega$ from $f$ to the closest degree-$(k-1)$ polynomial.
  • We consider $f$ with distance $\delta$ up to $1-\sqrt{\rho}$ (the Johnson bound); such an $f$ is far from every degree-$(k-1)$ polynomial.
  • A cheating prover holds a polynomial $f$ of degree greater than $k-1$. If the verifier accepts, then either some fold was computed incorrectly, or every fold was correct and the prover was lucky that the $r_i$'s eventually reduce the degree to $0$. In the first case, the accept probability over the $s$ queries is at most $(1-\delta)^s$; in the second case, it is at most $\frac{k}{p}$. As a result, to reach an accept probability of $2^{-\lambda}$ ($\lambda$ is the security parameter), $s$ should be $\Omega\left(\frac{\lambda}{\log\rho^{-1}}\right)$.

(The security proof is problematic.)
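The folding identity in the first remark can be sanity-checked numerically, here over $\mathbb{F}_{41}$ with the order-8 subgroup generated by $3$ (the polynomial and the challenge are arbitrary choices for illustration):

```python
# Check f1(x^2) = ((r1+x)/(2x)) f0(x) + ((r1-x)/(-2x)) f0(-x) for every x in
# Omega_0, where f1 = f0_e + r1 * f0_o is one FRI folding step.
p = 41

def ev(coeffs, x):                       # Horner evaluation over F_p
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

f0 = [7, 3, 9, 2, 5, 1, 8, 4]            # degree-7 polynomial, k = 8
f0_e, f0_o = f0[0::2], f0[1::2]          # even / odd coefficient halves
r1 = 23                                  # verifier's folding challenge
f1 = [(e + r1 * o) % p for e, o in zip(f0_e, f0_o)]   # f1 = f0_e + r1*f0_o

inv = lambda a: pow(a, p - 2, p)         # modular inverse via Fermat
for j in range(8):
    x = pow(3, j, p)                     # x ranges over Omega_0 = <3>
    lhs = ev(f1, (x * x) % p)            # f1(x^2)
    rhs = ((r1 + x) * inv(2 * x) * ev(f0, x)
           + (r1 - x) * inv(-2 * x % p) * ev(f0, p - x)) % p
    assert lhs == rhs
```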

Polynomial commitment from FRI

FRI leaves another two problems:

  • Prover has only committed to evaluations on a subset ΩFp\Omega\subset\mathbb{F}_p.
  • Verifier only knows that ff is close to a low-degree polynomial (in the sense of Hamming distance), but not necessarily a low-degree polynomial.

The first problem enables an attack: the prover can commit to a polynomial $g$ of much lower degree that agrees with $f$ on some $T\subset\Omega$ with $|T|=k$. Then $\delta\leq 1-\rho$, so FRI accepts, yet $g\neq f$ outside $T$ and the soundness is broken.

Then, to confirm that $f(r)=v$ where $f\in\mathbb{F}_p^{\leq d}[X]$, the prover applies FRI to the quotient $(f(x)-v)/(x-r)$: the division is exact precisely when $f(r)=v$, in which case the quotient has degree at most $d-1$.
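A small sketch of this quotient trick: dividing $f(X)-v$ by $(X-r)$ with synthetic division leaves remainder zero exactly when $f(r)=v$, and then the quotient has degree $d-1$ (the prime, polynomial, and point below are made up):

```python
# Synthetic division of f(X) - v by (X - r) over F_p: the remainder equals
# (f - v) evaluated at r, so exact division certifies f(r) = v.
p = 97

def divide_by_linear(coeffs, r):
    # coeffs = [a0, a1, ..., ad] for a0 + a1*X + ...; divide by (X - r).
    q = [0] * (len(coeffs) - 1)
    rem = coeffs[-1]
    for i in range(len(coeffs) - 2, -1, -1):
        q[i] = rem
        rem = (coeffs[i] + rem * r) % p
    return q, rem            # f(X) = q(X) * (X - r) + rem

f = [12, 7, 0, 3, 5]         # a degree-4 polynomial over F_97
r = 9
v = sum(c * pow(r, i, p) for i, c in enumerate(f)) % p   # v = f(r)

shifted = [(c - (v if i == 0 else 0)) % p for i, c in enumerate(f)]  # f - v
q, rem = divide_by_linear(shifted, r)
assert rem == 0              # exact division certifies f(r) = v
assert len(q) == len(f) - 1  # quotient has degree d - 1
```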

Fiat-Shamir transform

Definition. (Interactive security) A poly-commitment scheme has $\lambda$ bits of interactive security if and only if: assuming $P^*$ cannot find a collision of the hash function in the global parameters, every $P^*$ makes the verifier accept a false claim with probability at most $2^{-\lambda}$.

To find a lucky challenge, the prover would have to interact with the verifier about $2^\lambda$ times on average, and it is unlikely that $V$ keeps interacting after rejecting the same prover for so many rounds. This motivates the following definition.

Definition. (Non-interactive security) A poly-commitment scheme has $\lambda$ bits of non-interactive security if and only if: every $P^*$ willing to compute $2^k$ hashes makes the verifier accept a false claim with probability at most $2^{k-\lambda}$.

In this setting, a lying $P$ can mount the grinding attack silently, without any interaction. This definition is weaker than interactive security: if a prover breaks the interactive security of a protocol, then it can also break the non-interactive security of the Fiat-Shamir transform of that protocol.

We show that the Fiat-Shamir transform may be insecure. Consider a protocol for the empty language: in each round, $P$ sends a nonce, $V$ replies with a random bit $r$ and accepts the round iff $r=1$; $V$ accepts overall iff it accepts every round, so the soundness error is $2^{-\lambda}$ with $\lambda$ rounds. However, the Fiat-Shamir transform of this protocol is insecure: per round, the prover can grind nonces, trying $2$ on average, until the hash of the transcript yields $r=1$. The total cost is $2\lambda$ hashes on average, far less than $2^\lambda$, so the transformed protocol does not satisfy non-interactive security.
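The grinding attack on this transformed protocol is easy to simulate; the hash-to-bit rule below is an arbitrary stand-in for the Fiat-Shamir challenge derivation:

```python
# Simulate grinding against the Fiat-Shamir transform of the lambda-round
# empty-language protocol: each round's challenge bit is the parity of the
# first byte of SHA-256(transcript || nonce), so the prover just retries.
import hashlib

lam = 20                     # number of rounds (security parameter)
transcript = b""
total_hashes = 0
for rnd in range(lam):
    nonce = 0
    while True:              # grind until this round's challenge bit is 1
        total_hashes += 1
        h = hashlib.sha256(transcript + nonce.to_bytes(8, "big")).digest()
        if h[0] & 1 == 1:    # r = 1: the verifier would accept this round
            transcript += nonce.to_bytes(8, "big")
            break
        nonce += 1

# Roughly 2 hashes per round, i.e. about 2*lam in total -- nowhere near 2^lam.
assert total_hashes < 100 * lam
```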

Applying Fiat-Shamir to a many-round interactive protocol can lead to a huge loss in security, whereby the resulting non-interactive protocol is totally insecure. So we need "round-by-round" soundness to describe the security of an interactive protocol: roughly, whenever the protocol is in a "doomed" state (one from which the claim being proven is false), the probability that any single round moves the protocol out of the doomed state is at most $2^{-\lambda}$.

The sum-check protocol is round-by-round sound, so its Fiat-Shamir transform is secure.

Linear PCP-based SNARK

Here is another paradigm for SNARKs: a cryptographic tool combined with a linear PCP yields a SNARK system. Before the IOP + PCS paradigm was put forward, the main paradigm was the PCP: e.g., Kilian'92 and Micali'00 use PCP + Merkle tree; IKO'07 uses a linear PCP; GGPR'13 uses a QAP as the linear PCP.

Quadratic Arithmetic Program (QAP)

QAP is another method to encode a circuit into a polynomial. Recall that in the first IOP for circuit SAT, we label each gate with a bitstring, and the satisfying assignment is encoded into a multilinear polynomial on the hypercube (extended to $\mathbb{F}_p^n$). In PLONK, we label each wire with an integer, and the satisfying assignment is encoded into a univariate polynomial on the roots of unity.

QAP labels the inputs and outputs of all multiplication gates. We label each multiplication gate with an integer in $[n]$ and each wire (except the output wires of addition gates) with an integer in $[m]$. The value of wire $i$ is $c_i$. Then we define $l_i(x)$ on the roots of unity ($\omega^n=1$) by $l_i(\omega^j)=1$ iff wire $i$ is the left input of gate $j$, and $l_i(\omega^j)=0$ otherwise; $r_i(x)$ and $o_i(x)$ are defined likewise for right inputs and outputs.

Note that we also say wire $i$ is the left input of gate $j$ if it passes through several addition gates before entering gate $j$ as the left input. Then we have: the values on the circuit are proper iff

V(x)=i[n](xωi)  |  L(x)R(x)O(x)\left.V(x)=\prod_{i\in[n]}(x-\omega^i)\;\middle|\;L(x)R(x)-O(x)\right.

where

Lc(x)=i[m]cili(x),Rc(x)=i[m]ciri(x),Oc(x)=i[m]cioi(x)L_{\mathbf{c}}(x)=\sum_{i\in[m]}c_il_i(x),R_{\mathbf{c}}(x)=\sum_{i\in[m]}c_ir_i(x),O_{\mathbf{c}}(x)=\sum_{i\in[m]}c_io_i(x)

exactly encodes the left-input, right-input and output of gate jj with Lc(ωj)L_{\mathbf{c}}(\omega^j), Rc(ωj)R_{\mathbf{c}}(\omega^j), and Oc(ωj)O_{\mathbf{c}}(\omega^j).

Thus, the statement that $P$ knows $w$ such that $C(x,w)=0$ holds iff $P$ knows a vector $\mathbf{c}$ such that $L_{\mathbf{c}}(x)R_{\mathbf{c}}(x)-O_{\mathbf{c}}(x)=q(x)\prod_{i\in[n]}(x-\omega^i)=q(x)V(x)$ for some polynomial $q(x)$.
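A tiny QAP instance, sketched for the two-gate circuit $c_3=c_1c_2$, $c_5=c_3c_4$ over $\mathbb{F}_{97}$ (the gate points, wire values, and field are made up; since $L_{\mathbf{c}}R_{\mathbf{c}}-O_{\mathbf{c}}$ has degree at most $2$ here, divisibility by $V$ is checked by testing that both gate points are roots):

```python
# QAP for two multiplication gates labelled by the points {1, -1} in F_97.
# All selector polynomials are degree-1 interpolations through those points.
p = 97
pts = [1, p - 1]                         # the n = 2 gate labels "omega^j"

def interp(vals):                        # degree-1 interpolation through pts
    (x0, y0), (x1, y1) = zip(pts, vals)
    a1 = (y1 - y0) * pow(x1 - x0, p - 2, p) % p
    a0 = (y0 - a1 * x0) % p
    return [a0, a1]                      # coefficients [a0, a1]

# l_i / r_i / o_i values at the two gates, for wires c1..c5:
L_sel = [interp(v) for v in [[1, 0], [0, 0], [0, 1], [0, 0], [0, 0]]]
R_sel = [interp(v) for v in [[0, 0], [1, 0], [0, 0], [0, 1], [0, 0]]]
O_sel = [interp(v) for v in [[0, 0], [0, 0], [1, 0], [0, 0], [0, 1]]]

c = [3, 5, 15, 2, 30]                    # satisfying wire values
comb = lambda sel: [sum(ci * s[k] for ci, s in zip(c, sel)) % p
                    for k in range(2)]
Lc, Rc, Oc = comb(L_sel), comb(R_sel), comb(O_sel)

ev = lambda poly, x: sum(a * pow(x, i, p) for i, a in enumerate(poly)) % p
# V(x) = (x - 1)(x + 1) divides Lc*Rc - Oc iff both gate points are roots:
for x in pts:
    assert (ev(Lc, x) * ev(Rc, x) - ev(Oc, x)) % p == 0
```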

Constructing SNARK: [PGHR13] and [Groth16]

The global parameter is $gp=\left(\left(g^{\tau^i}\right)_{i\in[n]},\left(g^{l_i(\tau)}\right)_{i\in[m]},\left(g^{r_i(\tau)}\right)_{i\in[m]},\left(g^{o_i(\tau)}\right)_{i\in[m]},g^{V(\tau)}\right)$ where $\tau$ is a random element of $\mathbb{F}_p$ (deleted after setup).

With KZG in our hands, we arrive at a simple protocol: the prover can compute $\pi_l=g^{L_{\mathbf{c}}(\tau)}$, $\pi_r=g^{R_{\mathbf{c}}(\tau)}$, $\pi_o=g^{O_{\mathbf{c}}(\tau)}$, and $\pi=g^{q(\tau)}$ by homomorphically combining the global parameters. The verifier checks whether $e(\pi_l,\pi_r)=e(\pi_o,g)e(\pi,g^{V(\tau)})$.

However, this simple protocol is still far from the real one.

Problem 1: How to make sure that πl\pi_l is computed from gli(τ)g^{l_i(\tau)}?

KoE Assumption. We use $gp=\left(g^{l_i(\tau)},g^{\alpha l_i(\tau)}\right)_{i\in[m]}$. If the prover can compute $\pi_1=g^{\sum_ic_il_i(\tau)}$ and $\pi_2=g^{\alpha\sum_ic_il_i(\tau)}$ without knowing $\alpha$, then there is an extractor that extracts the $c_i$'s from the prover. Pinocchio [PGHR13] uses the KoE assumption.

GGM Assumption. Any prover can only compute linear combinations of the $g^{l_i(\tau)}$'s; i.e., if the prover can compute $\pi=g^{\sum_ic_il_i(\tau)}$, then it must know every $c_i$. [Groth16] works in the generic group model (GGM).

Problem 2: How to make sure every $\pi$ shares the same $c_i$'s?

We use another random $\beta$ in the setup (also deleted afterwards), adding $g^{\beta(l_i(\tau)+r_i(\tau)+o_i(\tau))}$ and $g^{\beta}$ to the global parameters; a pairing check against these elements forces $\pi_l$, $\pi_r$, and $\pi_o$ to be built from the same $c_i$'s.

Problem 3: How to handle the public input and the expected output?

Recall that all wires except the outputs of addition gates are labelled, so the input and output wires (if unlabelled, add a multiply-by-$1$ gate) carry labels in $I_{io}$. The idea is to split $L_{\mathbf{c}}$ into a public part and a witness part, and let the verifier compute $\sum_{i\in I_{io}}c_il_i(x)$ by itself.

Analysis

  • Setup: $O(n)$ group exps, since there is a unique $j$ with $l_i(\omega^j)=1$, provided that distinct wires sharing the same output gate receive distinct labels.
  • Prover: $O(n\log n)$ for NTT. The prover computes the values of the polynomial $q$ on (a coset of) $\langle\omega\rangle$ and uses an inverse NTT to obtain its coefficients; only with the coefficients can the prover compute $g^{q(\tau)}$ from the powers $g^{\tau^i}$. Also $O(n)$ group exps.
  • Proof size: O(1)O(1)
  • Verifier: O(1)O(1) for pairing, O(Iio)O(|I_{io}|) group exps.

Recursive SNARKs

Recall if the circuit size is nn:

  • Pinocchio and Groth16: prover time O(nlogn)O(n\log n) and proof size O(1)O(1)
  • PLONK-KZG: prover time $O(n\log n)$ (the prover computes NTTs for every gadget) and proof size $O(1)$
  • FRI-based: prover time $O(n\log n)$, proof size and verifier time $O(\log^2 n)$

How can we achieve the best of both prover time and proof size? A recursive proof provides a "proof of a proof". Let the inner system be $(S,P,V)$, which proves that the prover knows $w$ such that $C(x,w)=0$. Suppose the inner prover and verifier are fast but the proof $\pi$ is long. The idea is to construct another proof system $(S',P',V')$ proving that the prover knows a $\pi$ such that $V(vp,x,\pi)=1$. The outer system has a slower prover, but since the verifier $V$ is fast, the circuit of $V$ is much smaller than that of the original $C$, so the outer prover $P'$ is not too slow.

A simple argument shows that if $(S,P,V)$ and $(S',P',V')$ are both knowledge sound, then the composed proof system is also knowledge sound. The idea of the proof is to regard the extractor $E'$ as a malicious prover of $(S,P,V)$. The knowledge error of the composed system is the sum of those of the two original systems, thus negligible.

Application 1: incrementally verifiable computation

Suppose a computation $F$ is applied to an initial state $s_0$ recursively, each round taking an input $w_i$. The prover wants to prove that it knows $w_i$'s such that each computation step is correct; the verifier wants to verify that after $n$ steps, the state $s_0$ eventually becomes $s_n$. (TODO)

Application 2: streaming proof generation

Suppose there is a stream of transactions to be proved. We need not wait for all the transactions to take place; rather, we generate proofs as each batch arrives. For example, with batch size $1$: first $\pi_1$ proves the prover knows $w_1$ such that $C(x_1,w_1)=0$; then $\pi_2$ proves the prover knows $w_2$ such that $C(x_2,w_2)=0$; finally, a proof is generated showing that the prover knows $\pi_1,\pi_2,\dots$ that the verifier would accept.

Construction: alternating groups

Recall that in KZG, the public parameter is a tuple $(p,G,q,g,e)$, where $G$ is a group of order $p$ with generator $g$. In this section we regard the verifier algorithm as a circuit, so we had better embed $G$ into some vector space over $\mathbb{F}_q$.

Definition (Algebraic Groups). Group GFqG\leq\mathbb{F}_q^\ell is an algebraic group if and only if
  • there are polynomials $f_1,\dots,f_\ell\in\mathbb{F}_q[X_1,\dots,X_{2\ell}]$ such that for all $a,b\in G$, $a+b=\left(f_1(a,b),\dots,f_\ell(a,b)\right)$;
  • there is an efficient algorithm testing if a=ba=b .

Can we make $G$ with $|G|=p$ a subgroup of $\mathbb{F}_p^\ell$? No, because the discrete log is easy in such groups; take Smart's attack on anomalous curves as an example.

The idea is to construct a chain of groups: $|G_1|=p$ with $G_1<\mathbb{F}_q^\ell$, and $|G_2|=q$ with $G_2<\mathbb{F}_r^\ell$. The original circuit $C$ is over $\mathbb{F}_p$. The KZG PCS helps us understand this recursive SNARK: in that scheme, exponentiation embeds every element of $\mathbb{F}_p$ into $G_1<\mathbb{F}_q^\ell$. Now that $V$ is a circuit over $\mathbb{F}_q$, the proof that the prover $P'$ knows a proof $\pi_1$ involves group arithmetic over $\mathbb{F}_q$; since $G_2$ has order $q$, $P'$ sends a proof in $G_2$.

However, it is inefficient to have two pairing groups. With one pairing group and one non-pairing group, we can use KZG for the former and Bulletproofs for the latter. Halo is based on two non-pairing groups, the elliptic curve groups $E(\mathbb{F}_p)$ and $E(\mathbb{F}_q)$ of the curve $y^2=x^3+5$. Details: https://eprint.iacr.org/2019/1021.pdf

Construction: folding

Another idea is homomorphic commitment, with which we can compute a commitment to a sum simply by adding the commitments. We use this to prove two circuits at once (generate one proof for the two circuits). More precisely, all circuits can be written in R1CS form $(Az)\circ(Bz)=Dz$. To prove two instances $z_1=(x_1,w_1)$, $z_2=(x_2,w_2)$ of the same R1CS at once, one idea is to randomly choose $r$ and prove that the combination $z_1+rz_2$ satisfies the R1CS. However, even if $z_1,z_2$ are satisfying, $(Az_1+rAz_2)\circ(Bz_1+rBz_2)=D(z_1+rz_2)$ need not hold, because of the cross term $r\left((Az_1)\circ(Bz_2)+(Az_2)\circ(Bz_1)\right)$. Thus we modify the definition and arrive at Relaxed R1CS.
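A sketch of the relaxed-R1CS fix, in the style of Nova's folding: the relaxed relation $(Az)\circ(Bz)=u\cdot(Dz)+E$, the cross term $T$, and all concrete numbers below are illustrative assumptions, not the exact scheme:

```python
# Fold two satisfying R1CS instances of z0 * z1 = z2 into one relaxed
# instance: z = z1 + r*z2, u = u1 + r*u2, E = E1 + r*T + r^2*E2, where the
# cross term T absorbs the mixed products.
p = 101

def matvec(M, z):
    return [sum(a * b for a, b in zip(row, z)) % p for row in M]

def had(x, y):                            # Hadamard (entry-wise) product
    return [(a * b) % p for a, b in zip(x, y)]

def satisfies(A, B, D, z, u, E):          # relaxed R1CS relation
    lhs = had(matvec(A, z), matvec(B, z))
    rhs = [(u * d + e) % p for d, e in zip(matvec(D, z), E)]
    return lhs == rhs

# One constraint z0 * z1 = z2 (A, B, D are 1 x 3 matrices):
A, B, D = [[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]]

z1, z2 = [3, 5, 15], [4, 7, 28]           # two satisfying (strict) instances
u1 = u2 = 1
E1 = E2 = [0]

r = 19                                    # folding challenge
T = [(ab2 + ab1 - u1 * d2 - u2 * d1) % p
     for ab2, ab1, d2, d1 in zip(had(matvec(A, z1), matvec(B, z2)),
                                 had(matvec(A, z2), matvec(B, z1)),
                                 matvec(D, z2), matvec(D, z1))]

z = [(a + r * b) % p for a, b in zip(z1, z2)]
u = (u1 + r * u2) % p
E = [(e1 + r * t + r * r * e2) % p for e1, t, e2 in zip(E1, T, E2)]

assert satisfies(A, B, D, z, u, E)        # folded instance satisfies
assert not satisfies(A, B, D, z, 1, [0])  # naive combination fails
```

Expanding $(A z)\circ(B z)$ for $z=z_1+rz_2$ shows the $r$-linear terms are exactly $T$, which is why absorbing them into $E$ makes the folded instance satisfy the relaxed relation.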

HW

  • ❎ Knowledge soundness is a meaningful notion when a prover claims that there are no satisfying assignments to a system of constraints
  • ❎ Knowledge soundness is a meaningful notion for the sumcheck protocol
  • ✅ Knowledge soundness (as defined in lecture 2) implies soundness
  • ❎ Non-interactive implies publicly verifiable
  • ❎ Vector commitments are as expressive as polynomial commitments
  • ✅ Polynomial extensions are distance amplifying encodings
  • ✅ Multivariate polynomial encoding reduces the total degree by an exponential factor compared to univariate polynomial encoding
  • ✅ The verifier in the sum-check protocol is oblivious to the polynomial g whose evaluations are being summed until the last step
  • ❎ The uniqueness of multilinear extensions is crucial for the soundness of the sumcheck protocol
  • ❎ The claim f(x) = g(x) for all k-bit inputs, where f and g are low degree polynomials over a large field, can be reduced to the following sumcheck claim: $\sum_{x\in\{0,1\}^k}(f(x)-g(x))=0$
  • ❎ The polynomial IOP for SAT (from lecture 4) can be transformed into a SNARK with verifier complexity independent of circuit size
  • ✅ The polynomial IOP for SAT (from lecture 4) has optimal prover complexity
  • ✅ Extending the polynomial IOP for SAT in the natural way to support gates with n inputs will result in a sumcheck protocol over (n+1) * logS variables

Verkle trees