Plonkish Arithmetization

This is a key component of the PLONK protocol, before diving in, let's explore the term "arithmetization" in the context of Zero Knowledge;

Arithmetization is the process of converting computational statements into mathematical equations, specifically polynomial equations, that can be efficiently proven and verified. It's like translating a computer program into a mathematical language that zero-knowledge proofs can understand.

Here is the basic building blocks of the concept of Arithmetization;

  • Variables: Represent values in computation

  • Constraints: Mathematical relationships between variables

  • Polynomials: Used to encode relationships and checks

  • Fields: Mathematical structures where computation occurs

Over the years many forms of Arithmetization have been laid out, let's explore a few;

Rank-1 Constrant System (R1CS)

  • Used in protocols like Groth16

  • Represents computation as: AxBx=CxA⋅x ∘ B⋅x = C⋅x

  • Where:

    • A,B,CA, B, C are matrices

    • xx is the witness vector

    • represents element-wise multiplication

Plonkish Arithmetization

  • Used in PLONK and its variants

  • Gate format: qLa+qRb+qOc+qM(ab)+qC=0q_L⋅a + q_R⋅b + q_O⋅c + q_M⋅(a⋅b) + q_C = 0

  • Where:

    • a,b,ca, b, c are witness values

    • qL,qR,qO,qM,qCq_L, q_R, q_O, q_M, q_C are selector polynomials, indicating the gate operations

Algebraic Intermediate Representation

  • Used in STARKs

  • Represents computation as polynomial constraints over execution trace

  • Constraints form: f(x1,x2,...,xn)=0f(x_1, x_2, ..., x_n) = 0

Common Steps in Arithmetization

Step 1: Statement → Circuit

  • Convert computation into logical/arithmetic circuit

  • Identify inputs, outputs, and intermediate values

Step 2: Circuit → Constraints

  • Transform circuit gates into mathematical constraints

  • Ensure constraints capture computational logic

Step 3: Constraints → Polynomials

  • Convert constraints into polynomial equations

  • Ensure polynomial degrees are manageable

Step 4: Optimization

  • Reduce the number of constraints

  • Minimize polynomial degrees

  • Optimize for proof size/verification time

Back to PLONK Arithmetization

Say we are given this computation; com=xy(p+q)com = xy(p+q)

it circuit execution trace would be as follows;

z=xyz = x * y
r=p+qr = p + q

out=zrout = z * r

when the circuit is being executed all these variables would be taking a concrete value, forming the execution trace table with which proof would be generated;

This is how the circuit would look like when x=2x = 2, y=3y = 3, p=4p = 4, and q=5q = 5.

A
B
C

2

3

6

4

5

9

6

9

54

Using this execution trace, we can now introduce the constraint gate, encoding the gate operation and ensuring correct execution. This would introduce the vanilla PLONK gate equation.

qLa+qRb+qOc+qM(ab)+qC=0q_L⋅a + q_R⋅b + q_O⋅c + q_M⋅(a⋅b) + q_C = 0

where the a,b,a, b, and cc are the left, right, and output wires of the gate.

To say a+b=ca+b=c, we let qLq_L and qRq_R equal 11, while qOq_O equals 1-1. The other coefficients wouldn't be used, so we set qMq_M and qCq_C to 00.

To say a×b=ca×b = c, we set qOq_O equal to 1-1, set qMq_M equal 11, and set the rest of the coefficients to 00.

To bind a variable to a public value, we let qR,qO,qMq_R, q_O, q_M equal to 00, qLq_L equal to 1, and qCq_C equal to the public value.

so for the 3 gates we are working with the gate equations would look like this;

gate 1: 02+03+16+1(23)+0=00⋅2 + 0⋅3 + -1⋅6 + 1⋅(2⋅3) + 0 = 0; this equation holds.

gate 2: 14+15+19+0(45)+0=01⋅4 + 1⋅5 + -1⋅9 + 0⋅(4⋅5) + 0 = 0; this equation holds.

gate 3: 06+09+154+1(69)+0=00⋅6 + 0⋅9 + -1⋅54 + 1⋅(6⋅9) + 0 = 0; this equation holds.

the next is to collect all these vectors so we can build just one relationship with which the protocol can progress;

qL=(0,1,1)q_L = (0,1,1)

qR=(0,1,0)q_R = (0,1,0)

qO=(1,1,1)q_O = (-1,-1,-1)

qM=(1,0,1)q_M = (1,0,1)

qC=(0,0,0)q_C = (0,0,0)

a=(2,4,6)a = (2,4,6)

b=(3,5,9)b = (3,5,9)

c=(6,9,54)c = (6,9,54)

Next up is to interpolate these vectors to obtain a polynomial, the nature of this interpolation is unique, and this interpolation would be done on the Domain of the roots of unity. But our example would encounter a problem seeing that the size of the vector is 33 which is not a power of 22 which is needed to manipulate/obtain a multiplicative sub-group over the Domain root of unity which would be useful in the protocol as we progress. To solve our little problem would extend the vectors Zero before carrying out the interpolation over the root of unity ω\omega when nn is 4;

These are the resulting vectors after the extension;

qL=(0,1,1,0)q_L = (0,1,1,0)

qR=(0,1,0,0)q_R = (0,1,0,0)

qO=(1,1,1,0)q_O = (-1,-1,-1,0)

qM=(1,0,1,0)q_M = (1,0,1,0)

qC=(0,0,0,0)q_C = (0,0,0,0)

a=(2,4,6,0)a = (2,4,6,0)

b=(3,5,9,0)b = (3,5,9,0)

c=(6,9,54,0)c = (6,9,54,0)

Using this technique, we can successfully prove and verify this circuit execution trace with the help of these 8 polynomials no matter how big the circuit is (Number of constraints).

But that is one more thing! we have only been able to mathematically prove that these gates are present in a circuit but we have not established how these gates are wired to other gates and signals in the circuit. This introduces us to the term "Copy Constraints".

Copy Constraints

Copy constraints are a fundamental mechanism in the PLONK protocol that ensure certain wire values are equal across different parts of a circuit. They solve the critical problem of enforcing value consistency without revealing the actual values.

Key Characteristics:

  1. Verify that specific wire values are identical

  2. Maintain zero-knowledge properties

  3. Efficiently check value equality across circuit

We need to check that the wire vaules are copied correctly, this can be checked by making use of the permutation argument Bayer, Groth12.

Looking into this slide;

permutationNow we've got a good understanding of how permutation check is designed, let's explore it connected to the PLONK protocol;

The best method for explaining this concept I found was documented by the Lambdaclass team, I would be explaining this PLONKISH wire relationship using this concept also. We would be representing the circuit structure and the witnesses in the form of a matrix.

qL=(0,1,1,0)q_L = (0,1,1,0)

qR=(0,1,0,0)q_R = (0,1,0,0)

qM=(1,0,1,0)q_M = (1,0,1,0)

qO=(1,1,1,0)q_O = (-1,-1,-1,0)

qC=(0,0,0,0)q_C = (0,0,0,0)

Introducing the QQ matrix;

Q_L
Q_R
Q_M
Q_O
Q_C

0

0

1

-1

0

1

1

0

-1

0

1

0

1

-1

0

0

0

0

-1

0

The VV Matix The claim in the previous section is not an "if and only if" statement because the following trace columns do satisfy the equations but do not correspond to a valid execution:

A
B
C

2

3

6

4

5

9

6

9

54

The VV matrix encodes the carry of the results from one gate to the right or left operand of a subsequent one. These are called wirings. Like the QQ matrix, it's independent of the individual evaluation. It consists of indices for all input and intermediate variables. In this case, that matrix is:

L
R
0

1

2

0

4

5

3

0

3

6

Here 00 is the index of zz, 11 is the index of xx, 22 is the index of yy, 33 is the index of vv, 44 is the index of pp, 55 is the index of qq, and 66 is the index of the output outout. Now we can update the claim to have an "if and only if" statement.

New PLONK Claim: Let TT be a matrix with columns AA, BB, CC. It corresponds to a proper evaluation of the circuit if and only if;

  1. for all ii the following equality holds QLiAi+QRiBi+QiCi+QMi(AiBi)+QCi=0Q_{Li}⋅A_i + Q_{Ri}⋅B_i + Q_{i}⋅C_i + Q_{Mi}⋅(A_i⋅B_i) + Q_{Ci} = 0

  2. for all i,j,k,li,j,k,l such that Vi,j=Vk,lV_{i,j} = V_{k,l} we have Ti,j=Tk,lT_{i,j} = T_{k,l}, this is what is referred to as the copy constraint. Would dive in deeper as we proceed.

With this we can successfully encode the circuit execution and enforce its wiring, just what we need for the PLONK protocol.

In Zero Knowledge proof protocols, we almost always operate on polynomials and we would like to do the same here, we need to convert these Claim equations to polynomials.

Starting with Claim 1;

for all ii the following equality holds QLiAi+QRiBi+QiCi+QMi(AiBi)+QCi=0Q_{Li}⋅A_i + Q_{Ri}⋅B_i + Q_{i}⋅C_i + Q_{Mi}⋅(A_i⋅B_i) + Q_{Ci} = 0 Let ωω be a primitive NthN-th root of unity and let H=ωi:0i<NH=ω^i:0≤i<N. Let a,b,c,qL,qR,qM,qO,qCa,b,c,q_L,q_R,q_M,q_O,q_C be the polynomials of degree at most NN that interpolate the columns A,B,C,QL,QR,QM,QO,QCA,B,C,Q_L,Q_R,Q_M,Q_O,Q_C at the domain HH. This means for example that a(ωi)=Aia(ω^i) = A_i for all ii, and similarly for all the other columns.

With this, Claim 11 can be expressed as follows;

a(x)qL(x)+b(x)qR(x)+a(x)b(x)qM(x)+c(x)qO(x)+qC(x)=0a(x)q_L(x) + b(x)q_R(x) + a(x)b(x)q_M(x) + c(x)q_O(x) + q_C(x) = 0 for all xx in domain HH. Interestingly, in the world of polynomials, this is also equivalent to; aqL+bqR+abqM+cqO+qC=zHtaq_L + bq_R + abq_M + cq_O + q_C = z_{H^t}, where XN1X^N - 1.

Transforming Claim 2; To perform this transformation, recall our discussion on the Permutation check. A permutation is a rearrangement of a set, usually denoted σ\sigma. For finite sets, it is a map from a set to itself that takes all values. In our case, the set will be the set of all pairs;

I=(i,j):I = (i,j): such that 0i<N0 ≤ i < N, and 0j<30 ≤ j < 3

The matrix VV induces a permutation of this set where σ((ij))σ((ij)) is equal to the indices of the next occurrence of the value at position (i,j)(i,j). If you are already at the last occurrence, go to the first one. By next, we mean the following occurrence, as if the columns were stacked on each other. Let's see how this works in the example circuit. Recall VV is;

L
R
0

1

2

0

4

5

3

0

3

6

The permutation expression for this case can be represented as such;

  • σ((0,0))=(0,0)\sigma((0,0)) = (0,0)

  • σ((0,1))=(0,1)\sigma((0,1)) = (0,1)

  • σ((0,2))=(2,0)\sigma((0,2)) = (2,0)

  • σ((1,0))=(1,0)\sigma((1,0)) = (1,0)

  • σ((1,1))=(1,1)\sigma((1,1)) = (1,1)

  • σ((1,2))=(2,1)\sigma((1,2)) = (2,1)

  • σ((2,0))=(0,2)\sigma((2,0)) = (0,2)

  • σ((2,1))=(1,2)\sigma((2,1)) = (1,2)

  • σ((2,2))=(2,2)\sigma((2,2)) = (2,2)

It can be easily observed Check 2 can be expressed as follows; for all (i,j)I(i,j) \in I; Ti,j=Tσ((i,j))T_{i,j} = T_{\sigma((i,j))}

This mathematically implies that set AA and BB are equal given that;

A=((i,j),Ti,j):(i,j)IA = {((i,j), T_{i,j}): (i,j) \in I}

B=(σ((i,j)),Ti,j):(i,j)IB = {(\sigma((i,j)), T_{i,j}): (i,j) \in I}

This is great, but the problem we had initially was that we would like to represent Check 2 problem as a Polynomial Identity Representation (PIR) which would be more convenient for this protocol, the solution to this problem would introduce the concept of Equality of Sets.

First, explore this, a deep dive into multiset equality check, this would give a very good overview of this concept before diving into what this has to do with our polynomial identity problem.

Equality of sets

Problem: How to check if two sets A=a0,a1,,ak1A = {a_0, a_1, \ldots, a_{k-1}} and B=b0,b1,,bk1B = {b_0, b_1, \ldots, b_{k-1}} of field elements are equal?

Naive Attempt: Multiplication

You might think to multiply all elements of A and B , then compare the results:

i=0k1aiandi=0k1bi\prod_{i=0}^{k-1} a_i \quad \text{and} \quad \prod_{i=0}^{k-1} b_i.

If they match, AA and BB might be equal. However, this test fails in some cases. For instance:

A=4,15A = {4, 15} and B=6,10B = {6, 10} both yield 6060 as the product, but clearly ABA \neq B.

Better Approach: Polynomials

Instead of comparing products, we use polynomial roots to represent the sets. For A=a0,a1,,ak1A = {a_0, a_1, \ldots, a_{k-1}} , construct the polynomial:

fA(X)=(a0+X)(a1+X)(ak1+X)f_A(X) = (a_0 + X)(a_1 + X)\cdots(a_{k-1} + X).

Do the same for BB :

fB(X)=(b0+X)(b1+X)(bk1+X)f_B(X) = (b_0 + X)(b_1 + X)\cdots(b_{k-1} + X).

These polynomials encode the sets $A$ and $B$ . The sets $A$ and $B$ are equal if and only if:

fA(X)f_A(X) = fB(X)f_B(X),

which holds because of the unique factorization theorem in polynomial algebra.

Checking Equality Efficiently

Instead of comparing the polynomials symbolically, we use a random evaluation strategy. Pick a random field element γ\gamma and evaluate:

fA(γ)andfB(γ)f_A(\gamma) \quad \text{and} \quad f_B(\gamma).

If fA(γ)f_A(\gamma) = fB(γ)f_B(\gamma), then with high probability, A=BA = B.

This relies on the Schwartz-Zippel Lemma, which states that if two distinct polynomials are evaluated at a random point, the chance of them being equal is very small.

Generalization: Using Roots of Unity

When the sets AA and BB are large, we can optimize further using a special domain HH based on the roots of unity. Define:

ω\omega : A primitive kthk -th root of unity.

H=1,ω,ω2,,ωk1H = {1, \omega, \omega^2, \ldots, \omega^{k-1}} : The evaluation points.

For AA, construct the polynomial ff that interpolates the values:

a0+γ,a1+γ,,ak1+γ{a_0 + \gamma, a_1 + \gamma, \ldots, a_{k-1} + \gamma}

at points in HH . Similarly, construct gg for BB . The sets AA and BB are equal if:

f(h)=g(h)hHf(h) = g(h) \quad \forall h \in H.

Polynomial Z(X)Z(X)

To simplify, introduce a polynomial Z(X)Z(X) that relates ff and gg :

Z(X)f(X)=g(X)Z(ωX)Z(X)f(X) = g(X)Z(\omega X),

with Z(1)=1Z(1) = 1. By evaluating Z(X)Z(X) recursively at HH , we conclude A=BA = B if:

i=0k1(ai+γ)=i=0k1(bi+γ)\prod_{i=0}^{k-1}(a_i + \gamma) = \prod_{i=0}^{k-1}(b_i + \gamma).

Note: if you find $g(X)Z(\omega X)$ confusing, explore this

This method scales efficiently and leverages the structure of polynomials and roots of unity for large sets.

The next thing to do now is to extend this expression in that is would not just accommodate an equality check for two sets of field elements but rather to a set of tuples of field elements, with this we get much closer to the Plonkish permutation argument.

Lemma:

Let A=aˉ0,,aˉk1A = {\bar{a}0, \ldots, \bar{a}{k-1}} and B=bˉ0,,bˉk1B = {\bar{b}0, \ldots, \bar{b}{k-1}} be sets of pairs of field elements, where each aˉi=(ai,0,ai,1)\bar{a}i = (a_{i,0}, a_{i,1}) and similarly for bˉi\bar{b}_i. We aim to determine whether AA and BB are equal using polynomial techniques.

  1. Mapping Tuples to Field Elements:

    Let β,γF\beta, \gamma \in F be random field elements. Each pair (ai,0,ai,1)(a_{i,0}, a_{i,1}) in A is mapped to a single field element: ai,0+ai,1β+γa_{i,0} + a_{i,1}\beta + \gamma, and similarly for BB.

  2. Interpolating Polynomials:

    Define H=1,ω,ω2,,ωk1H = {1, \omega, \omega^2, \ldots, \omega^{k-1}}, where ω\omega is a kthk -th root of unity. Construct polynomials f(X)f(X) and g(X)g(X) that interpolate the values: ai,0+ai,1β+γi=0,,k1{a_{i,0} + a_{i,1}\beta + \gamma \mid i = 0, \ldots, k-1}, and bi,0+bi,1β+γi=0,,k1{b_{i,0} + b_{i,1}\beta + \gamma \mid i = 0, \ldots, k-1}, respectively, over the evaluation points in HH.

  3. Equality Verification:

    To verify that AA and BB are equal, we check for the existence of a polynomial Z(X)Z(X) satisfying: Z(ω0)=1Z(\omega^0) = 1, and Z(X)f(X)=g(X)Z(ωX)Z(X)f(X) = g(X)Z(\omega X), for all XHX \in H.

If such a polynomial Z(X)Z(X) exists, then with overwhelming probability, the sets AA and BB are equal. This probability is high due to the randomness of β\beta and γ\gamma and the properties of polynomials over finite fields.

With this, we are set to applying this to the PLONK problem

Recall the permutation check problem we had initially, set AA and BB are equal given that; A=((i,j),Ti,j):(i,j)IA = {((i,j), T_{i,j}): (i,j) \in I}

B=(σ((i,j)),Ti,j):(i,j)IB = {(\sigma((i,j)), T_{i,j}): (i,j) \in I}

We cannot apply this principle we just explored yet because it was defined to work on field elements and an extension of a tuple of field elements but what we've got now are tuple indexes. The next subsections would do justice to this!

Mapping Indexed Sets to Field Elements and Verifying Equality

Recall that we aim to rephrase Check 2 in terms of polynomials. Specifically, Check 2 is equivalent to the equality of two sets AA and BB, defined as:

A=((i,j),Ti,j):(i,j)IA = {((i, j), T_{i,j}) : (i, j) \in I}, B=(σ((i,j)),Ti,j):(i,j)IB = {(\sigma((i, j)), T_{i,j}) : (i, j) \in I}.

Mapping Indexed Pairs to Field Elements

The sets AA and BB are not directly composed of field elements or pairs of field elements; instead, their elements include an index pair (i,j)(i, j) and a field element vv. To leverage the results from the previous section, we need to convert these sets into pairs of field elements.

Mapping Strategy

For each element ((i,j),v)((i, j), v), we leave vv as it is and map the index (i,j)(i, j) to a field element a0a_0 such that different index pairs map to distinct field elements. Let η\eta be a 3Nth3N -th primitive root of unity. Define:

a0=η3i+j,a1=va_0 = \eta^{3i + j}, \quad a_1 = v.

Thus, we map ((i,j),v)((i, j), v) to the pair (η3i+j,v)(\eta^{3i + j}, v), which is a pair of field elements.

Constructing the New Sets

Using this mapping, the sets AA and BB become:

A=(η3i+j,Ti,j):(i,j)IA = {(\eta^{3i + j}, T_{i,j}) : (i, j) \in I},

B=(η3k+l,Ti,j):(i,j)I,σ((i,j))=(k,l)B = {(\eta^{3k + l}, T_{i,j}) : (i, j) \in I, \sigma((i, j)) = (k, l)}.

Check 2 is equivalent to the equality of these sets AA and BB.

Equality Verification via Polynomials

We now apply the polynomial-based method from the previous section to verify the equality of AA and BB.

Polynomials for Interpolation

Let η\eta be a 3Nth3N -th root of unity, β,γF\beta, \gamma \in F be random field elements, and D=1,η,η2,,η3N1D = {1, \eta, \eta^2, \ldots, \eta^{3N-1}} . Define f(X)f(X) and g(X)g(X) as the polynomials interpolating the following values at DD :

f(X) interpolates Ti,j+η3i+jβ+γ:(i,j)If(X) \text{ interpolates } {T_{i,j} + \eta^{3i + j}\beta + \gamma : (i, j) \in I},

g(X) interpolates Ti,j+η3k+lβ+γ:(i,j)I,σ((i,j))=(k,l)g(X) \text{ interpolates } {T_{i,j} + \eta^{3k + l}\beta + \gamma : (i, j) \in I, \sigma((i, j)) = (k, l)}.

Verifying Equality

If there exists a polynomial $Z(X)$ such that:

Z(η0)=1Z(\eta^0) = 1,

Z(d)f(d)=g(d)Z(ηd),dDZ(d)f(d) = g(d)Z(\eta d), \quad \forall d \in D,

then the sets AA and BB are equal with overwhelming probability.

Additional Definitions for the Protocol

Let ω\omega = η3\eta^3 be a primitive NthN -th root of unity and H=1,ω,ω2,,ωN1H = {1, \omega, \omega^2, \ldots, \omega^{N-1}}. Define Sσ1,Sσ2,Sσ3S_\sigma^1, S_\sigma^2, S_\sigma^3 as the interpolations at HH of the sets of values:

Sσ1:η3k+l:(i,0)I,σ((i,0))=(k,l)S_\sigma^1 : {\eta^{3k + l} : (i, 0) \in I, \sigma((i, 0)) = (k, l)},

Sσ2:η3k+l:(i,1)I,σ((i,1))=(k,l)S_\sigma^2 : {\eta^{3k + l} : (i, 1) \in I, \sigma((i, 1)) = (k, l)},

Sσ3:η3k+l:(i,2)I,σ((i,2))=(k,l)S_\sigma^3 : {\eta^{3k + l} : (i, 2) \in I, \sigma((i, 2)) = (k, l)}.

Before concluding this session, it is necessary to clearly define the structure CommonPreprocessedInput, this would be essential in the proof generation phase of this protocol.

CommonPreprocessedInput { Sσ1S_\sigma^1, Sσ2S_\sigma^2, Sσ3S_\sigma^3, qLq_L, qRq_R, qOq_O, qMq_M, qCq_C, aa, bb, cc }

With these, PLONKish arithmetization is complete.

references:

https://research.metastate.dev/plonk-by-hand-part-1

https://www.youtube.com/watch?v=-SXMd6S0r0I

https://blog.lambdaclass.com/all-you-wanted-to-know-about-plonk/

(these resources are life savers)

Last updated