The Rainbow attack

February 25, 2022

Yesterday, a practical attack on the Rainbow digital signature (“Breaking Rainbow Takes a Weekend on a Laptop”) was released on ePrint. This is super exciting since Rainbow is a NIST post-quantum finalist! In this post, I want to summarize the attack and provide some additional background for people unfamiliar with multivariate cryptography. I mainly referenced yesterday’s paper as well as the author’s previous attack on Rainbow.

Background

We will always be working over a finite field \(\mathbb{F}_q\). A multivariate quadratic (MQ) is a polynomial in \(n\) variables where the degree of each term is at most two. For example, a \(2\)-variable quadratic looks like \[f(x_1, x_2) = \alpha_{11} x_1^2 + \alpha_{12} x_1 x_2 + \alpha_{22} x_2^2 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_0.\] A multivariate quadratic map \(\mathcal{F}: \mathbb{F}_q^n \to \mathbb{F}_q^m\) is a collection of \(m\) such quadratics \(f_1, \dots, f_m\) where \(\mathcal{F}(\mathbf{x}) = \mathbf{y}\) represents sending \[\mathbf{x} = (x_1, \dots, x_n) \quad \text{to} \quad \mathbf{y} = (f_1(x_1, \dots, x_n), \dots, f_m(x_1, \dots, x_n)).\]

Cryptographic primitives

For arbitrary \(\mathcal{F}\) and \(\mathbf{t}\), finding a solution \(\mathcal{F}(\mathbf{s}) = \mathbf{t}\) is known to be NP-hard. While this does not directly translate to cryptography (where we care about average-case hardness), it does lend credence. One idea is to somehow generate a keypair where

Of course, \(\mathcal{P}\) should be hard to invert without knowledge of the trapdoor. For signature schemes, we choose \(n >> m\) so that \(\mathcal{P}\) is overwhelmingly surjective. Then to sign a message, we hash it to a vector \(\mathbf{t} \in \mathbb{F}_q^m\) and sample a preimage \(\mathbf{s}\).

UOV

The Unbalanced Oil and Vinegar (UOV) signature scheme is one of the oldest signature schemes based on MQ. The main observation is that certain multivariate quadratic maps are easy to invert. For example, consider \(\mathcal{F}\) with quadratics of the form \[f(\mathbf{s}) = \sum_j \sum_{k > m} \alpha_{jk} s_j s_k.\] Notice that \(\mathcal{F}\) is linear in \(s_1, \dots, s_m\), since each term contains at most one such variable. To solve \(\mathcal{F}(\mathbf{s}) = \mathbf{t}\), randomly fix \(s_{m+1}, \dots, s_n\) and solve the resulting linear equations for \(s_1, \dots, s_m\). If there is no solution, try again with new random values.

In UOV, we obfuscate an easy-to-invert \(\mathcal{F}\) by composing with an invertible linear map \(\mathcal{T}\). The public key is \(\mathcal{P}= \mathcal{F}\circ \mathcal{T}\), and the private key is the decomposition. We sign \(\mathbf{t}\) by solving \(\mathcal{F}(\mathbf{s}) = \mathbf{t}\) and publishing \(\mathcal{T}^{-1}(\mathbf{s})\).

Rainbow

Rainbow is a variant of UOV designed to be robust against certain attacks. The idea is to solve for the first \(m\) variables in two phases. Choose \(0 < o_2 < m\), and consider \(\mathcal{F}\) with quadratics of the form \[\underbrace{f(\mathbf{x}) = \sum_j \sum_{k > o_2} \alpha_{jk} s_j s_k}_\text{first $o_2$ quadratics}, \quad \underbrace{f(\mathbf{x}) = \sum_{j > o_2} \sum_{k > m} \alpha_{jk} s_j s_k}_\text{last $m - o_2$ quadratics}.\] To solve \(\mathcal{F}(\mathbf{s}) = \mathbf{t}\), randomly fix \(s_{m+1}, \dots, s_n\). The last \(m - o_2\) quadratics become linear, and we solve for \(s_{o_2+1}, \dots, s_m\). This makes the first \(o_2\) quadratics linear, and we solve for \(s_1, \dots, s_{o_2}\). Again, we compose with random linear maps to obtain \(\mathcal{P}\).

Simpler formulations

To better understand the Rainbow attack, we will move to a (mathematically) simpler formulation of the UOV and Rainbow schemes. The matrix form of a multivariate quadratic is \(f(\mathbf{x}) = \mathbf{x}^\top A \mathbf{x} + \mathbf{b}^\top\mathbf{x} + c\), which implies \[\mathbf{x}^\top(A + A^\top) \mathbf{y} = f(\mathbf{x} + \mathbf{y}) - f(\mathbf{x}) - f(\mathbf{y}) + c.\] Analogously, the polar form of \(\mathcal{F}\) is defined to be \[\mathcal{F}'(\mathbf{x}, \mathbf{y}) := \mathcal{F}(\mathbf{x} + \mathbf{y}) - \mathcal{F}(\mathbf{x}) - \mathcal{F}(\mathbf{y}) + \mathcal{F}(0),\] which is bilinear. From now on we will drop the constant term, which is always zero.

UOV

Here is an alternate description of the UOV signature scheme. For a public key \(\mathcal{P}\), the trapdoor is a linear subspace \(O \subset \mathbb{F}_q^n\) of dimension \(m\) such that \(\mathcal{P}(\mathbf{o}) = 0\) for all \(\mathbf{o} \in O\); we say that \(\mathcal{P}\) vanishes on \(O\). To sample a preimage for \(\mathbf{t}\), fix a random vector \(\mathbf{v} \in \mathbb{F}_q^n\) and find a solution \(\mathbf{o} \in O\) for \[\mathbf{t} = \mathcal{P}(\mathbf{v} + \mathbf{o}) = \mathcal{P}'(\mathbf{v}, \mathbf{o}) + \mathcal{P}(\mathbf{v}) + \mathcal{P}(\mathbf{o}),\] which is linear since \(\mathcal{P}'\) is bilinear, \(\mathcal{P}(\mathbf{v})\) is constant, and crucially \(\mathcal{P}(\mathbf{o}) = 0\) for any choice of \(\mathbf{o}\). Intuitively, \(\mathcal{P}\) behaves linearly when restricted to the coset \(\mathbf{v} + O\). Since \(O\) has dimension \(m\), we will likely find a solution; otherwise, try again with new \(\mathbf{v}\). It’s also worth noting that using any vector in the coset \(\mathbf{v} + O\) yields the same solution.

How does this relate to the original description? Consider the easy-to-invert \(\mathcal{F}\). The trapdoor is the set of vectors whose last \(n - m\) entries are zero, i.e. \(\mathop{\mathrm{span}}\{\mathbf{e}_1, \dots, \mathbf{e}_m\}\). Fixing the last \(n - m\) entries of \(\mathbf{x}\) corresponds with fixing \(\mathbf{v}\) (in particular, the coset element whose first \(m\) entries are all zero). Solving for the first \(m\) entries of \(\mathbf{x}\) corresponds with solving for \(\mathbf{o}\). One can verify the polar equation simplifies into the linear system we solve in UOV. Finally, any trapdoor for \(\mathcal{F}\) translates into a trapdoor in \(\mathcal{P}\) by application of \(\mathcal{T}^{-1}\). The UOV signing algorithm is really just solving the polar equation for \(\mathcal{F}\), rather than \(\mathcal{P}\) directly!

Rainbow

The alternate description of Rainbow follows the same pattern. Again, consider the easy-to-invert \(\mathcal{F}\). Although our vanishing subspace \(O_2 = \mathop{\mathrm{span}}\{\mathbf{e}_1, \dots, \mathbf{e}_{o_2}\}\) only has dimension \(o_2\), we do have a subspace \(O_1 = \mathop{\mathrm{span}}\{\mathbf{e}_1, \dots, \mathbf{e}_m\}\) of dimension \(m\). For any \(\mathbf{o}_1 \in O_1\), \(\mathcal{F}\) sends \(\mathbf{o}_1\) to a vector whose last \(m - o_2\) entries are zero. In other words, \(O_1\) is a vanishing subspace if we work in the quotient space \(\mathbb{F}_q^m / W\), where \(W\) is spanned by the first \(o_2\) standard basis vectors in \(\mathbb{F}_q^m\).

To sample a preimage for \(\mathbf{t}\), fix a random vector \(\mathbf{v} \in \mathbb{F}_q^n\). We use the same technique as in UOV to obtain \(\mathbf{o}_1 \in O_1\) such that the cosets \(\mathcal{F}(\mathbf{v} + \mathbf{o}_1) + W\) and \(\mathbf{t} + W\) are equal. Next, we would like to find a solution \(\mathbf{o}_2 \in O_2\) for \[\begin{aligned} t & = \mathcal{F}((\mathbf{v} + \mathbf{o}_1) + \mathbf{o}_2) \\ & = \mathcal{F}'(\mathbf{v} + \mathbf{o}_1, \mathbf{o}_2) + \mathcal{F}(\mathbf{v} + \mathbf{o}_1) + \mathcal{F}(\mathbf{o}_2) \\ & = \mathcal{F}'(\mathbf{v} + \mathbf{o}_1, \mathbf{o}_2) + \mathcal{F}(\mathbf{v} + \mathbf{o}_1).\end{aligned}\] We have chosen \(\mathbf{o_1}\) so that \(\mathcal{F}(\mathbf{v} + \mathbf{o}_1) - t\) lies in \(W\). We claim that for any choice of \(\mathbf{o}_2\), \(\mathcal{F}'(\mathbf{v} + \mathbf{o}_1, \mathbf{o}_2)\) does as well. If this is the case, this constitutes a system of \(\dim W = o_2\) equations and we can expect to find a solution.

To prove the claim, take arbitrary \(\mathbf{x} \in \mathbb{F}_q^n\) and \(\mathbf{o}_2 \in O_2\). The idea is \(\mathcal{F}(\mathbf{x})\) and \(\mathcal{F}(\mathbf{x} + \mathbf{o}_2)\) will only ever vary in the first \(o_2\) entries. This is because the last \(m - o_2\) quadratics of \(\mathcal{F}\) ignore the first \(o_2\) variables. Hence \[\mathcal{F}'(\mathbf{x}, \mathbf{o}_2) = \mathcal{F}(\mathbf{x} + \mathbf{o}_2) - \mathcal{F}(\mathbf{x}) - \mathcal{F}(\mathbf{o}_2) = \mathcal{F}(\mathbf{x} + \mathbf{o}_2) - \mathcal{F}(\mathbf{x})\] lies in \(W\).

Once linear maps are composed with \(\mathcal{F}\), we obtain secret subspaces \(O_1\), \(O_2\), and \(W\) where everything still applies.

The attack

Now that we’ve come this far, the attack is pretty simple to explain. In Rainbow, for all \(\mathbf{x} \in \mathbb{F}_q^n\) the differential \[\begin{aligned} D_{\mathbf{x}}: \mathbb{F}_q^n & \to \mathbb{F}_q^m \\ \mathbf{y} & \mapsto \mathcal{P}'(\mathbf{x}, \mathbf{y})\end{aligned}\] maps \(O_2\) into \(W\), which share the same dimension. With its domain restricted to \(O_2\), it behaves like a random linear map (over the choice of \(\mathbf{x}\)). Hence the probability that \(D_{\mathbf{x}}\) is singular, i.e. has a kernel vector in \(O_2\), is approximately \(1/q\). When this happens, the rank-\(m\) linear system \(D_{\mathbf{x}}(\mathbf{o}) = 0\) and the \(n\)-variable quadratic system \(\mathcal{P}(\mathbf{o}) = 0\) have a nontrivial intersection, namely a vector in \(O_2\). Finding it amounts to solving an \((n - m)\)-variable quadratic system. In conjunction with existing key-recovery attacks, this severely reduces the security of Rainbow.

It’s also worth noting that in UOV, \(D_\mathbf{x}\) sends \(O\) to a random subspace in \(\mathbb{F}_q^m\). The attack does not apply since the map will almost always be injective.

Discussion

While this attack is specific to Rainbow, my opinion is that at this point we should avoid all schemes that rely on ad hoc, structured MQ assumptions. On the other hand, basing security on the hardness of solving a random quadratic map still seems viable. The issue is that we lose the trapdoor, which yielded short signatures.

I think this also reflects the state of post-quantum signatures as a whole: even though we have schemes like Picnic and SPHINCS that use minimal assumptions, achieving compact signatures is a different story. The other NIST finalists are Dilithium (structured lattice), Falcon (structured lattice), and GeMSS (structured MQ). I’ve personally been researching alternatives — a design writeup/paper will be coming soon :)