Notes on papers about code-based cryptography

paper reading

1. Code-based Cryptography: Lecture Notes

1.1 Introduction

A linear code $C$ of length $n$ and dimension $k$ over $\mathbb{F}_q$, i.e., an $[n,k]_q$-code, is a subspace of $\mathbb{F}_q^n$ of dimension $k$.

To represent an $[n,k]_q$-code, we may take any basis of it, namely a set of $k$ linearly independent vectors. We use them as rows to form the generator matrix $G \in \mathbb{F}_q^{k \times n}$; then $C$ can be written as $C = \{c = mG : m \in \mathbb{F}_q^k\}$.

Conversely, any matrix $G \in \mathbb{F}_q^{k \times n}$ with rank $k$ defines an $[n,k]_q$-code.

A subspace of a vector space is a nonempty subset that satisfies the requirements for a vector space: linear combinations stay in the subspace.

(i) If we add any vectors $x$ and $y$ in the subspace, then $x + y$ is in the subspace.

(ii) If we multiply any vector $x$ in the subspace by any scalar $c$, then $cx$ is in the subspace.

The dual code of an $[n,k]_q$-code $C$ is defined as $C^\perp = \{c' \in \mathbb{F}_q^n : \forall c \in C,\ \langle c, c' \rangle = 0\}$. Since $c = mG$, $C^\perp$ is the nullspace of $G$, and is an $[n, n-k]_q$-code.

Its generator matrix is the parity-check matrix of $C$, i.e., $H \in \mathbb{F}_q^{(n-k) \times n}$, whose rows form a basis of the nullspace.

The linear code $C$ can also be written as $C = \{c \in \mathbb{F}_q^n : cH^\intercal = 0\}$.

Conversely, any matrix $H \in \mathbb{F}_q^{(n-k) \times n}$ of rank $n-k$ defines an $[n,k]_q$-code.

Solutions to $Gx = 0$ form the nullspace of $G$.

Elimination simplifies a system of linear equations without changing the solutions, i.e., if $G$ is row reduced to $G' = [I_k \,|\, A]$, then $Gx = 0$ and $G'x = 0$ give the same solutions $x$. Therefore, the nullspace of $G$ is the same as the nullspace of $G'$.

Since $G$ has rank $k$, there are $k$ pivot variables and $n-k$ free variables, so the nullspace has dimension $n-k$. These $n-k$ vectors are the special solutions to $Gx = 0$ and form a basis of the nullspace.

Left multiplication by an invertible matrix $S$ does not change the code ($SG$ or $SH$); it just gives another basis.

If $\mathrm{RREF}(G) = [I_k \,|\, A]$, then $H = [-A^\intercal \,|\, I_{n-k}]$, and $GH^\intercal = 0$.
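
As a quick sanity check, here is a minimal Python sketch of this construction, assuming $q = 2$ (so $-A^\intercal = A^\intercal$) and assuming the first $k$ columns form an information set; the helper name `rref_gf2` and the [7,4] example matrix are mine, not from the notes:

```python
import numpy as np

def rref_gf2(G):
    """Row-reduce G over GF(2); assumes the first k columns are an
    information set, so the result has the form [I_k | A]."""
    R = G.copy() % 2
    k, n = R.shape
    for col in range(k):
        piv = col + np.nonzero(R[col:, col])[0][0]  # pivot row for this column
        R[[col, piv]] = R[[piv, col]]               # swap pivot into place
        for r in range(k):
            if r != col and R[r, col]:
                R[r] ^= R[col]                      # eliminate mod 2
    return R

# Example [7,4] generator matrix, already of the form [I_4 | A]
G = np.array([[1,0,0,0,0,1,1],
              [0,1,0,0,1,0,1],
              [0,0,1,0,1,1,0],
              [0,0,0,1,1,1,1]], dtype=np.int64)
k, n = G.shape
A = rref_gf2(G)[:, k:]
H = np.concatenate([A.T, np.eye(n - k, dtype=np.int64)], axis=1)  # [-A^T | I] = [A^T | I] over GF(2)
assert not ((G @ H.T) % 2).any()   # verify G H^T = 0
```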

Let $H \in \mathbb{F}_q^{(n-k) \times n}$; the syndrome of $y$ with respect to $H$ is defined as $yH^\intercal$. Any element of $\mathbb{F}_q^{n-k}$ is called a syndrome.

Syndrome Decoding: Given $H \in \mathbb{F}_q^{(n-k) \times n}$ of rank $n-k$, $s \in \mathbb{F}_q^{n-k}$, $t \in \{0, \dots, n\}$, find $e$ of Hamming weight $t$ such that $eH^\intercal = s$.

Any solver of syndrome decoding can be turned in polynomial time into an algorithm solving Noisy Codeword Decoding:

Given $G \in \mathbb{F}_q^{k \times n}$ of rank $k$, $y \in \mathbb{F}_q^n$, $t \in \{0, \dots, n\}$, find $e$ of Hamming weight $t$ such that $y = mG + e$ for some $m \in \mathbb{F}_q^k$.
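
The reduction is the standard one: compute any parity-check matrix $H$ for $G$ and map $y$ to its syndrome $s = yH^\intercal$; since $mGH^\intercal = 0$, any weight-$t$ solution $e$ of $eH^\intercal = s$ satisfies $y - e \in C$. A minimal sketch over $\mathbb{F}_2$, reusing the hypothetical `rref_gf2` helper from above:

```python
import numpy as np

def noisy_codeword_to_sd(G, y):
    """Reduce Noisy Codeword Decoding to Syndrome Decoding over GF(2):
    build H from RREF(G) = [I_k | A], and map y to its syndrome s = y H^T."""
    k, n = G.shape
    A = rref_gf2(G)[:, k:]                 # assumes first k columns are an information set
    H = np.concatenate([A.T, np.eye(n - k, dtype=np.int64)], axis=1)
    s = (y @ H.T) % 2                      # the codeword part mG vanishes: G H^T = 0
    return H, s

# Given a weight-t solution e with e H^T = s, the codeword is c = y ^ e,
# and m can be read off the information set: m = c[:k] when RREF(G) = [I_k | A].
```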

Hardness of the Decoding Problem: from the telecommunications side, the decoding problem should be easy to solve; from the security side, it should be hard to solve. All the subtlety lies in the inputs of the problem: there exist some codes and decoding distances for which the problem is easy to solve. The existence of codes that are easy to decode is at the foundation of code-based cryptography.

  • The decoding problem is hard in the worst case (it is NP-complete), which means that we cannot hope to solve the decoding problem in polynomial time for all inputs.

  • The decoding problem is hard on average, which means that for a well-chosen $t$, almost all codes are intractable to decode.

The NP-completeness of a problem is a nice property for cryptographic use, but it is not a panacea for ensuring hardness: it is quite possible that the security of a cryptosystem relies on the difficulty of solving an NP-complete problem while, at the same time, breaking the scheme amounts to solving the problem on a subset of inputs for which it is easy.

Instead of considering the hardness based on its inputs, we focus on the algorithmic hardness of this decoding problem. Assume a probabilistic algorithm $\mathcal{A}$ solves the decoding problem with success probability $\varepsilon$, and that a single run of this algorithm costs time $T$. Repeating $\mathcal{A}$ until it succeeds takes a geometric number of runs with mean $1/\varepsilon$, so the algorithm solves the decoding problem in average time $T/\varepsilon$ (its complexity).

$$\varepsilon = \mathbb{P}\left(\mathcal{A}(H,\ s = xH^\intercal,\ t) = e \ \text{ s.t. } \ |e| = t \ \text{ and } \ eH^\intercal = s\right)$$

The Decoding Problem is parametrized by $n$ and two functions of $n$: the rate $R = k/n$ and the relative distance $\tau = t/n$, i.e., $\mathrm{DP}(n, q, R, \tau)$. It can be solved in polynomial time as soon as $\tau \in \left[(1-R)\frac{q-1}{q},\ R + (1-R)\frac{q-1}{q}\right]$. For other $\tau$, the best known algorithms run in time exponential in $n$, with an exponent depending on $\tau$. (proof in 1.3)

1.2 Random Codes

A random code is one whose parity-check matrix or generator matrix is drawn uniformly at random.

In cryptography, negligible functions are often used to describe the probability of an event occurring with vanishingly small probability. We say a function $f(n)$ is negligible, i.e., $f(n) \in \mathrm{negl}(n)$, if for every polynomial $p(n)$, $|f(n)| < 1/|p(n)|$ for all sufficiently large $n$. For instance, $2^{-n}$ is negligible, whereas $1/n^{10}$ is not.

The introduction of the function p(n) in the definition of negligible functions serves to formalize the idea that the function being considered becomes increasingly smaller as the input size n grows. It helps establish a quantitative measure for the rate at which the function approaches zero.

The use of p(n) allows for generality in expressing the notion of negligibility. Different contexts and applications may require different rates of decay for a function to be considered negligible. The choice of p(n) provides a flexible framework for capturing these variations.

As all the parameters are functions of $n$, our asymptotic results will always hold as $n \to +\infty$.

The $q$-ary entropy is defined as

$$h_q : x \in [0,1] \mapsto -x \log_q\left(\frac{x}{q-1}\right) - (1-x)\log_q(1-x)$$

It is equal to the entropy of a random variable $e$ over $\mathbb{F}_q$ distributed like the error of a $q$-ary symmetric channel with crossover probability $x$.

It can be verified that $h_q$ is an increasing function over $\left[0, \frac{q-1}{q}\right]$ and a decreasing function over $\left[\frac{q-1}{q}, 1\right]$.

Let $S_t$ denote the set of vectors of Hamming weight $t$; then $|S_t| = \binom{n}{t}(q-1)^t$. We also have

$$q^{n\left(h_q(\tau) + O\left(\frac{\log_q(n)}{n}\right)\right)} \leq \binom{n}{t}(q-1)^t \leq q^{n h_q(\tau)}$$

Asymptotically, we have

$$\frac{1}{n}\log_q\left(\binom{n}{t}(q-1)^t\right) = h_q(\tau) + O\left(\frac{\log_q n}{n}\right)$$
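
A quick numerical check of this asymptotic (an illustrative sketch in plain Python; the helper name `h_q` and the parameter values are mine):

```python
from math import comb, log

def h_q(x, q=2):
    """q-ary entropy h_q(x) = -x log_q(x/(q-1)) - (1-x) log_q(1-x)."""
    if x == 0.0:
        return 0.0
    if x == 1.0:
        return log(q - 1, q) if q > 2 else 0.0
    return -x * log(x / (q - 1), q) - (1 - x) * log(1 - x, q)

# Compare (1/n) log_q |S_t| with h_q(t/n) for growing n at fixed tau = 0.11
q, tau = 2, 0.11
for n in (100, 1000, 10000):
    t = round(tau * n)
    lhs = log(comb(n, t) * (q - 1) ** t, q) / n
    print(n, lhs, h_q(t / n, q))   # the gap shrinks like (log n)/n
```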

The probability $\mathbb{P}_X(A)$ denotes the probability of the event $A$ over the randomness of $X$. We choose a random code by picking a parity-check matrix $H \in \mathbb{F}_q^{(n-k)\times n}$ uniformly at random.

Defining DP with a generator or a parity-check matrix is just a matter of personal taste; it does not change the average hardness.

Lemma 2.2.3. Given $s \in \mathbb{F}_q^{n-k}$ and $y \in \mathbb{F}_q^n$ such that $y \neq 0$, we have, for $H$ uniformly distributed over $\mathbb{F}_q^{(n-k)\times n}$,

$$\mathbb{P}_H\left(yH^\intercal = s\right) = \frac{1}{q^{n-k}}$$

This lemma gives the probability, over the choice of a random code, that a fixed non-zero word $y$ has a given syndrome $s$.

Proof. We may assume $y_1 = 1$ (without loss of generality); then we are looking for the probability of the following event:

$$\forall i \in \{1, \dots, n-k\}, \quad h_{i,1} = s_i - \sum_{j=2}^{n} h_{i,j}\, y_j$$

Since $h_{i,1}$ is uniform over the $q$ elements of $\mathbb{F}_q$ and independent of the right-hand side, each of the above $n-k$ equations holds independently with probability $1/q$. This concludes the proof.
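
A Monte Carlo sanity check of the lemma (an illustrative sketch for $q = 2$; the parameters and names are mine, not the notes'):

```python
import numpy as np

# Estimate P_H(y H^T = s) over random H in F_2^{(n-k) x n}
rng = np.random.default_rng(0)
n, k = 12, 5
y = rng.integers(0, 2, n)
y[0] = 1                                   # ensure y != 0
s = rng.integers(0, 2, n - k)

trials, hits = 100_000, 0
for _ in range(trials):
    H = rng.integers(0, 2, (n - k, n))
    hits += ((y @ H.T) % 2 == s).all()
print(hits / trials, 1 / 2 ** (n - k))     # both close to 1/q^{n-k} = 1/128
```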

Question: given a random code $C$ and a fixed vector $y \in \mathbb{F}_q^n$, how many codewords $c \in C$ do we expect to be at Hamming distance $t$ from $y$, i.e., $y = c + e$ with $|e| = t$? Or equivalently, given a parity-check matrix $H$ of our random code $C$ and a fixed syndrome $s$, how many error vectors $e$ of Hamming weight $t$ do we expect to reach the syndrome $s$, i.e., $eH^\intercal = s$?

Let $N_t(C, s) = |\{e \in S_t : eH^\intercal = s\}|$, which is a random variable giving the number of solutions of DP on input $(H, s)$, where $s \in \mathbb{F}_q^{n-k}$ is fixed. On the other hand, $N_t(C, 0)$ gives the number of codewords $c \in C$ of Hamming weight $t$.

This number is crucial to know, for the following reason.

A trivial algorithm to solve DP is to pick a random error vector $e \in S_t$:

  • If there is only one solution $e$, then our success probability is $\frac{1}{\binom{n}{t}(q-1)^t}$.

  • If there are $N$ solutions $e$, then the success probability is $\frac{N}{\binom{n}{t}(q-1)^t}$.

It is therefore crucial to know the value of $N$ to be able to predict the running time of our algorithm.

Solution:

The expectation of $N_t(C, s)$ is $\binom{n}{t}(q-1)^t / q^{n-k}$. Then

$$\frac{1}{n}\log_q\left(\frac{\binom{n}{t}(q-1)^t}{q^{n-k}}\right) = h_q(\tau) - (1-R) + O\left(\frac{\log_q n}{n}\right)$$

where $0 \leq \tau, R \leq 1$. Since $h_q(0) = 0$ and $h_q(1) = \log_q(q-1)$, $N_t(C, s)$ is expected to be exponentially small below one value $\tau_-$ and, potentially, above a second value $\tau_+$ in the case that $(1-R) \geq \log_q(q-1)$. The analytic expressions of $\tau_-$ and $\tau_+$ are

$$\tau_- = g_q^-(1-R), \qquad \tau_+ = g_q^+(1-R) \ \text{ when } R \leq 1 - \log_q(q-1)$$

where $g_q^-$ (resp. $g_q^+$) denotes the inverse of $h_q$ over $\left[0, \frac{q-1}{q}\right]$ (resp. $\left[\frac{q-1}{q}, 1\right]$).

The quantities $\tau_-$ and $\tau_+$ give the boundaries between which we expect $N_t(C, s)$ to be exponentially large.
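
These inverses have no closed form but are easy to evaluate numerically; here is an illustrative bisection sketch, reusing the hypothetical `h_q` helper from above:

```python
def inverse_hq(y, q=2, increasing=True, iters=60):
    """Invert h_q by bisection: g_q^- on [0, (q-1)/q] or g_q^+ on [(q-1)/q, 1]."""
    lo, hi = (0.0, (q - 1) / q) if increasing else ((q - 1) / q, 1.0)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (h_q(mid, q) < y) == increasing:
            lo = mid          # root lies above mid on this branch
        else:
            hi = mid
    return (lo + hi) / 2

R, q = 0.5, 2
tau_minus = inverse_hq(1 - R, q, increasing=True)    # g_q^-(1-R)
tau_plus = inverse_hq(1 - R, q, increasing=False)    # g_q^+(1-R)
print(tau_minus, tau_plus)    # ~0.11 and ~0.89 for q=2, R=1/2
```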

The average number of solutions of $\mathrm{DP}(n, q, R, \tau)$ over the input distribution is given by

$$1 + \frac{\binom{n}{t}(q-1)^t - 1}{q^{n-k}}$$


The expected minimum distance of a code is given by the so-called Gilbert-Varshamov distance $t_{\mathrm{GV}}(q, n, k)$, which is defined as the largest integer such that

$$\sum_{l=0}^{t_{\mathrm{GV}}(q,n,k)} \binom{n}{l}(q-1)^l \leq q^{n-k}$$

Proof. $\sum_{l=0}^{t_{\mathrm{GV}}(q,n,k)} \binom{n}{l}(q-1)^l / q^{n-k}$ is the expected number of codewords in a ball of radius $t_{\mathrm{GV}}$ centered around a codeword; $t_{\mathrm{GV}}$ is the largest radius for which this expected number stays at most 1.

The minimum Hamming weight of a code is defined as the smallest Hamming weight of all non-zero codewords.

It can be verified that

$$\frac{t_{\mathrm{GV}}(q,n,k)}{n} = \tau_- + o(1)$$
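
A small sketch computing $t_{\mathrm{GV}}$ directly from the definition and comparing it with $\tau_-$ (illustrative code; reuses the hypothetical `inverse_hq` helper above):

```python
from math import comb

def t_gv(q, n, k):
    """Largest t such that sum_{l=0}^{t} C(n,l)(q-1)^l <= q^{n-k}."""
    total, t = 0, -1
    for l in range(n + 1):
        total += comb(n, l) * (q - 1) ** l
        if total > q ** (n - k):
            break
        t = l
    return t

q, n, k = 2, 1000, 500
print(t_gv(q, n, k) / n)                              # ~0.11
print(inverse_hq(1 - k / n, q, increasing=True))      # tau_-, also ~0.11
```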

1.3 Information Set Decoding Algorithms

Prange's approach: using linearity.

We are looking for some codeword $c \in C$ at distance $t$ from $y$. Since $C$ is a $k$-dimensional subspace of $\mathbb{F}_q^n$, there exists some set of positions of size $k$, called an information set, which uniquely determines every codeword.

This idea comes from the fact that the nullspace of $H$ (i.e., the code) has $k$ free variables; once their positions are fixed and values chosen, the remaining coordinates are determined.

Let $C$ be an $[n,k]$-code and $J \subseteq \{1, \dots, n\}$ be of size $k$; then

$$J \text{ is an information set for } C \iff \begin{cases} \forall G \text{ generator matrix of } C, & G_J \text{ is invertible,} \\ \forall H \text{ parity-check matrix of } C, & H_{\bar{J}} \text{ is invertible.} \end{cases}$$

$J$ being an information set means that the positions in $J$ are pivot variables, and those in $\bar{J}$ are free variables.

Prange's idea is to pick some random information set, compute the unique codeword determined by those coordinates, and then check whether the computed codeword is at distance $t$ from $y$.

The algorithm:

  1. Pick the information set: Let $J \subseteq \{1, \dots, n\}$ be a random set of size $k$. If $H_{\bar{J}} \in \mathbb{F}_q^{(n-k)\times(n-k)}$ is not full-rank, pick another set $J$.

  2. Linear Algebra: Calculate $H_{\bar{J}}^{-1}$.

  3. Test step: Pick $e_J = x \in \mathbb{F}_q^k$ according to the distribution $D_t$, and let $e \in \mathbb{F}_q^n$ be such that

$$e_{\bar{J}} = (s - xH_J^\intercal)\,(H_{\bar{J}}^{-1})^\intercal$$

If $|e| \neq t$, go back to step 1; otherwise, it is a solution.
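
Below is a minimal sketch of this loop for $q = 2$ in the regime $t < \frac{q-1}{q}(n-k) = \frac{n-k}{2}$, where $D_t$ always outputs $x = 0$ (see the definition of $D_t$ below). The helper names are mine; this is an illustration, not the notes' reference implementation:

```python
import numpy as np

def solve_gf2(M, s):
    """Solve M z = s over GF(2) by Gaussian elimination; None if M is singular."""
    r = M.shape[0]
    A = np.concatenate([M % 2, (s % 2).reshape(-1, 1)], axis=1)
    for col in range(r):
        piv = np.nonzero(A[col:, col])[0]
        if piv.size == 0:
            return None                     # H_Jbar not invertible: resample J
        A[[col, col + piv[0]]] = A[[col + piv[0], col]]
        for row in np.nonzero(A[:, col])[0]:
            if row != col:
                A[row] ^= A[col]            # eliminate column col from other rows
    return A[:, -1]

def prange(H, s, t, max_iters=100_000):
    """Prange's ISD over GF(2) with x = 0: find e of weight t with e H^T = s."""
    r, n = H.shape                          # r = n - k
    k = n - r
    rng = np.random.default_rng()
    for _ in range(max_iters):
        perm = rng.permutation(n)
        J, Jbar = perm[:k], perm[k:]        # step 1: random information set J
        x = solve_gf2(H[:, Jbar], s)        # steps 2-3: e_Jbar = s (H_Jbar^{-1})^T
        if x is None or x.sum() != t:
            continue                        # singular matrix or wrong weight: retry
        e = np.zeros(n, dtype=np.int64)
        e[Jbar] = x                         # e_J = 0
        return e
    return None
```

For instance, feeding it the $(H, s)$ pair produced by the `noisy_codeword_to_sd` sketch above recovers a weight-$t$ error when $t < (n-k)/2$.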

Picking $x$ in step 3 is like fixing $k$ unknowns to the values that we want. If we want to find a solution $e$ of small Hamming weight, fixing $x$ to the zero vector is helpful; if we want to find an $e$ of large weight, fixing $x$ to a non-zero vector improves the success probability. This is why we define the distribution $D_t$.

$D_t$ is defined as follows (a sampling sketch is given after the list):

  • If $t < \frac{q-1}{q}(n-k)$, $D_t$ only outputs $0 \in \mathbb{F}_q^k$.

  • If $t \in \left(\frac{q-1}{q}(n-k),\ k + \frac{q-1}{q}(n-k)\right)$, $D_t$ outputs uniform vectors of weight $t - \frac{q-1}{q}(n-k)$.

  • If $t > k + \frac{q-1}{q}(n-k)$, $D_t$ outputs uniform vectors of weight $k$.
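
A sketch of sampling from $D_t$ (illustrative; the weight in the middle case is rounded to an integer, a detail left implicit above, and the function name is mine):

```python
import numpy as np

def sample_Dt(n, k, t, q=2, rng=np.random.default_rng()):
    """Sample x in F_q^k from D_t: weight 0, ~t - (q-1)(n-k)/q, or k."""
    mean_tail = (q - 1) * (n - k) / q      # expected weight contributed by e_Jbar
    w = 0 if t < mean_tail else min(k, round(t - mean_tail))
    x = np.zeros(k, dtype=np.int64)
    pos = rng.choice(k, size=w, replace=False)
    x[pos] = rng.integers(1, q, size=w)    # nonzero entries on w random positions
    return x
```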

Rough Analysis:

Assume $s$ is uniformly distributed over $\mathbb{F}_q^{n-k}$, which is equivalent to assuming that $t/n$ belongs to $[\tau_-, \tau_+]$; the expected Hamming weight of $e$ is then given by

$$\mathbb{E}(|e|) = |x| + \frac{q-1}{q}(n-k)$$

where $\frac{q-1}{q}(n-k)$ is the expected value of the binomial distribution $\mathcal{B}\left(n-k, \frac{q-1}{q}\right)$.

Since $x \in \mathbb{F}_q^k$, we can easily reach any weight in

$$\left(\frac{q-1}{q}(n-k),\ k + \frac{q-1}{q}(n-k)\right)$$

by carefully choosing $|x| \in \{0, \dots, k\}$. This explains why DP is easy to solve in this interval.

Precise Analysis:

....

Dumer takes advantage of the birthday paradox to improve the search.

Lemma 3.2.1. Let $\mathcal{L}_1$, $\mathcal{L}_2$ be two lists of $L$ random and independent elements of $\mathbb{F}_q^r$. We have

$$\mathbb{E}(|\mathcal{L}_1 \cap \mathcal{L}_2|) = \frac{L^2}{q^r}$$

We expect one element in the intersection of the two lists when the list size is $L = \sqrt{q^r} = q^{r/2}$.
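
A quick empirical check of the lemma (an illustrative sketch; the intersection is counted with multiplicity, i.e., as the number of colliding pairs, and vectors of $\mathbb{F}_q^r$ are encoded as integers):

```python
import numpy as np

# E(#{(a,b) in L1 x L2 : a = b}) should be L^2 / q^r
rng = np.random.default_rng(1)
q, r, L, trials = 2, 10, 32, 2000           # L = q^{r/2} = 32, so expect ~1
counts = []
for _ in range(trials):
    L1 = rng.integers(0, q ** r, L)          # each integer encodes a vector of F_q^r
    L2 = rng.integers(0, q ** r, L)
    counts.append(int((L1[:, None] == L2[None, :]).sum()))
print(np.mean(counts), L ** 2 / q ** r)      # both ~1.0
```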

 
