Pseudorandom error-correcting code

A pseudorandom error-correcting code (PRC) is a type of secret-key encryption (SKE) scheme whose decoding is required to be robust to a bounded number of modifications to the ciphertext. PRCs were introduced in CG24. There is additionally a zero-bit PRC, which does not carry a message. Both variants are useful for constructing cryptographic watermarks for generative AI.

Syntax

An $L$-bit PRC is a tuple of efficient algorithms $(Gen, Enc, Dec)$, with respect to key space $K$, message space $\{0, 1\}^{L}$, and ciphertext space $\{0, 1\}^{n}$, such that

$Gen(1^{λ}) \to k$ is a randomized algorithm that takes a security parameter and outputs a key $k \in K$,
$Enc_{k}(m) \to c$ is a randomized algorithm that takes a key $k \in K$ and a message $m \in \{0, 1\}^{L}$, and outputs a ciphertext $c \in \{0, 1\}^{n}$,
$Dec_{k}(c) \to \{m, ⊥\}$ is a deterministic algorithm that takes a key $k \in K$ and a candidate ciphertext $c \in \{0, 1\}^{n}$, and outputs either a message $m \in \{0, 1\}^{L}$ or $⊥$.

A zero-bit PRC has the same requirements as an $L$-bit PRC, except that the message space is the singleton set $\{1\}$, so $Enc$ takes no message input and simply outputs codewords. $Dec$ then only detects whether the candidate ciphertext is close to a codeword.
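To make the syntax concrete, the following is a minimal sketch of the $(Gen, Enc, Dec)$ interface, instantiated with a toy masked repetition code. The scheme, the parameter names `REP` and `MAX_MINORITY`, and the `noise_rate` argument are all assumptions made for illustration and do not come from CG24. The toy is deliberately not pseudorandom, because it reuses one mask for every codeword; it only shows the shape of the three algorithms and how $Dec$ can return $⊥$.

```python
import secrets
from typing import List, Optional

L = 4              # message length in bits (toy value)
REP = 25           # repetitions per message bit (hypothetical parameter)
N = L * REP        # codeword length n
MAX_MINORITY = 5   # a block decodes only if at most this many bits disagree

def gen() -> List[int]:
    """Gen: sample a uniformly random n-bit mask as the key."""
    return [secrets.randbits(1) for _ in range(N)]

def enc(k: List[int], m: List[int], noise_rate: float = 0.02) -> List[int]:
    """Enc: repeat each message bit, add sparse random noise, then mask with k."""
    assert len(m) == L
    rng = secrets.SystemRandom()
    repeated = [bit for bit in m for _ in range(REP)]
    noise = [1 if rng.random() < noise_rate else 0 for _ in range(N)]
    return [r ^ e ^ kb for r, e, kb in zip(repeated, noise, k)]

def dec(k: List[int], c: List[int]) -> Optional[List[int]]:
    """Dec: unmask and majority-vote each block; None plays the role of ⊥."""
    unmasked = [cb ^ kb for cb, kb in zip(c, k)]
    message = []
    for i in range(L):
        ones = sum(unmasked[i * REP:(i + 1) * REP])
        if min(ones, REP - ones) > MAX_MINORITY:
            return None  # block is not close to all-zeros or all-ones
        message.append(1 if ones > REP - ones else 0)
    return message
```

With these toy parameters, `dec(k, enc(k, m))` recovers `m` as long as no block has more than `MAX_MINORITY` flipped bits, and a string that is far from every codeword is rejected.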

Properties

Pseudorandomness

We define the advantage of a distinguisher $D$ as $Adv_{D}^{prc}(λ) := \left| Pr[D^{Enc_{k}}(1^{λ}) = 1] - Pr[D^{R}(1^{λ}) = 1] \right|,$ where $k \leftarrow Gen(1^{λ})$ and $R$ is a random response oracle, which on each query returns a fresh uniformly random $n$-bit string (even when the same input is repeated, unlike a random oracle).

A PRC is pseudorandom if for all efficient $D$ there exists a negligible function $ν$ such that $Adv_{D}^{prc}(λ) \leq ν(λ)$.
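The distinguishing game can be sketched as a small experiment. The code below is illustrative only: the zero-bit scheme (codeword = key XOR sparse noise), the distinguisher, and the helper names are assumptions chosen so that the attack is easy to see. It estimates the advantage empirically by running the distinguisher against the encoding oracle and against the random response oracle $R$.

```python
import secrets

N = 64  # codeword length n (toy value)

def gen():
    """Gen: a uniformly random n-bit key."""
    return [secrets.randbits(1) for _ in range(N)]

def make_enc_oracle(k):
    """Enc_k for a deliberately weak toy scheme: codeword = k XOR sparse noise."""
    rng = secrets.SystemRandom()
    def oracle(_m=None):
        return [kb ^ (1 if rng.random() < 0.05 else 0) for kb in k]
    return oracle

def random_response_oracle(_m=None):
    """R: a fresh uniform n-bit string on every query, even for repeated inputs."""
    return [secrets.randbits(1) for _ in range(N)]

def distinguisher(oracle):
    """D: query twice and output 1 if the two responses are suspiciously close."""
    a, b = oracle(), oracle()
    distance = sum(x ^ y for x, y in zip(a, b))
    return 1 if distance < N // 4 else 0

def estimate_advantage(trials=200):
    """Empirical |Pr[D^Enc = 1] - Pr[D^R = 1]| over fresh keys."""
    real = sum(distinguisher(make_enc_oracle(gen())) for _ in range(trials))
    ideal = sum(distinguisher(random_response_oracle) for _ in range(trials))
    return abs(real - ideal) / trials
```

For this toy scheme the estimate is close to 1, because two codewords under the same key are nearly identical. For a pseudorandom PRC, no efficient distinguisher should achieve more than negligible advantage.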

Completeness/Robustness

A PRC is $ε$-robust if there is a negligible function $ν$ such that for every message $m$ and every $ε$-bounded channel $E$, $Pr[Dec_{k}(E(Enc_{k}(m))) \neq m] \leq ν(λ),$ where $k \leftarrow Gen(1^{λ})$. Here an $ε$-bounded channel is a length-preserving function $E$ such that for every $n$-bit string $c$, $|E(c) \oplus c| \leq ε \cdot n$, where $|\cdot|$ denotes Hamming weight. In other words, $E$ changes at most an $ε$ fraction of the bits.
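As a small illustration of the channel condition, the sketch below implements one particular $ε$-bounded channel (random bit flips within a budget) and a check of the Hamming-distance bound. The function names are hypothetical, and a real adversarial channel may choose which positions to corrupt rather than sampling them at random.

```python
import random

def hamming_distance(a, b):
    """Number of positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def bounded_channel(c, eps, rng):
    """Length-preserving channel that flips int(eps * n) randomly chosen positions."""
    n = len(c)
    budget = int(eps * n)  # never exceeds eps * n
    out = list(c)
    for i in rng.sample(range(n), budget):
        out[i] ^= 1
    return out

def within_bound(received, c, eps):
    """Check the eps-bounded condition for one input/output pair."""
    return len(received) == len(c) and hamming_distance(received, c) <= eps * len(c)
```

For example, with `c = [0] * 100` and `eps = 0.1`, the channel flips 10 bits and `within_bound` holds for `eps = 0.1` but not for `eps = 0.05`.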

Soundness

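A PRC is sound if there is a negligible function $ν$ such that for every fixed string $\hat{c} \in \{0, 1\}^{n}$, $Pr_{k \leftarrow Gen(1^{λ})}[Dec_{k}(\hat{c}) \neq ⊥] \leq ν(λ).$ That is, a string chosen independently of the key is rejected except with negligible probability.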
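To see why soundness is plausible for threshold-style detectors, consider a hypothetical zero-bit decoder (an assumption for illustration, not the CG24 construction) that accepts $\hat{c}$ when it agrees with a uniformly random key-derived $n$-bit pattern in at least `threshold` positions. For a fixed $\hat{c}$, the number of agreements is distributed as $Bin(n, 1/2)$ over the choice of key, so the false-positive probability is a binomial tail. The sketch computes the tail exactly and compares it with the Hoeffding bound $e^{-2δ^{2}n}$ for $δ = threshold/n - 1/2$.

```python
from math import comb, exp

def false_positive_probability(n: int, threshold: int) -> float:
    """Pr[Bin(n, 1/2) >= threshold], computed exactly from binomial coefficients."""
    favourable = sum(comb(n, j) for j in range(threshold, n + 1))
    return favourable / 2 ** n

def hoeffding_bound(n: int, threshold: int) -> float:
    """Upper bound exp(-2 * delta^2 * n) on the same tail, valid for delta > 0."""
    delta = threshold / n - 0.5
    return exp(-2 * delta ** 2 * n) if delta > 0 else 1.0
```

With the threshold fixed at a constant fraction above $1/2$, both quantities decay exponentially in $n$, which is the behaviour the soundness definition asks for.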

Variations

Adaptive robustness

A PRC with adaptive robustness strengthens the robustness property by allowing the channel $E$ to be chosen after seeing the codeword $c = Enc_{k}(m)$, rather than being fixed in advance. Formally, the adversarial channel $E$ may depend on $c$ (but not directly on $k$ or $m$). This models a stronger adversary who can tailor the corruption pattern to the specific codeword.

Ideal PRC

An ideal PRC is one that is indistinguishable from an ideal functionality, even to an adversary with oracle access to both encoding and decoding, so that pseudorandomness and robustness are captured by a single definition. This is stronger than pseudorandomness alone, where the distinguisher only has access to an encoding oracle. The guarantee cannot extend to an adversary who holds the key $k$ itself: such an adversary can run $Dec_{k}$, which by robustness returns $m$ on a codeword and by soundness returns $⊥$ on a uniformly random string, except with negligible probability.

Zero-bit PRC

A zero-bit PRC has the singleton message space $\{1\}$: the encoder takes no message input and simply outputs a codeword, while the decoder detects whether a candidate string is close to a codeword. Zero-bit PRCs are useful for watermarking generative AI outputs: a pseudorandom codeword is embedded into the generated text or image so that anyone holding the secret key can detect it, while the codeword looks uniformly random to anyone without the key (CG24).

Other results

PRCs were introduced in CG24, motivated by undetectable watermarking of AI-generated content.
Zero-bit PRCs give a watermarking scheme for language model outputs that is undetectable (watermarked outputs are computationally indistinguishable from unwatermarked ones) and robust to cropping and a constant rate of random substitutions and deletions (CG24).
PRCs can be constructed from the Learning Parity with Noise (LPN) assumption: codewords are noisy codewords of a code with secret sparse parity checks, which look pseudorandom without the key and remain decodable under bounded noise with it (CG24).
The zero-bit PRC construction has codewords of length $n = O(λ^{2} / \log λ)$ and is robust to an $ε$ fraction of bit flips for $ε < 1/2 - 1/poly(λ)$ (CG24).
