[FGJ+25] Publicly-Detectable Watermarking for Language Models

Authors: Jaiden Fairoze, Sanjam Garg, Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody, Mingyuan Wang | Venue: CiC Vol 1, No 4 (2025)

Abstract

We present a publicly-detectable watermarking scheme for LMs: the detection algorithm contains no secret information, and it is executable by anyone. We embed a publicly-verifiable cryptographic signature into LM output using rejection sampling and prove that this produces unforgeable and distortion-free (i.e., undetectable without access to the public key) text output. We make use of error-correction to overcome periods of low entropy, a barrier for all prior watermarking schemes. We implement our scheme and find that our formal claims are met in practice.
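
Illustrative sketch (not from the paper)

The rejection-sampling idea mentioned in the abstract can be illustrated with a toy example. This is only a minimal sketch under strong assumptions, not the authors' construction. A uniform random sampler stands in for the language model, and a fixed bit string stands in for the signature. The names `VOCAB`, `CHUNK`, `chunk_bit`, `toy_sample`, `embed_bits`, and `extract_bits` are all hypothetical. Signing, public-key verification, and the error-correction layer are omitted entirely. For each payload bit, the sketch resamples a short chunk of tokens until a public hash of the chunk matches that bit; anyone can then recover the bits by recomputing the hash.

```python
import hashlib
import random

# Hypothetical toy vocabulary; a real scheme samples from a language model.
VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "tree"]
CHUNK = 4  # tokens used to carry one payload bit (assumed parameter)


def chunk_bit(chunk):
    """Public, keyless map from a token chunk to a single bit."""
    digest = hashlib.sha256(" ".join(chunk).encode("utf-8")).digest()
    return digest[0] & 1


def toy_sample(rng):
    """Stand-in for drawing CHUNK tokens from a language model."""
    return [rng.choice(VOCAB) for _ in range(CHUNK)]


def embed_bits(bits, rng, max_tries=1000):
    """Rejection-sample one chunk per bit until its hash equals the bit."""
    tokens = []
    for bit in bits:
        for _ in range(max_tries):
            chunk = toy_sample(rng)
            if chunk_bit(chunk) == bit:
                tokens.extend(chunk)
                break
        else:
            # With too little entropy, no candidate chunk may ever match.
            raise RuntimeError("could not embed bit within max_tries")
    return tokens


def extract_bits(tokens):
    """Recover the payload by hashing each consecutive chunk."""
    return [chunk_bit(tokens[i:i + CHUNK]) for i in range(0, len(tokens), CHUNK)]


# In the paper's setting the payload would be a cryptographic signature;
# here it is an arbitrary bit string chosen for illustration.
payload = [1, 0, 1, 1, 0, 0, 1, 0]
text = embed_bits(payload, random.Random(0))
recovered = extract_bits(text)
```

In this toy, `recovered` equals `payload` because every accepted chunk was chosen to hash to its bit. The `RuntimeError` branch hints at the low-entropy problem the abstract refers to: if the sampler can produce only a few distinct chunks, a matching one may not exist, and the sketch has nothing comparable to the paper's error correction to recover from that.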

BibTeX

@Article{CiC:FGJMMW24,
  author = {Jaiden Fairoze and Sanjam Garg and Somesh Jha and Saeed Mahloujifar and Mohammad Mahmoody and Mingyuan Wang},
  title = {Publicly-Detectable Watermarking for Language Models},
  pages = {31},
  journal = {IACR Communications in Cryptology (CiC)},
  volume = {1},
  number = {4},
  publisher = {IACR},
  year = {2024},
  doi = {10.62056/ahmpdkp10},
}