Rényi Divergence for PMFs #
This file defines Rényi divergence on probability mass functions, following the
multiplicative convention R_α(p ‖ q) = (∑ x, p(x)^α · q(x)^(1-α))^(1/(α-1)) used
in the lattice cryptography literature (Falcon, GPV, etc.).
We also define the max-divergence (Rényi divergence of order ∞) as ⨆ x, p(x) / q(x).
Main Definitions #
- `PMF.renyiMGF a p q : ℝ≥0∞`: the Rényi moment generating function (the base quantity before exponentiation), `∑ x, p(x)^a * q(x)^(1-a)`
- `PMF.renyiDiv a p q : ℝ≥0∞`: the multiplicative Rényi divergence of order `a > 1`, `(renyiMGF a p q)^(1/(a-1))`
- `PMF.maxDiv p q : ℝ≥0∞`: the max-divergence (order `∞`), `⨆ x, p(x) / q(x)`
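The three definitions can be sketched in Lean as follows. This is an illustrative sketch against Mathlib's `PMF` API (where `p x : ℝ≥0∞` and `^` with a real exponent is `ENNReal.rpow`), not necessarily the file's actual code:

```lean
import Mathlib

open scoped ENNReal

variable {α : Type*}

/-- Rényi moment generating function `∑ x, p(x)^a * q(x)^(1-a)` (sketch). -/
noncomputable def PMF.renyiMGF (a : ℝ) (p q : PMF α) : ℝ≥0∞ :=
  ∑' x, p x ^ a * q x ^ (1 - a)

/-- Multiplicative Rényi divergence of order `a > 1` (sketch). -/
noncomputable def PMF.renyiDiv (a : ℝ) (p q : PMF α) : ℝ≥0∞ :=
  PMF.renyiMGF a p q ^ (1 / (a - 1))

/-- Max-divergence: supremum of the pointwise likelihood ratio (sketch). -/
noncomputable def PMF.maxDiv (p q : PMF α) : ℝ≥0∞ :=
  ⨆ x, p x / q x
```

Working in `ℝ≥0∞` means the `tsum` and the supremum always exist, so no side conditions are needed in the definitions themselves.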
Main Results #
- `renyiMGF_self`: `M_a(p ‖ p) = 1`
- `renyiDiv_self`: `R_a(p ‖ p) = 1`
- `renyiMGF_map_le`: data processing inequality for deterministic maps
- `renyiDiv_map_le`: data processing inequality for the Rényi divergence
- `renyiDiv_prob_bound`: probability preservation bound
Design Notes #
We use ℝ≥0∞ throughout rather than ℝ to avoid partiality issues (Rényi divergence can
be infinite when q has zeros that p doesn't). The multiplicative form (not the log form)
is used because:
- The Falcon literature uses `R_α` (multiplicative), not `D_α` (log form).
- The multiplicative form composes better: `R_α(p₁⊗p₂ ‖ q₁⊗q₂) = R_α(p₁‖q₁) · R_α(p₂‖q₂)`.
- The probability preservation bound is cleaner: `q(E) ≥ p(E)^{α/(α-1)} / R_α(p‖q)`.
References #
- Rényi, A. "On measures of entropy and information." 1961.
- van Erven, T., Harremoës, P. "Rényi divergence and Kullback-Leibler divergence." 2014.
- Bai, S., et al. "An Improved Compression Technique for Signatures Based on Learning with Errors." 2014. (multiplicative convention for lattice crypto)
- Fouque, P.-A., et al. "A closer look at Falcon." ePrint 2024/1769.
Rényi Moment Generating Function #
Multiplicative Rényi Divergence #
Max-Divergence (Rényi order ∞) #
The max-divergence (Rényi divergence of order ∞): ⨆ x, p(x) / q(x).
This bounds the pointwise likelihood ratio.
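The statement that `maxDiv` bounds the pointwise likelihood ratio is immediate from the supremum; as a hedged illustration (assuming `PMF.maxDiv` is defined as the `⨆` above):

```lean
-- Each pointwise ratio is at most the supremum over all points,
-- so `p x ≤ maxDiv p q * q x` whenever `q x ≠ 0` (and trivially when `p x = 0`).
example {α : Type*} (p q : PMF α) (x : α) : p x / q x ≤ PMF.maxDiv p q :=
  le_iSup (fun y => p y / q y) x
```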
Data Processing Inequality #
Data processing inequality for the Rényi MGF under deterministic maps:
M_a(f∗p ‖ f∗q) ≤ M_a(p ‖ q).
Applying a deterministic function can only decrease the Rényi MGF.
Data processing inequality for the multiplicative Rényi divergence:
R_a(f∗p ‖ f∗q) ≤ R_a(p ‖ q).
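Both inequalities can be stated as follows. This is a statement-only sketch (proofs omitted); it assumes `PMF.map` is the pushforward along `f` and uses the lemma names documented above, with illustrative hypotheses:

```lean
theorem PMF.renyiMGF_map_le {α β : Type*} (a : ℝ) (ha : 1 < a)
    (f : α → β) (p q : PMF α) :
    PMF.renyiMGF a (p.map f) (q.map f) ≤ PMF.renyiMGF a p q := by
  sorry -- on each fiber of `f`: `(∑ p)^a * (∑ q)^(1-a) ≤ ∑ p^a * q^(1-a)`

theorem PMF.renyiDiv_map_le {α β : Type*} (a : ℝ) (ha : 1 < a)
    (f : α → β) (p q : PMF α) :
    PMF.renyiDiv a (p.map f) (q.map f) ≤ PMF.renyiDiv a p q := by
  sorry -- monotonicity of `x ↦ x ^ (1/(a-1))` applied to `renyiMGF_map_le`
```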
Multiplicativity (Product Distributions) #
Multiplicativity of the Rényi MGF for independent product distributions:
M_a(p₁⊗p₂ ‖ q₁⊗q₂) = M_a(p₁ ‖ q₁) · M_a(p₂ ‖ q₂).
Multiplicativity of the Rényi divergence for independent product distributions:
R_a(p₁⊗p₂ ‖ q₁⊗q₂) = R_a(p₁ ‖ q₁) · R_a(p₂ ‖ q₂).
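A possible Lean rendering of the MGF multiplicativity, writing the independent product via the `PMF` monad (`bind`/`map`); the name `PMF.prodMk` is hypothetical, chosen only for this sketch:

```lean
/-- Independent product of two PMFs via the `PMF` monad (illustrative name). -/
noncomputable def PMF.prodMk {α β : Type*} (p₁ : PMF α) (p₂ : PMF β) :
    PMF (α × β) :=
  p₁.bind fun a => p₂.map (Prod.mk a)

theorem PMF.renyiMGF_prod {α β : Type*} (a : ℝ)
    (p₁ q₁ : PMF α) (p₂ q₂ : PMF β) :
    PMF.renyiMGF a (p₁.prodMk p₂) (q₁.prodMk q₂) =
      PMF.renyiMGF a p₁ q₁ * PMF.renyiMGF a p₂ q₂ := by
  sorry -- the double sum factors: `∑ (x,y), (p₁ x * p₂ y)^a * (q₁ x * q₂ y)^(1-a)`
```

The divergence version then follows by raising both sides to the power `1/(a-1)`, since `(s * t)^c = s^c * t^c` in `ℝ≥0∞`.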
Probability Preservation #
Pointwise probability preservation: for a single element x,
p(x)^{a/(a-1)} / R_a(p ‖ q) ≤ q(x).
Proof: from p(x)^a · q(x)^{1-a} ≤ M_a(p ‖ q) (a single term of the defining sum)
we get p(x)^a ≤ M_a(p ‖ q) · q(x)^{a-1}; raising both sides to the power 1/(a-1) yields the claim.
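The chain of inequalities, written out in full (all quantities in ℝ≥0∞, with a > 1):

```latex
p(x)^a \, q(x)^{1-a} \le M_a(p \,\|\, q)
\;\Longrightarrow\;
p(x)^a \le M_a(p \,\|\, q) \cdot q(x)^{a-1}
\;\Longrightarrow\;
p(x)^{a/(a-1)} \le M_a(p \,\|\, q)^{1/(a-1)} \cdot q(x)
  = R_a(p \,\|\, q) \cdot q(x).
```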
Probability preservation bound: for any event E,
q(E) ≥ p(E)^{a/(a-1)} / R_a(p ‖ q).
This is the key lemma used in security reductions involving Rényi divergence:
if R_a(p ‖ q) is small, then events that are likely under p remain likely under q.
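A possible Lean statement of the event-level bound, assuming events are measured via Mathlib's `PMF.toOuterMeasure`; statement-only sketch, hypothesis names illustrative:

```lean
theorem PMF.renyiDiv_prob_bound {α : Type*} (a : ℝ) (ha : 1 < a)
    (p q : PMF α) (E : Set α) :
    p.toOuterMeasure E ^ (a / (a - 1)) / PMF.renyiDiv a p q ≤
      q.toOuterMeasure E := by
  sorry -- via Hölder applied to the sums restricted to `E`
```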
From Relative Error to Rényi Divergence #
If the pointwise relative error between two PMFs is bounded by δ,
then the Rényi MGF is bounded. Specifically, if p(x) ≤ (1 + δ) · q(x) for all x,
then M_a(p ‖ q) ≤ (1 + δ)^(a-1): each term satisfies
p(x)^a · q(x)^{1-a} = p(x) · (p(x)/q(x))^{a-1} ≤ p(x) · (1 + δ)^{a-1},
and the p(x) sum to 1.
This is used to convert floating-point error bounds into Rényi divergence bounds.
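As a hedged Lean rendering (the lemma name, the `ENNReal.ofReal` coercion of δ, and the hypotheses are all illustrative):

```lean
theorem PMF.renyiMGF_le_of_rel_error {α : Type*} (a δ : ℝ) (ha : 1 ≤ a)
    (p q : PMF α) (h : ∀ x, p x ≤ (1 + ENNReal.ofReal δ) * q x) :
    PMF.renyiMGF a p q ≤ (1 + ENNReal.ofReal δ) ^ (a - 1) := by
  sorry -- termwise: `p x ^ a * q x ^ (1-a) ≤ p x * (1 + δ) ^ (a-1)`, then `∑' x, p x = 1`
```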
Divergence to TV Distance #
Max-divergence bounds TV distance: ‖p - q‖_TV ≤ 1 - 1 / D_∞(p ‖ q).
When D = maxDiv p q = ⨆ x, p(x)/q(x) is finite, we have q(x) ≥ p(x)/D pointwise,
so ∑ min(p(x), q(x)) ≥ 1/D, giving TV(p,q) = 1 - ∑ min(p(x),q(x)) ≤ 1 - 1/D.
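The argument above can be packaged as the statement `etvDist_le_of_maxDiv` mentioned below; a statement-only sketch, assuming an `etvDist p q : ℝ≥0∞` total variation distance on PMFs:

```lean
theorem etvDist_le_of_maxDiv {α : Type*} (p q : PMF α) :
    etvDist p q ≤ 1 - (PMF.maxDiv p q)⁻¹ := by
  sorry -- from `q x ≥ p x / maxDiv p q` pointwise and `TV = 1 - ∑' x, min (p x) (q x)`
```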
Finite-order Rényi divergence bounds TV distance (squared form):
TV(p, q)² ≤ 1 - R_a(p ‖ q)⁻¹.
Proof sketch (not yet formalized):
- TV ≤ √(1 - BC²), where BC = ∑ √(p(x) q(x)) is the Bhattacharyya coefficient
  (the Hellinger–TV inequality, via Cauchy–Schwarz on |√p - √q| · |√p + √q|).
- BC = M_{1/2}(p ‖ q), the Rényi MGF at order 1/2.
- By log-convexity of α ↦ M_α (interpolating at 1/2, 1, α): BC ≥ R_α^{-1/2}, so BC² ≥ R_α⁻¹, giving TV² ≤ 1 - R_α⁻¹.
Formalizing this requires Hellinger distance and log-convexity of the Rényi MGF, neither of which is currently available.
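The log-convexity step can be made explicit. Write 1 as a convex combination of 1/2 and α: 1 = t · (1/2) + (1 - t) · α with t = (α - 1)/(α - 1/2). Since M_1 = 1, log-convexity of α ↦ M_α gives:

```latex
1 = M_1 \le M_{1/2}^{\,t} \, M_\alpha^{\,1-t}
\;\Longrightarrow\;
M_{1/2} \ge M_\alpha^{-(1-t)/t} = M_\alpha^{-1/(2(\alpha-1))} = R_\alpha^{-1/2},
```

using (1 - t)/t = (1/2)/(α - 1). Squaring gives BC² ≥ R_α⁻¹ as claimed.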
Compare etvDist_le_of_maxDiv (the analogous linear bound for max-divergence).