
ToMathlib.Probability.ProbabilityMassFunction.RenyiDivergence

Rényi Divergence for PMFs #

This file defines Rényi divergence on probability mass functions, following the multiplicative convention R_α(p ‖ q) = (∑ x, p(x)^α · q(x)^(1-α))^(1/(α-1)) used in the lattice cryptography literature (Falcon, GPV, etc.).

We also define the max-divergence (Rényi divergence of order ∞) as ⨆ x, p(x) / q(x).

Main Definitions #

Main Results #

Design Notes #

We use ℝ≥0∞ throughout rather than ℝ to avoid partiality issues (Rényi divergence can be infinite when q has zeros that p doesn't). The multiplicative form (not the log form) is used because:

  1. The Falcon literature uses R_α (multiplicative), not D_α (log form).
  2. Multiplicative form composes better: R_α(p₁⊗p₂ ‖ q₁⊗q₂) = R_α(p₁‖q₁) · R_α(p₂‖q₂).
  3. The probability preservation bound is cleaner: q(E) ≥ p(E)^{α/(α-1)} / R_α(p‖q).
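The two conventions differ only by a logarithm, so bounds transfer directly; writing out the definition (consistent with the formula above):

```latex
D_\alpha(p \parallel q)
  = \frac{1}{\alpha - 1} \log \sum_x p(x)^\alpha\, q(x)^{1-\alpha}
  = \log R_\alpha(p \parallel q).
```

In particular, a multiplicative bound R_α ≤ B is the same statement as the additive bound D_α ≤ log B.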

References #

Rényi Moment Generating Function #

noncomputable def PMF.renyiMGF {α : Type u_1} (a : ℝ) (p q : PMF α) : ℝ≥0∞

The Rényi moment generating function of order a ∈ ℝ between two PMFs: M_a(p ‖ q) = ∑ x, p(x)^a * q(x)^(1-a). This is the base quantity whose 1/(a-1) power gives the multiplicative Rényi divergence.
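The definition body is not reproduced on this page; a plausible sketch of it, written against Mathlib's `PMF` and `ENNReal.rpow` (the primed name marks it as a reconstruction, not the file's actual code):

```lean
-- Sketch: sum of p(x)^a * q(x)^(1-a) over all x, with real powers
-- taken in ℝ≥0∞ via ENNReal.rpow (the ^ below), so terms where q
-- vanishes but p does not contribute ∞.
noncomputable def renyiMGF' {α : Type*} (a : ℝ) (p q : PMF α) : ℝ≥0∞ :=
  ∑' x, p x ^ a * q x ^ (1 - a)
```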

@[simp]
theorem PMF.renyiMGF_self {α : Type u_1} (a : ℝ) (p : PMF α) :
PMF.renyiMGF a p p = 1

Multiplicative Rényi Divergence #

noncomputable def PMF.renyiDiv {α : Type u_1} (a : ℝ) (p q : PMF α) : ℝ≥0∞

The multiplicative Rényi divergence of order a > 1 between two PMFs: R_a(p ‖ q) = M_a(p ‖ q)^(1/(a-1)).

When a ≤ 1, this returns 1 (the trivial bound).
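A sketch of how such a definition could look (a reconstruction consistent with `renyiDiv_eq_rpow` below and the a ≤ 1 convention, not the file's actual code):

```lean
-- Sketch: M_a(p ‖ q)^(1/(a-1)) for a > 1, and the trivial bound 1 otherwise.
noncomputable def renyiDiv' {α : Type*} (a : ℝ) (p q : PMF α) : ℝ≥0∞ :=
  if 1 < a then PMF.renyiMGF a p q ^ (a - 1)⁻¹ else 1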

@[simp]
theorem PMF.renyiDiv_self {α : Type u_1} (a : ℝ) (p : PMF α) :
PMF.renyiDiv a p p = 1

theorem PMF.renyiDiv_eq_rpow {α : Type u_1} {a : ℝ} (ha : 1 < a) (p q : PMF α) :
PMF.renyiDiv a p q = PMF.renyiMGF a p q ^ (a - 1)⁻¹

Max-Divergence (Rényi order ∞) #

noncomputable def PMF.maxDiv {α : Type u_1} (p q : PMF α) : ℝ≥0∞

The max-divergence (Rényi divergence of order ∞): ⨆ x, p(x) / q(x). This bounds the pointwise likelihood ratio.
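A sketch of the definition (again a reconstruction, not the file's code):

```lean
-- Sketch: supremum of pointwise likelihood ratios in ℝ≥0∞. Since
-- x / 0 = ∞ in ℝ≥0∞ for x ≠ 0, maxDiv is infinite whenever q misses
-- part of p's support.
noncomputable def maxDiv' {α : Type*} (p q : PMF α) : ℝ≥0∞ :=
  ⨆ x, p x / q x
```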

@[simp]
theorem PMF.maxDiv_self {α : Type u_1} (p : PMF α) :
p.maxDiv p = 1

theorem PMF.maxDiv_one_le {α : Type u_1} (p q : PMF α) :
1 ≤ p.maxDiv q

Data Processing Inequality #

theorem PMF.renyiMGF_map_le {α' β : Type u₀} (a : ℝ) (ha : 1 ≤ a) (f : α' → β) (p q : PMF α') :
PMF.renyiMGF a (f <$> p) (f <$> q) ≤ PMF.renyiMGF a p q

Data processing inequality for the Rényi MGF under deterministic maps: M_a(f∗p ‖ f∗q) ≤ M_a(p ‖ q). Applying a deterministic function can only decrease the Rényi MGF.

theorem PMF.renyiDiv_map_le {α' β : Type u₀} (a : ℝ) (ha : 1 < a) (f : α' → β) (p q : PMF α') :
PMF.renyiDiv a (f <$> p) (f <$> q) ≤ PMF.renyiDiv a p q

Data processing inequality for the multiplicative Rényi divergence: R_a(f∗p ‖ f∗q) ≤ R_a(p ‖ q).

theorem PMF.renyiMGF_bind_right_le {α' β : Type u₀} (a : ℝ) (ha : 1 ≤ a) (f : α' → PMF β) (p q : PMF α') :
PMF.renyiMGF a (p.bind f) (q.bind f) ≤ PMF.renyiMGF a p q

Data processing inequality for the Rényi MGF under Markov kernels (post-processing).

theorem PMF.renyiDiv_bind_right_le {α' β : Type u₀} (a : ℝ) (ha : 1 < a) (f : α' → PMF β) (p q : PMF α') :
PMF.renyiDiv a (p.bind f) (q.bind f) ≤ PMF.renyiDiv a p q

Data processing inequality for the Rényi divergence under Markov kernels.
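Because the bound is stated against `bind`, it composes: a hypothetical usage sketch (assuming the signatures above) showing that a two-stage post-processing pipeline is still dominated by the original divergence:

```lean
-- Sketch: apply the kernel DPI once for g, then once for f.
example {α β γ : Type} (a : ℝ) (ha : 1 < a) (p q : PMF α)
    (f : α → PMF β) (g : β → PMF γ) :
    PMF.renyiDiv a ((p.bind f).bind g) ((q.bind f).bind g)
      ≤ PMF.renyiDiv a p q :=
  le_trans (PMF.renyiDiv_bind_right_le a ha g (p.bind f) (q.bind f))
    (PMF.renyiDiv_bind_right_le a ha f p q)
```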

Multiplicativity (Product Distributions) #

theorem PMF.renyiMGF_prod {α' β : Type u₁} (a : ℝ) (p₁ q₁ : PMF α') (p₂ q₂ : PMF β) :
PMF.renyiMGF a (p₁.bind fun (x : α') => Prod.mk x <$> p₂) (q₁.bind fun (x : α') => Prod.mk x <$> q₂) = PMF.renyiMGF a p₁ q₁ * PMF.renyiMGF a p₂ q₂

Multiplicativity of the Rényi MGF for independent product distributions: M_a(p₁⊗p₂ ‖ q₁⊗q₂) = M_a(p₁ ‖ q₁) · M_a(p₂ ‖ q₂).

theorem PMF.renyiDiv_prod {α' β : Type u₁} (a : ℝ) (ha : 1 < a) (p₁ q₁ : PMF α') (p₂ q₂ : PMF β) :
PMF.renyiDiv a (p₁.bind fun (x : α') => Prod.mk x <$> p₂) (q₁.bind fun (x : α') => Prod.mk x <$> q₂) = PMF.renyiDiv a p₁ q₁ * PMF.renyiDiv a p₂ q₂

Multiplicativity of the Rényi divergence for independent product distributions: R_a(p₁⊗p₂ ‖ q₁⊗q₂) = R_a(p₁ ‖ q₁) · R_a(p₂ ‖ q₂).
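A hypothetical usage sketch (assuming the statement above): divergence bounds on the components multiply into a bound on the product distribution.

```lean
-- Sketch: rewrite with renyiDiv_prod, then multiply the component bounds.
example {α β : Type} (a : ℝ) (ha : 1 < a) (p₁ q₁ : PMF α) (p₂ q₂ : PMF β)
    (B₁ B₂ : ℝ≥0∞) (h₁ : PMF.renyiDiv a p₁ q₁ ≤ B₁)
    (h₂ : PMF.renyiDiv a p₂ q₂ ≤ B₂) :
    PMF.renyiDiv a (p₁.bind fun x => Prod.mk x <$> p₂)
      (q₁.bind fun x => Prod.mk x <$> q₂) ≤ B₁ * B₂ := by
  rw [PMF.renyiDiv_prod a ha]
  exact mul_le_mul' h₁ h₂
```

Iterating this over n independent coordinates gives R_a(p^⊗n ‖ q^⊗n) = R_a(p ‖ q)^n.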

Probability Preservation #

theorem PMF.renyiDiv_apply_bound {α : Type u_1} (a : ℝ) (ha : 1 < a) (p q : PMF α) (x : α) :
p x ^ (a / (a - 1)) / PMF.renyiDiv a p q ≤ q x

Pointwise probability preservation: for a single element x, p(x)^{a/(a-1)} / R_a(p ‖ q) ≤ q(x).

Proof: from p(x)^a * q(x)^{1-a} ≤ M_a(p ‖ q) (one term of the sum) we get p(x)^a ≤ M_a * q(x)^{a-1}, then take 1/(a-1) powers.

theorem PMF.renyiDiv_prob_bound {α : Type u_1} (a : ℝ) (ha : 1 < a) (p q : PMF α) (E : α → Prop) [DecidablePred E] :
(∑' (x : α), if E x then p x else 0) ^ (a / (a - 1)) / PMF.renyiDiv a p q ≤ ∑' (x : α), if E x then q x else 0

Probability preservation bound: for any event E, q(E) ≥ p(E)^{a/(a-1)} / R_a(p ‖ q).

This is the key lemma used in security reductions involving Rényi divergence: if R_a(p ‖ q) is small, then events that are likely under p remain likely under q.
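An illustrative instantiation (the numbers here are assumptions for the example, not statements from the file): take a = 2, so a/(a-1) = 2. If an adversary's success event E has p(E) ≥ ε and R₂(p ‖ q) ≤ B, then

```latex
q(E) \;\ge\; \frac{p(E)^{a/(a-1)}}{R_a(p \parallel q)}
     \;=\; \frac{p(E)^2}{R_2(p \parallel q)}
     \;\ge\; \frac{\varepsilon^2}{B},
```

so success probability degrades only polynomially (quadratically, at order 2) when the real distribution p is swapped for the ideal distribution q.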

From Relative Error to Rényi Divergence #

theorem PMF.renyiMGF_le_of_pointwise_le {α : Type u_1} (a : ℝ) (ha : 1 < a) (p q : PMF α) (δ : ENNReal) (h : ∀ (x : α), p x ≤ (1 + δ) * q x) :
PMF.renyiMGF a p q ≤ (1 + δ) ^ (a - 1)

If the pointwise relative error between two PMFs is bounded by δ, then the Rényi MGF is bounded. Specifically, if for all x, p(x) ≤ (1 + δ) · q(x), then M_a(p ‖ q) ≤ (1 + δ)^(a-1).

This is used to convert floating-point error bounds into Rényi divergence bounds.
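An illustrative calculation (the numbers are assumptions, not from the file): a per-sample relative error δ gives a per-sample divergence bound of 1 + δ, and combining with multiplicativity over n independent samples yields

```latex
R_a\bigl(p^{\otimes n} \,\bigl\|\, q^{\otimes n}\bigr) \;\le\; (1 + \delta)^n \;\le\; e^{\delta n},
```

which stays below e as long as n ≤ 1/δ. For example, a rounding error of δ = 2⁻⁵³ (double-precision unit roundoff) tolerates on the order of 2⁵³ samples before the divergence bound degrades.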

theorem PMF.renyiDiv_le_of_pointwise_le {α : Type u_1} (a : ℝ) (ha : 1 < a) (p q : PMF α) (δ : ENNReal) (h : ∀ (x : α), p x ≤ (1 + δ) * q x) :
PMF.renyiDiv a p q ≤ 1 + δ

Rényi divergence bound from pointwise relative error.

Divergence to TV Distance #

theorem PMF.etvDist_le_of_maxDiv {α : Type u_1} (p q : PMF α) :
p.etvDist q ≤ 1 - (p.maxDiv q)⁻¹

Max-divergence bounds TV distance: ‖p - q‖_TV ≤ 1 - 1 / D_∞(p ‖ q).

When D = maxDiv p q = ⨆ x, p(x)/q(x) is finite, we have q(x) ≥ p(x)/D pointwise, so ∑ min(p(x), q(x)) ≥ 1/D, giving TV(p,q) = 1 - ∑ min(p(x),q(x)) ≤ 1 - 1/D.

theorem PMF.etvDist_sq_le_of_renyiDiv {α : Type u_1} (a : ℝ) (ha : 1 < a) (p q : PMF α) :
p.etvDist q ^ 2 ≤ 1 - (PMF.renyiDiv a p q)⁻¹

Finite-order Rényi divergence bounds TV distance (squared form): TV(p, q)² ≤ 1 - R_a(p ‖ q)⁻¹.

Proof sketch (not yet formalized):

1. TV ≤ √(1 - BC²) where BC = ∑ √(p(x) q(x)) is the Bhattacharyya coefficient (Hellinger–TV inequality, via Cauchy–Schwarz on |√p - √q| · |√p + √q|).
2. BC = M_{1/2}(p ‖ q), the Rényi MGF at order 1/2.
3. By log-convexity of α ↦ M_α (interpolating at 1/2, 1, α): BC ≥ R_α^{-1/2}, so BC² ≥ R_α⁻¹, giving TV² ≤ 1 - R_α⁻¹.

Formalizing this requires Hellinger distance and log-convexity of the Rényi MGF, neither of which is currently available.

Compare etvDist_le_of_maxDiv (the analogous linear bound for max-divergence).