Basics of Coding Theory #
We define a general code C to be a subset of n → R for some finite index set n and some
target type R.
We can then specialize this notion to various settings. For [CommSemiring R], we define a linear
code to be a linear subspace of n → R. We also define the notion of generator matrix and
(parity) check matrix.
Naming conventions #
- suffix
': computable/instantiation of the corresponding mathematical generic definitions without such suffix (e.g.Δ₀'(u, C)vsΔ₀(u, C),δᵣ'(u, C)vsδᵣ(u, C), ...) - NOTE: The generic (non-suffixed) definitions (Δ₀,δᵣ, ...) are recommended to be used in generic security statements, and the suffixed definitions (Δ₀',δᵣ', ...) are used for proofs or in statements of lemmas that need smaller value range. - We usually prove the equality as a bridge from the suffixed definitions into the non-suffixed definitions (e.g.distFromCode'_eq_distFromCode, ...)
Main Definitions #
- Distance between two words:
-
hammingDist u v (Δ₀(u, v)): The Hamming distance between two wordsuandv-relHammingDist u v (δᵣ(u, v)): The relative Hamming distance between two wordsuandv - Distance of code:
-
dist C (‖"C‖₀): The Hamming distance of a codeC, defined as the infimum (inℕ∞) of the Hamming distances between any two distinct elements ofC. This is noncomputable. - Distance from a word to a code:
-
distFromCode u C (Δ₀(u, C)): The hamming distance from a worduto a codeCdistFromCode_of_empty:Δ₀(u, ∅) = ⊤distFromCode_eq_top_iff_empty:Δ₀(u, C) = ⊤ ↔ C = ∅-distFromCode' u C (Δ₀'(u, C)): A computable version ofdistFromCode u C, assumingCis aFintype.distFromCode'_eq_distFromCode:Δ₀'(u, C) = Δ₀(u, C)-relDistFromCode u C (δᵣ(u, C)): The relative Hamming distance from a worduto a codeCrelDistFromCode' u C (δᵣ'(u, C)): A computable version ofrelDistFromCode u C, assumingCis aFintypeandCis non-empty.relDistFromCode'_eq_relDistFromCode:δᵣ'(u, C) = δᵣ(u, C)
- Switching between different distance realms:
-
relDistFromCode_eq_distFromCode_div:δᵣ(u, C) = Δ₀(u, C) / |ι|-pairDist_eq_distFromCode_iff_eq_relDistFromCode_div:Δ₀(u, v) = Δ₀(u, C) ↔ δᵣ(u, v) = δᵣ(u, C)-relDistFromCode_le_relDist_to_mem:δᵣ(u, C) ≤ δᵣ(u, v)-relCloseToCode_iff_relCloseToCodeword_of_minDist:δᵣ(u, C) ≤ δ ↔ ∃ v ∈ C, δᵣ(u, v) ≤ δ-pairRelDist_le_iff_pairDist_le:(δᵣ(u, v) ≤ δ) ↔ (Δ₀(u, v) ≤ Nat.floor (δ * Fintype.card ι))-distFromCode_le_iff_relDistFromCode_le:Δ₀(u, C) ≤ e ↔ δᵣ(u, C) ≤ (e : ℝ≥0) / (Fintype.card ι : ℝ≥0)-relDistFromCode_le_iff_distFromCode_le:δᵣ(u, C) ≤ δ ↔ Δ₀(u, C) ≤ Nat.floor (δ * Fintype.card ι)-relCloseToWord_iff_exists_possibleDisagreeCols-relCloseToWord_iff_exists_agreementCols-relDist_floor_bound_iff_complement_bound-distFromCode_le_dist_to_mem:Δ₀(u, C) ≤ Δ₀(u, v), given v ∈ C-distFromCode_le_card_index_of_Nonempty:Δ₀(u, C) ≤ |ι|, given C is non-empty - Unique decoding radius:
-
uniqueDecodingRadius C (UDR(C)): The unique decoding radius of a codeC-relativeUniqueDecodingRadius C (relUDR(C)): The relative unique decoding radius of a codeC-UDR_close_iff_exists_unique_close_codeword:Δ₀(u, C) ≤ UDR(C) ↔ ∃! v ∈ C, Δ₀(u, v) ≤ UDR(C)-UDRClose_iff_two_mul_proximity_lt_d_UDR:e ≤ UDR(C) ↔ 2 * e < ‖C‖₀-eq_of_le_uniqueDecodingRadius-UDR_close_iff_relURD_close:Δ₀(u, C) ≤ UDR(C) ↔ δᵣ(u, C) ≤ relUDR(C)-dist_le_UDR_iff_relDist_le_relUDR:e ≤ UDR(C) ↔ (e : ℝ≥0) / (Fintype.card ι : ℝ≥0) ≤ relUDR(C)
We define the block length, rate, and distance of C. We prove simple properties of linear codes
such as the singleton bound.
TODOs #
- Implement
ENNRat (ℚ≥0∞), for usage inrelDistFromCodeandrelDistFromCode', as counterpart ofENat (ℕ∞)indistFromCodeanddistFromCode'.
The Hamming distance of a code C is the minimum Hamming distance between any two distinct
elements of the code.
We formalize this as the infimum sInf over all d : ℕ such that there exist u v : n → R in the
code with u ≠ v and hammingDist u v ≤ d. If none exists, then we define the distance to be 0.
Equations
Instances For
Equations
Equations
Equations
Instances For
Equations
Instances For
The distance from a vector u to a code C is the minimum Hamming distance between u
and any element of C.
Equations
Instances For
Equations
Instances For
A non-trivial code (a code with at least two distinct codewords) must have a minimum distance greater than 0.
Equations
Instances For
Computable version of the Hamming distance of a code C, assuming C is a Fintype.
The return type is ℕ∞ since we use Finset.min.
Equations
Instances For
The set of possible distances δf between distinct codewords in a code C.
- TODO: This allows us to express distance in non-ℝ, which is quite convenient.
Extending to
(E)Distforces this intoℝ; give some thought.
Equations
Instances For
A generalisation of distFromCode for an arbitrary distance function δf.
Equations
Instances For
Computable version of the distance from a vector u to a finite code C.
Equations
Instances For
Equations
Instances For
δᵣ(u,v) denotes the relative Hamming distance between vectors u and v.
Equations
Instances For
The relative Hamming distance from a vector to a code, defined as the infimum
of all relative distances from u to codewords in C.
The type is ENNReal (ℝ≥0∞) to correctly handle the case C = ∅.
For case of Nonempty C, we can use relDistFromCode (δᵣ') for smaller value range in
ℚ≥0, which is equal to this definition.
Equations
Instances For
δᵣ(u,C) denotes the relative distance from u to C. This is the main standard definition
used in statements. The NNRat version of it is δᵣ'(u, C).
Equations
Instances For
This follows proof strategy from exists_closest_codeword_of_Nonempty_Code. However, it's NOT
used to construct pickRelClosestCodeword_of_Nonempty_Code.
Note that this gives the same codeword as pickClosestCodeword_of_Nonempty_Code.
Equations
Instances For
Relative distance version of closeToCode_iff_closeToCodeword_of_minDist.
If the distance to a code is at most δ, then there exists a codeword within distance δ.
NOTE: can we make this shorter using relDistFromCode_eq_distFromCode_div?
A word u is close to a code C within an absolute error bound e if and only if
it is close within the equivalent relative error bound e / n.
A word u is relatively close to a code C within an relative error bound δ if and only if
it is relatively close within the equivalent absolute error bound ⌊δ * n⌋.
The equivalence between the two lowerbound of upperBound in Nat and NNReal context.
In which, upperBound is viewed as the size of an agreement set S (e.g. between two words,
or between a word to a code, ...).
Specifically, n - ⌊δ * n⌋ ≤ (upperBound : ℕ) ↔ (1 - δ) * n ≤ (upperBound : ℝ≥0).
This lemma is useful for jumping back-and-forth between absolute distance and relative distance
realms, especially when we work with an agreement set. For example, it can be used after simping
with closeToWord_iff_exists_agreementCols and relCloseToWord_iff_exists_agreementCols.
The range of the relative Hamming distance is well-defined.
The range of the relative Hamming distance function is finite.
The set of possible distances between distinct codewords in a code.
Equations
Instances For
The set of possible distances between distinct codewords in a code is a subset of the range of the relative Hamming distance function.
The set of possible distances between distinct codewords in a code is a finite set.
The minimum relative Hamming distance of a code.
Equations
Instances For
δᵣ C denotes the minimum relative Hamming distance of a code C.
Equations
Instances For
The range set of possible relative Hamming distances from a vector to a code is a subset of the range of the relative Hamming distance function.
The set of possible relative Hamming distances from a vector to a code is a finite set.
Equations
Computable version of the relative Hamming distance from a vector w to a finite
non-empty code C. This one is intended to mimic the definition of distFromCode'.
However, we don't have ENNRat (ℚ≥0∞) (as counterpart of ENat (ℕ∞) in distFromCode')
so we require [Nonempty C].
TODO: define ENNRat (ℚ≥0∞) so we can migrate both relDistFromCode
and relDistFromCode' to ℚ≥0∞
Equations
Instances For
δᵣ'(w,C) denotes the relative Hamming distance between a vector w and a code C.
This is a different statement of the generic definition δᵣ(w,C).
Equations
Instances For
The unique decoding radius: ≤ ⌊(d-1)/2⌋ for any code C.
Equations
Instances For
Alias of Code.uniqueDecodingRadius.
The unique decoding radius: ≤ ⌊(d-1)/2⌋ for any code C.
Equations
Instances For
The relative unique decoding radius, obtained from the absolute radius by normalizing with the
block length. This also works with ≤.
Equations
Instances For
Alias of Code.relativeUniqueDecodingRadius.
The relative unique decoding radius, obtained from the absolute radius by normalizing with the
block length. This also works with ≤.
Equations
Instances For
Given an error/proximity parameter e within the unique decoding radius of a code C where
‖C‖₀ > 0, this lemma proves the standard bound 2 * e < d
(i.e. condition of Code.eq_of_lt_dist).
A stronger version of distFromCode_eq_of_lt_half_dist:
If two codewords v and w are both within the uniqueDecodingRadius of
u (i.e. 2 * Δ₀(u, v) < ‖C‖₀ and 2 * Δ₀(u, w) < ‖C‖₀), then they must be equal.
A word u is within the uniqueDecodingRadius of a code C if and only if
there exists exactly one codeword v in C that is that close.
A word u is close to a code C within the absolute unique decoding radius
if and only if it is close within the relative unique decoding radius.
Singleton bound for arbitrary codes
A ModuleCode ι F A is an F-linear code of length indexed by ι over the alphabet A,
defined as an F-submodule of ι → A.
Equations
Instances For
Module code defined by left multiplication by its generator matrix. For a matrix G : Matrix κ ι F (over field F) and module A over F, this generates the F-submodule of (ι → A) spanned by the rows of G acting on (κ → A). The matrix acts on vectors v : κ → A by: (G • v)(i) = ∑ k, G k i • v k where G k i : F is the scalar and v k : A is the module element.
Equations
Instances For
Linear code defined by right multiplication by a generator matrix.
Equations
Instances For
Define a linear code from its (parity) check matrix
Equations
Instances For
The Hamming distance of a linear code can also be defined as the minimum Hamming norm of a non-zero vector in the code
Equations
Instances For
The dimension of a linear code.
Equations
Instances For
The length of a linear code.
Equations
Instances For
The rate of a linear code.
Equations
Instances For
ρ LC is the rate of the linear code LC.
Uses & to make the notation non-reserved, allowing ρ to also be used as a variable name.
Equations
Instances For
The minimum taken over the weight of codewords in a linear code.
Equations
Instances For
The Hamming distance between codewords equals to the weight of their difference.
The min distance of a linear code equals the minimum of the weights of non-zero codewords.
Singleton bound for linear codes
A computable version of the Hamming distance of a module code MC.