Sub-Gaussian random variables #
This presentation of sub-Gaussian random variables is inspired by section 2.5 of
[vershynin2018high]. Let X be a random variable. Consider the following five properties, in which
Kᵢ are positive reals,
- (i) for all
t ≥ 0,ℙ(|X| ≥ t) ≤ 2 * exp(-t^2 / K₁^2), - (ii) for all
p : ℕwith1 ≤ p,𝔼[|X|^p]^(1/p) ≤ K₂ sqrt(p), - (iii) for all
|t| ≤ 1/K₃,𝔼[exp (t^2 * X^2)] ≤ exp (K₃^2 * t^2), - (iv)
𝔼[exp(X^2 / K₄)] ≤ 2, - (v) for all
t : ℝ,𝔼[exp (t * X)] ≤ exp (K₅ * t^2 / 2).
Properties (i) to (iv) are equivalent, in the sense that there exists a constant C such that
if X satisfies one of those properties with constant K, then it satisfies any other one with
constant at most CK.
If 𝔼[X] = 0 then properties (i)-(iv) are equivalent to (v) in that same sense.
Property (v) implies that X has expectation zero.
The name sub-Gaussian is used by various authors to refer to any one of (i)-(v). We will say that a
random variable has sub-Gaussian moment generating function (mgf) with constant K₅ to mean that
property (v) holds with that constant. The function exp (K₅ * t^2 / 2) which appears in
property (v) is the mgf of a Gaussian with variance K₅.
That property (v) is the most convenient one to work with if one wants to prove concentration
inequalities using Chernoff's method.
TODO: implement definitions for (i)-(iv) when it makes sense. For example the maximal constant K₄
such that (iv) is true is an Orlicz norm. Prove relations between those properties.
Conditionally sub-Gaussian random variables and kernels #
A related notion to sub-Gaussian random variables is that of conditionally sub-Gaussian random
variables. A random variable X is conditionally sub-Gaussian in the sense of (v) with respect to
a sigma-algebra m and a measure μ if for all t : ℝ, exp (t * X) is μ-integrable and
the conditional mgf of X conditioned on m is almost surely bounded by exp (c * t^2 / 2)
for some constant c.
As in other parts of Mathlib's probability library (notably the independence and conditional independence definitions), we express both sub-Gaussian and conditionally sub-Gaussian properties as special cases of a notion of sub-Gaussianity with respect to a kernel and a measure.
Main definitions #
Kernel.HasSubgaussianMGF: a random variableXhas a sub-Gaussian moment generating function with parametercwith respect to a kernelκand a measureνif forν-almost allω', for allt : ℝ, the moment generating function ofXwith respect toκ ω'is bounded byexp (c * t ^ 2 / 2).HasCondSubgaussianMGF: a random variableXhas a conditionally sub-Gaussian moment generating function with parametercwith respect to a sigma-algebramand a measureμif for allt : ℝ,exp (t * X)isμ-integrable and the moment generating function ofXconditioned onmis almost surely bounded byexp (c * t ^ 2 / 2)for allt : ℝ. The actual definition usesKernel.HasSubgaussianMGF:HasCondSubgaussianMGFis defined as sub-Gaussian with respect to the conditional expectation kernel formand the restriction ofμto the sigma-algebram.HasSubgaussianMGF: a random variableXhas a sub-Gaussian moment generating function with parametercwith respect to a measureμif for allt : ℝ,exp (t * X)isμ-integrable and the moment generating function ofXis bounded byexp (c * t ^ 2 / 2)for allt : ℝ. This is equivalent toKernel.HasSubgaussianMGFwith a constant kernel. SeeHasSubgaussianMGF_iff_kernel.
Main statements #
measure_sum_ge_le_of_iIndepFun: Hoeffding's inequality for sums of independent sub-Gaussian random variables.measure_sum_ge_le_of_HasCondSubgaussianMGF: the Azuma-Hoeffding inequality for sub-Gaussian random variables.
Implementation notes #
Definition of Kernel.HasSubgaussianMGF #
The definition of sub-Gaussian with respect to a kernel and a measure is the following:
structure Kernel.HasSubgaussianMGF (X : Ω → ℝ) (c : ℝ≥0)
(κ : Kernel Ω' Ω) (ν : Measure Ω' := by volume_tac) : Prop where
integrable_exp_mul : ∀ t, Integrable (fun ω ↦ exp (t * X ω)) (κ ∘ₘ ν)
mgf_le : ∀ᵐ ω' ∂ν, ∀ t, mgf X (κ ω') t ≤ exp (c * t ^ 2 / 2)
An interesting point is that the integrability condition is not integrability of exp (t * X)
with respect to κ ω' for ν-almost all ω', but integrability with respect to κ ∘ₘ ν.
This is a stronger condition, as the weaker one did not allow to prove interesting results about
the sum of two sub-Gaussian random variables.
For the conditional case, that integrability condition reduces to integrability of exp (t * X)
with respect to μ.
Definition of HasCondSubgaussianMGF #
We define HasCondSubgaussianMGF as a special case of Kernel.HasSubgaussianMGF with the
conditional expectation kernel for m, condExpKernel μ m, and the restriction of μ to m,
μ.trim hm (where hm states that m is a sub-sigma-algebra).
Note that condExpKernel μ m ∘ₘ μ.trim hm = μ. The definition is equivalent to the two
conditions
- for all
t,exp (t * X)isμ-integrable, - for
μ.trim hm-almost allω, for allt, the mgf with respect to the the conditional distributioncondExpKernel μ m ωis bounded byexp (c * t ^ 2 / 2).
For any t, we can write the mgf of X with respect to the conditional expectation kernel as
a conditional expectation, (μ.trim hm)-almost surely:
mgf X (condExpKernel μ m ·) t =ᵐ[μ.trim hm] μ[fun ω' ↦ exp (t * X ω') | m].
References #
- [R. Vershynin, High-dimensional probability: An introduction with applications in data science][vershynin2018high]
Sub-Gaussian with respect to a kernel and a measure #
A random variable X has a sub-Gaussian moment generating function with parameter c
with respect to a kernel κ and a measure ν if for ν-almost all ω', for all t : ℝ,
the moment generating function of X with respect to κ ω' is bounded by exp (c * t ^ 2 / 2).
This implies in particular that X has expectation 0.
- integrable_exp_mul (t : ℝ) : MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (t * X ω)) (MeasureTheory.Measure.bind ν ⇑κ)
Instances For
Chernoff bound on the right tail of a sub-Gaussian random variable.
For ν : Measure Ω', κ : Kernel Ω' Ω and η : (Ω' × Ω) Ω'', if a random variable X : Ω → ℝ
has a sub-Gaussian mgf with respect to κ and ν and another random variable Y : Ω'' → ℝ has
a sub-Gaussian mgf with respect to η and ν ⊗ₘ κ : Measure (Ω' × Ω), then X + Y (random
variable on the measurable space Ω × Ω'') has a sub-Gaussian mgf with respect to
κ ⊗ₖ η : Kernel Ω' (Ω × Ω'') and ν.
For ν : Measure Ω', κ : Kernel Ω' Ω and η : Ω Ω'', if a random variable X : Ω → ℝ
has a sub-Gaussian mgf with respect to κ and ν and another random variable Y : Ω'' → ℝ has
a sub-Gaussian mgf with respect to η and κ ∘ₘ ν : Measure Ω, then X + Y (random
variable on the measurable space Ω × Ω'') has a sub-Gaussian mgf with respect to
κ ⊗ₖ prodMkLeft Ω' η : Kernel Ω' (Ω × Ω'') and ν.
Conditionally sub-Gaussian moment generating function #
A random variable X has a conditionally sub-Gaussian moment generating function
with parameter c with respect to a sigma-algebra m and a measure μ if for all t : ℝ,
exp (t * X) is μ-integrable and the moment generating function of X conditioned on m is
almost surely bounded by exp (c * t ^ 2 / 2) for all t : ℝ.
This implies in particular that X has expectation 0.
The actual definition uses Kernel.HasSubgaussianMGF: HasCondSubgaussianMGF is defined as
sub-Gaussian with respect to the conditional expectation kernel for m and the restriction of μ
to the sigma-algebra m.
Equations
- ProbabilityTheory.HasCondSubgaussianMGF m hm X c μ = ProbabilityTheory.Kernel.HasSubgaussianMGF X c (ProbabilityTheory.condExpKernel μ m) (μ.trim hm)
Instances For
Sub-Gaussian moment generating function #
A random variable X has a sub-Gaussian moment generating function with parameter c
with respect to a measure μ if for all t : ℝ, exp (t * X) is μ-integrable and
the moment generating function of X is bounded by exp (c * t ^ 2 / 2) for all t : ℝ.
This implies in particular that X has expectation 0.
This is equivalent to Kernel.HasSubgaussianMGF X c (Kernel.const Unit μ) (Measure.dirac ()),
as proved in HasSubgaussianMGF_iff_kernel.
Properties about sub-Gaussian moment generating functions should be proved first for
Kernel.HasSubgaussianMGF when possible.
- integrable_exp_mul (t : ℝ) : MeasureTheory.Integrable (fun (ω : Ω) => Real.exp (t * X ω)) μ
Instances For
Chernoff bound on the right tail of a sub-Gaussian random variable.
Hoeffding inequality for sub-Gaussian random variables.
Hoeffding inequality for sub-Gaussian random variables.
If X is sub-Gaussian with parameter cX with respect to the restriction of μ to
a sub-sigma-algebra m and Y is conditionally sub-Gaussian with parameter cY with respect to
m and μ then X + Y is sub-Gaussian with parameter cX + cY with respect to μ.
HasSubgaussianMGF X cX (μ.trim hm) can be obtained from HasSubgaussianMGF X cX μ if X is
m-measurable. See HasSubgaussianMGF.trim.
Let Y be a random process adapted to a filtration ℱ, such that for all i : ℕ, Y i is
conditionally sub-Gaussian with parameter cY i with respect to ℱ (i - 1).
In particular, n ↦ ∑ i ∈ range n, Y i is a martingale.
Then the sum ∑ i ∈ range n, Y i is sub-Gaussian with parameter ∑ i ∈ range n, cY i.
Azuma-Hoeffding inequality for sub-Gaussian random variables.