Geometric discrepancy: An illustrated guide
Main results: EpsilonNetTheorem, ShatterFunctionLemma, PackingLemma, ShortEdgeLemma
1Overview
All Entries
- def DualShatterFunction
- def EpsilonApproximation
- def EpsilonNet
- def ShatterFunction
- lem ShortEdgeLemma
- def VCDimension
- lem EasyEpsilonNetLemma
- lem EpsilonApproximationBound
- thm EpsilonNetTheorem
- lem PackingLemma
- lem ShatterFunctionLemma
- lem UnitDistanceGraphEdges
2 Main Results
Preliminaries
Let \((X, \mathcal{S})\) be a set system, where \(\mathcal{S} \subseteq 2^X\). The primal shatter function \(\pi_{\mathcal{S}} \colon \mathbb{N} \to \mathbb{N}\) is defined by \[\pi_{\mathcal{S}}(m) = \max_{Y \subseteq X,\; |Y| = m} |\mathcal{S}|_Y|,\] where \(\mathcal{S}|_Y = \{ S \cap Y : S \in \mathcal{S} \}\) is the trace of \(\mathcal{S}\) on \(Y\).
| 1 | import Mathlib.Combinatorics.SetFamily.Shatter |
| 2 | |
| 3 | namespace Catalog.VC.ShatterFunction |
| 4 | |
| 5 | variable {α : Type*} [DecidableEq α] |
| 6 | |
| 7 | /-- The trace (restriction) of a set family `𝒜` on a subset `Y`: |
| 8 | the family `{ S ∩ Y | S ∈ 𝒜 }` of distinct intersections. -/ |
| 9 | def trace (𝒜 : Finset (Finset α)) (Y : Finset α) : Finset (Finset α) := |
| 10 | 𝒜.image (· ∩ Y) |
| 11 | |
| 12 | /-- The primal shatter function `π_𝒜(m)`: the maximum number of distinct |
| 13 | traces of `𝒜` on any `m`-element subset of the ground set. (Def 5.1) -/ |
| 14 | def shatterFun [Fintype α] (𝒜 : Finset (Finset α)) (m : ℕ) : ℕ := |
| 15 | (Finset.univ.filter (fun Y : Finset α => Y.card = m)).sup |
| 16 | (fun Y => (trace 𝒜 Y).card) |
| 17 | |
| 18 | end Catalog.VC.ShatterFunction |
| 19 |
Let \((X, \mathcal{S})\) be a set system. A subset \(A \subseteq X\) is shattered by \(\mathcal{S}\) if \(\mathcal{S}|_A = 2^A\), that is, every subset of \(A\) is of the form \(S \cap A\) for some \(S \in \mathcal{S}\).
The VC-dimension of \(\mathcal{S}\), denoted \(\dim(\mathcal{S})\), is the supremum of the sizes of all finite shattered subsets of \(X\). Equivalently, \[\dim(\mathcal{S}) = \sup\{ m : \pi_{\mathcal{S}}(m) = 2^m \}.\]
| 1 | import Mathlib.Combinatorics.SetFamily.Shatter |
| 2 | |
| 3 | namespace Catalog.VC.VCDimension |
| 4 | |
| 5 | /-- A finite set `s` is shattered by family `𝒜` if every subset of `s` |
| 6 | is realized as `s ∩ S` for some `S ∈ 𝒜`. |
| 7 | Wraps Mathlib's `Finset.Shatters`. (Def 5.8) -/ |
| 8 | abbrev Shatters {α : Type*} [DecidableEq α] |
| 9 | (𝒜 : Finset (Finset α)) (s : Finset α) : Prop := |
| 10 | 𝒜.Shatters s |
| 11 | |
| 12 | /-- The VC-dimension of `𝒜`: the maximum cardinality of a set shattered |
| 13 | by `𝒜`. Equivalently, `sup { m : π_𝒜(m) = 2^m }`. |
| 14 | Wraps Mathlib's `Finset.vcDim`. (Def 5.8) -/ |
| 15 | abbrev vcDim {α : Type*} [DecidableEq α] |
| 16 | (𝒜 : Finset (Finset α)) : ℕ := |
| 17 | 𝒜.vcDim |
| 18 | |
| 19 | end Catalog.VC.VCDimension |
| 20 |
EpsilonNetTheorem
Let \((X, \mathcal{S})\) be a set system and \(\mu\) a probability measure on \(X\). A subset \(N \subseteq X\) is an \(\varepsilon\)-net for \((X, \mathcal{S})\) with respect to \(\mu\) if \(N \cap S \ne \emptyset\) for every \(S \in \mathcal{S}\) with \(\mu(S) \ge \varepsilon\).
When \(X\) is finite and \(\mu\) is the counting measure (\(\mu(A) = |A|/|X|\)), this says that \(N\) intersects every set \(S \in \mathcal{S}\) with \(|S| \ge \varepsilon |X|\).
| 1 | import Mathlib.Combinatorics.SetFamily.Shatter |
| 2 | import Mathlib.Analysis.SpecialFunctions.Log.Basic |
| 3 | |
| 4 | namespace Catalog.VC.EpsilonNet |
| 5 | |
| 6 | variable {α : Type*} [DecidableEq α] [Fintype α] |
| 7 | |
| 8 | /-- `N` is an `ε`-net for set family `𝒜` under the uniform distribution |
| 9 | on the finite ground set `α`. Every `S ∈ 𝒜` whose density |
| 10 | `|S| / |α| ≥ ε` satisfies `N ∩ S ≠ ∅`. -/ |
| 11 | def IsEpsilonNet (𝒜 : Finset (Finset α)) (ε : ℝ) (N : Finset α) : Prop := |
| 12 | ∀ S ∈ 𝒜, ε * (Fintype.card α : ℝ) ≤ (S.card : ℝ) → (N ∩ S).Nonempty |
| 13 | |
| 14 | end Catalog.VC.EpsilonNet |
| 15 |
For every \(d \ge 1\) there exists a constant \(C(d)\) such that the following holds. If \(X\) is a nonempty set with a probability measure \(\mu\), \(\mathcal{S}\) is a system of \(\mu\)-measurable subsets of \(X\) with \(\dim(\mathcal{S}) \le d\) (Definition VCDimension), and \(r \ge 2\), then there exists a \(\frac{1}{r}\)-net (Definition EpsilonNet) for \((X, \mathcal{S})\) with respect to \(\mu\) of size at most \(C(d)\, r \ln r\).
| 1 | import Catalog.VC.VCDimension.Contract |
| 2 | import Catalog.VC.EpsilonNet.Contract |
| 3 | import Mathlib.Analysis.SpecialFunctions.Log.Basic |
| 4 | |
| 5 | namespace Catalog.VC.EpsilonNetTheorem |
| 6 | |
| 7 | /-- For every VC-dimension bound `d`, there exists a constant `C` such that |
| 8 | any set family with VC-dimension at most `d` admits a `(1/r)`-net of size |
| 9 | at most `C · r · ln r`, for all `r ≥ 2`. (Theorem 5.12) -/ |
| 10 | axiom epsilonNetTheorem (d : ℕ) : ∃ C : ℝ, 0 < C ∧ |
| 11 | ∀ (α : Type*) [DecidableEq α] [Fintype α] [Nonempty α] (𝒜 : Finset (Finset α)), |
| 12 | 𝒜.vcDim ≤ d → |
| 13 | ∀ (r : ℝ), 2 ≤ r → |
| 14 | ∃ N : Finset α, Catalog.VC.EpsilonNet.IsEpsilonNet 𝒜 (1 / r) N ∧ |
| 15 | (N.card : ℝ) ≤ C * r * Real.log r |
| 16 | |
| 17 | end Catalog.VC.EpsilonNetTheorem |
| 18 |
ShatterFunctionLemma
For any set system \(\mathcal{S}\) of VC-dimension at most \(d\) (Definition VCDimension), the primal shatter function (Definition ShatterFunction) satisfies \[\pi_{\mathcal{S}}(m) \le \Phi_d(m) \quad \text{for all } m,\] where \[\Phi_d(m) = \binom{m}{0} + \binom{m}{1} + \cdots + \binom{m}{d}.\]
| 1 | import Catalog.VC.ShatterFunction.Contract |
| 2 | import Catalog.VC.VCDimension.Contract |
| 3 | |
| 4 | namespace Catalog.VC.ShatterFunctionLemma |
| 5 | |
| 6 | /-- The binomial sum `Φ_d(m) = ∑_{k=0}^{d} C(m, k)`. Upper bound for the |
| 7 | shatter function of a family with VC-dimension at most `d`. -/ |
| 8 | def binomialSum (d m : ℕ) : ℕ := |
| 9 | (Finset.range (d + 1)).sum (fun k => m.choose k) |
| 10 | |
| 11 | /-- The shatter function of a family with VC-dimension at most `d` is bounded |
| 12 | by the binomial sum: `π_𝒜(m) ≤ Φ_d(m)` for all `m`. (Lemma 5.9) -/ |
| 13 | axiom shatterFunctionLemma {α : Type*} [DecidableEq α] [Fintype α] |
| 14 | (𝒜 : Finset (Finset α)) {d : ℕ} (hd : 𝒜.vcDim ≤ d) (m : ℕ) : |
| 15 | Catalog.VC.ShatterFunction.shatterFun 𝒜 m ≤ binomialSum d m |
| 16 | |
| 17 | end Catalog.VC.ShatterFunctionLemma |
| 18 |
PackingLemma
Let \(d > 1\) and \(C\) be constants, and let \((X, \mathcal{S})\) be a set system on an \(n\)-point set whose primal shatter function (Definition ShatterFunction) satisfies \(\pi_{\mathcal{S}}(m) \le C m^d\) for all \(m = 1, 2, \ldots, n\). Let \(\delta\) be an integer, \(1 \le \delta \le n\), and let \(\mathcal{P} \subseteq \mathcal{S}\) be \(\delta\)-separated: any two distinct sets of \(\mathcal{P}\) satisfy \(|S_1 \mathbin{\triangle} S_2| > \delta\). Then \[|\mathcal{P}| = O\!\left(\left(\frac{n}{\delta}\right)^{\!d}\right).\]
| 1 | import Catalog.VC.ShatterFunction.Contract |
| 2 | import Mathlib.Analysis.SpecialFunctions.Pow.Real |
| 3 | |
| 4 | namespace Catalog.VC.PackingLemma |
| 5 | |
| 6 | variable {α : Type*} [DecidableEq α] |
| 7 | |
| 8 | /-- A subfamily `𝒫` is `δ`-separated if any two distinct members have |
| 9 | symmetric difference of size strictly greater than `δ`. -/ |
| 10 | def IsSeparated (P : Finset (Finset α)) (δ : ℕ) : Prop := |
| 11 | ∀ S ∈ P, ∀ S' ∈ P, S ≠ S' → δ < (S \ S' ∪ (S' \ S)).card |
| 12 | |
| 13 | /-- Packing lemma: if the shatter function satisfies `π_𝒜(m) ≤ C · m^d` |
| 14 | and `𝒫 ⊆ 𝒜` is `δ`-separated, then `|𝒫| ≤ C' · (n/δ)^d`. |
| 15 | The constant `C'` depends only on `C` and `d`. (Lemma 5.14) -/ |
| 16 | axiom packingLemma (d : ℕ) (C : ℝ) (hC : 0 < C) : ∃ C' : ℝ, 0 < C' ∧ |
| 17 | ∀ (α : Type*) [DecidableEq α] [Fintype α] (F P : Finset (Finset α)), |
| 18 | P ⊆ F → |
| 19 | (∀ m : ℕ, (Catalog.VC.ShatterFunction.shatterFun F m : ℝ) ≤ C * (m : ℝ) ^ d) → |
| 20 | ∀ (δ : ℕ), 0 < δ → δ ≤ Fintype.card α → |
| 21 | IsSeparated P δ → |
| 22 | (P.card : ℝ) ≤ C' * ((Fintype.card α : ℝ) / δ) ^ d |
| 23 | |
| 24 | end Catalog.VC.PackingLemma |
| 25 |
ShortEdgeLemma
Let \((X, \mathcal{S})\) be a set system. The dual shatter function \(\pi_{\mathcal{S}}^* \colon \mathbb{N} \to \mathbb{N}\) is defined by \[\pi_{\mathcal{S}}^*(m) = \max_{\mathcal{Y} \subseteq \mathcal{S},\; |\mathcal{Y}| = m} \bigl|\bigl\{ \{S \in \mathcal{Y} : x \in S\} : x \in X \bigr\}\bigr|.\] Equivalently, \(\pi_{\mathcal{S}}^*(m)\) is the maximum number of nonempty cells in the Venn diagram of \(m\) sets chosen from \(\mathcal{S}\).
| 1 | import Mathlib.Combinatorics.SetFamily.Shatter |
| 2 | |
| 3 | namespace Catalog.VC.DualShatterFunction |
| 4 | |
| 5 | variable {α : Type*} [DecidableEq α] |
| 6 | |
| 7 | /-- The atom map of a subfamily `𝒴` at point `x`: the set of members |
| 8 | of `𝒴` that contain `x`. Two points `x, y` have the same atom map |
| 9 | iff they belong to exactly the same sets of `𝒴`. -/ |
| 10 | def atomMap (𝒴 : Finset (Finset α)) (x : α) : Finset (Finset α) := |
| 11 | 𝒴.filter (x ∈ ·) |
| 12 | |
| 13 | /-- The dual shatter function `π*_𝒜(m)`: the maximum number of distinct |
| 14 | atom types induced by any `m`-element subfamily of `𝒜`. |
| 15 | Equivalently, the maximum number of nonempty cells in the Venn diagram |
| 16 | of `m` sets chosen from `𝒜`. (Def 5.2) -/ |
| 17 | def dualShatterFun [Fintype α] (𝒜 : Finset (Finset α)) (m : ℕ) : ℕ := |
| 18 | (𝒜.powerset.filter (fun 𝒴 => 𝒴.card = m)).sup |
| 19 | (fun 𝒴 => (Finset.univ.image (atomMap 𝒴)).card) |
| 20 | |
| 21 | end Catalog.VC.DualShatterFunction |
| 22 |
Let \((X, \mathcal{S})\) be a set system on an \(n\)-point set \(X\) with dual shatter function (Definition DualShatterFunction) \(\pi_{\mathcal{S}}^*(m) \le C m^d\) for all \(m\), where \(C\) and \(d > 1\) are constants. A set \(S \in \mathcal{S}\) crosses an edge \(\{u,v\}\) of a graph on \(X\) if \(|S \cap \{u,v\}| = 1\).
Then for any multiset \(Q\) of sets from \(\mathcal{S}\), there exist distinct points \(x, y \in X\) such that the number of sets of \(Q\) crossing \(\{x, y\}\) is at most \[C_2 \, \frac{|Q|}{n^{1/d}},\] where \(C_2\) depends only on \(C\) and \(d\).
| 1 | import Catalog.VC.DualShatterFunction.Contract |
| 2 | import Mathlib.Analysis.SpecialFunctions.Pow.Real |
| 3 | |
| 4 | namespace Catalog.VC.ShortEdgeLemma |
| 5 | |
| 6 | universe u |
| 7 | |
| 8 | variable {α : Type*} [DecidableEq α] |
| 9 | |
| 10 | /-- The crossing number of an edge `{x, y}` by a subfamily `Q`: the number |
| 11 | of sets in `Q` that contain exactly one of `x, y`. -/ |
| 12 | abbrev crossingNumber (Q : Finset (Finset α)) (x y : α) : ℕ := |
| 13 | (Q.filter (fun S => (x ∈ S ∧ y ∉ S) ∨ (x ∉ S ∧ y ∈ S))).card |
| 14 | |
| 15 | /-- Short edge lemma: if the dual shatter function satisfies |
| 16 | `π*_𝒜(m) ≤ C · m^d` with `d > 1`, then for any subfamily `Q ⊆ 𝒜` |
| 17 | there exist distinct points `x, y` with crossing number at most |
| 18 | `C₂ · |Q| / n^{1/d}`. (Lemma 5.18) -/ |
| 19 | axiom shortEdgeLemma (d : ℕ) (C : ℝ) (hd : 1 < d) (hC : 0 < C) : |
| 20 | ∃ C₂ : ℝ, 0 < C₂ ∧ |
| 21 | ∀ (α : Type u) [DecidableEq α] [Fintype α] (𝒜 : Finset (Finset α)), |
| 22 | (∀ m : ℕ, (Catalog.VC.DualShatterFunction.dualShatterFun 𝒜 m : ℝ) ≤ C * (m : ℝ) ^ d) → |
| 23 | 2 ≤ Fintype.card α → |
| 24 | ∀ (Q : Finset (Finset α)), Q ⊆ 𝒜 → |
| 25 | ∃ x y : α, x ≠ y ∧ |
| 26 | ((Q.filter (fun S => (x ∈ S ∧ y ∉ S) ∨ (x ∉ S ∧ y ∈ S))).card : ℝ) ≤ |
| 27 | C₂ * (Q.card : ℝ) / |
| 28 | (Fintype.card α : ℝ) ^ ((1 : ℝ) / d) |
| 29 | |
| 30 | end Catalog.VC.ShortEdgeLemma |
| 31 |