Discrete Ricci Curvature
Analyze graph geometry using Forman-Ricci and Ollivier-Ricci curvature.
Discrete Ricci curvature extends the concept of curvature from differential geometry to graphs. It reveals the local "shape" of the graph around each edge, distinguishing between clustered regions, branching points, and linear paths.
Ricci curvature is an advanced metric. Start with centrality and community detection before exploring curvature analysis.
Geometric Intuition
In differential geometry, curvature measures how space bends:
| Curvature | Shape | Graph Analogy |
|---|---|---|
| Positive | Sphere | Tight clusters, cliques |
| Zero | Flat plane | Regular grids, linear chains |
| Negative | Saddle | Tree-like branching, bridges |
Forman-Ricci Curvature
Forman-Ricci curvature is a combinatorial approximation based on the graph's local structure. It's fast to compute, making it suitable for large graphs.
Formula
For an edge e = (u, v):
Where:
- deg(u), deg(v) are the degrees of the endpoint nodes
- T(e) is the set of triangles containing edge e
Simplified Formula
Without triangle counting:
Complexity
O(E) — very fast, linear in the number of edges
Interpretation
| F(e) Value | Local Structure | In Lean Projects |
|---|---|---|
| F > 0 | Tightly clustered | Dense theorem groups |
| F ≈ 0 | Linear chain | Sequential proof steps |
| F < 0 | Branching point | Lemma used by many proofs |
Example
For an edge between two nodes with degrees 3 and 4:
This negative curvature indicates a branching structure.
Ollivier-Ricci Curvature
Ollivier-Ricci curvature uses optimal transport (Wasserstein distance) to measure how probability mass spreads from one node to another.
Formula
Where:
- W₁ is the Wasserstein-1 distance (Earth Mover's Distance)
- μₓ is the probability measure around node x
- d(x, y) is the graph distance between x and y
Probability Measure
For a node x, the probability distribution μₓ is defined as:
- μₓ(x) = α (probability of staying at x)
- μₓ(v) = (1 - α) / deg(x) for neighbors v ∈ N(x)
- μₓ(v) = 0 otherwise
Where α is the "laziness" parameter.
Parameters
| Parameter | Default | Description |
|---|---|---|
| alpha | 0.5 | Laziness parameter (higher = more local) |
| method | "OTD" | "OTD" (optimal) or "ATD" (approximate) |
Complexity
O(V × E) — slower but more accurate than Forman
Interpretation
| κ(x,y) Value | Meaning | In Lean Projects |
|---|---|---|
| κ > 0 | Mass converges | Theorems in same cluster |
| κ ≈ 0 | Mass preserved | Regular structure |
| κ < 0 | Mass diverges | Branching proofs |
Wasserstein Distance
The Wasserstein distance measures the cost of "transporting" one probability distribution to another.
Formula
Where:
- Γ(μ, ν) is the set of all couplings (transport plans)
- d(x, y) is the ground distance
For p = 1, this is the Earth Mover's Distance (EMD).
Implementation
Astrolabe uses the POT (Python Optimal Transport) library, with scipy as a fallback.
Curvature Statistics
Astrolabe computes aggregate statistics across all edges:
| Statistic | Description |
|---|---|
| min, max | Range of curvature values |
| mean | Average curvature |
| std | Standard deviation |
| median | Median curvature |
| positive_count | Edges with κ > 0 |
| negative_count | Edges with κ < 0 |
| positive_ratio | Fraction of positive curvature |
Global Interpretation
| Mean Curvature | Graph Structure |
|---|---|
| > 0.5 | Highly clustered (many cliques) |
| 0 < μ < 0.5 | Moderately clustered |
| -0.5 < μ < 0 | Tree-like structure |
| μ < -0.5 | Highly branching |
Node Curvature
Node curvature is derived by averaging the curvatures of incident edges:
Interpretation in Lean
| Node Type | Curvature | Description |
|---|---|---|
| Cluster center | High positive | Core theorem with many related lemmas |
| Chain node | Near zero | Part of a sequential proof |
| Branch point | Negative | Lemma used by diverging proofs |
| Bridge | Very negative | Connects separate modules |
Choosing Between Forman and Ollivier
| Aspect | Forman | Ollivier |
|---|---|---|
| Speed | O(E) — very fast | O(VE) — slower |
| Accuracy | Combinatorial approximation | Based on optimal transport |
| Graph size | Any size | Recommended for < 2000 nodes |
| Best for | Quick overview | Detailed analysis |
Practical Application
Example Workflow
- Quick scan: Run Forman-Ricci on the full graph
- Identify hotspots: Find highly negative (bridge) and positive (cluster) regions
- Detailed analysis: Run Ollivier-Ricci on interesting subgraphs
- Interpret: Map curvature to mathematical structure
In Lean Codebases
| Curvature Pattern | Mathematical Interpretation |
|---|---|
| Positive cluster | Related theorems (e.g., group theory basics) |
| Negative hub | Key lemma connecting domains |
| Zero chain | Step-by-step proof construction |
| Mixed region | Interface between mathematical areas |
API Endpoints
GET /api/project/analysis/curvature # Forman-Ricci (default, fast)
Returns:
- edge_curvatures: Edge → source, target, curvature
- node_curvatures: Node → average curvature
- statistics: Aggregate stats
- interpretation: Overall structure assessment
For Ollivier-Ricci, the endpoint accepts additional parameters:
GET /api/project/analysis/curvature?method=ollivier&alpha=0.5