Discrete Ricci Curvature

Analyze graph geometry using Forman-Ricci and Ollivier-Ricci curvature.

Discrete Ricci curvature extends the concept of curvature from differential geometry to graphs. It reveals the local "shape" of the graph around each edge, distinguishing between clustered regions, branching points, and linear paths.

Ricci curvature is an advanced metric. Start with centrality and community detection before exploring curvature analysis.

Geometric Intuition

In differential geometry, curvature measures how space bends:

CurvatureShapeGraph Analogy
PositiveSphereTight clusters, cliques
ZeroFlat planeRegular grids, linear chains
NegativeSaddleTree-like branching, bridges

Forman-Ricci Curvature

Forman-Ricci curvature is a combinatorial approximation based on the graph's local structure. It's fast to compute, making it suitable for large graphs.

Formula

For an edge e = (u, v):

F(e)=4deg(u)deg(v)+3T(e)F(e) = 4 - \deg(u) - \deg(v) + 3 \cdot |T(e)|

Where:

  • deg(u), deg(v) are the degrees of the endpoint nodes
  • T(e) is the set of triangles containing edge e

Simplified Formula

Without triangle counting:

F(e)=4deg(u)deg(v)F(e) = 4 - \deg(u) - \deg(v)

Complexity

O(E) — very fast, linear in the number of edges

Interpretation

F(e) ValueLocal StructureIn Lean Projects
F > 0Tightly clusteredDense theorem groups
F ≈ 0Linear chainSequential proof steps
F < 0Branching pointLemma used by many proofs

Example

For an edge between two nodes with degrees 3 and 4:

F(e)=434=3F(e) = 4 - 3 - 4 = -3

This negative curvature indicates a branching structure.

Ollivier-Ricci Curvature

Ollivier-Ricci curvature uses optimal transport (Wasserstein distance) to measure how probability mass spreads from one node to another.

Formula

κ(x,y)=1W1(μx,μy)d(x,y)\kappa(x, y) = 1 - \frac{W_1(\mu_x, \mu_y)}{d(x, y)}

Where:

  • W₁ is the Wasserstein-1 distance (Earth Mover's Distance)
  • μₓ is the probability measure around node x
  • d(x, y) is the graph distance between x and y

Probability Measure

For a node x, the probability distribution μₓ is defined as:

  • μₓ(x) = α (probability of staying at x)
  • μₓ(v) = (1 - α) / deg(x) for neighbors v ∈ N(x)
  • μₓ(v) = 0 otherwise

Where α is the "laziness" parameter.

Parameters

ParameterDefaultDescription
alpha0.5Laziness parameter (higher = more local)
method"OTD""OTD" (optimal) or "ATD" (approximate)

Complexity

O(V × E) — slower but more accurate than Forman

Interpretation

κ(x,y) ValueMeaningIn Lean Projects
κ > 0Mass convergesTheorems in same cluster
κ ≈ 0Mass preservedRegular structure
κ < 0Mass divergesBranching proofs

Wasserstein Distance

The Wasserstein distance measures the cost of "transporting" one probability distribution to another.

Formula

Wp(μ,ν)=(infγΓ(μ,ν)d(x,y)pdγ(x,y))1/pW_p(\mu, \nu) = \left( \inf_{\gamma \in \Gamma(\mu, \nu)} \int d(x, y)^p \, d\gamma(x, y) \right)^{1/p}

Where:

  • Γ(μ, ν) is the set of all couplings (transport plans)
  • d(x, y) is the ground distance

For p = 1, this is the Earth Mover's Distance (EMD).

Implementation

Astrolabe uses the POT (Python Optimal Transport) library, with scipy as a fallback.

Curvature Statistics

Astrolabe computes aggregate statistics across all edges:

StatisticDescription
min, maxRange of curvature values
meanAverage curvature
stdStandard deviation
medianMedian curvature
positive_countEdges with κ > 0
negative_countEdges with κ < 0
positive_ratioFraction of positive curvature

Global Interpretation

Mean CurvatureGraph Structure
> 0.5Highly clustered (many cliques)
0 < μ < 0.5Moderately clustered
-0.5 < μ < 0Tree-like structure
μ < -0.5Highly branching

Node Curvature

Node curvature is derived by averaging the curvatures of incident edges:

κ(v)=1deg(v)evκ(e)\kappa(v) = \frac{1}{\deg(v)} \sum_{e \ni v} \kappa(e)

Interpretation in Lean

Node TypeCurvatureDescription
Cluster centerHigh positiveCore theorem with many related lemmas
Chain nodeNear zeroPart of a sequential proof
Branch pointNegativeLemma used by diverging proofs
BridgeVery negativeConnects separate modules

Choosing Between Forman and Ollivier

AspectFormanOllivier
SpeedO(E) — very fastO(VE) — slower
AccuracyCombinatorial approximationBased on optimal transport
Graph sizeAny sizeRecommended for < 2000 nodes
Best forQuick overviewDetailed analysis

Practical Application

Example Workflow

  1. Quick scan: Run Forman-Ricci on the full graph
  2. Identify hotspots: Find highly negative (bridge) and positive (cluster) regions
  3. Detailed analysis: Run Ollivier-Ricci on interesting subgraphs
  4. Interpret: Map curvature to mathematical structure

In Lean Codebases

Curvature PatternMathematical Interpretation
Positive clusterRelated theorems (e.g., group theory basics)
Negative hubKey lemma connecting domains
Zero chainStep-by-step proof construction
Mixed regionInterface between mathematical areas

API Endpoints

GET /api/project/analysis/curvature    # Forman-Ricci (default, fast)

Returns:

  • edge_curvatures: Edge → source, target, curvature
  • node_curvatures: Node → average curvature
  • statistics: Aggregate stats
  • interpretation: Overall structure assessment

For Ollivier-Ricci, the endpoint accepts additional parameters:

GET /api/project/analysis/curvature?method=ollivier&alpha=0.5