1.
Introduction
This article deals with the equivalence of static and dynamical optimal transport on metric graphs, and with a gradient flow formulation of McKean–Vlasov equations in the Wasserstein space of probability measures.
Let (V,E) be a finite (undirected) connected graph and let ℓ:E→(0,∞) be a given weight function. Loosely speaking, the associated metric graph is the geodesic metric space (G,d) obtained by identifying the edges e∈E with intervals of length ℓe, and gluing the intervals together at the nodes. In other words, a metric graph describes a continuous "cable system", rather than a discrete set of nodes; see Section 2.2 for a more formal definition. Metric graphs arise in many applications in chemistry, physics, or engineering, describing quasi-one-dimensional systems such as carbon nano-structures, quantum wires, transport networks, or thin waveguides. They are also widely studied in mathematics; see [6,23] for an overview.
For 1≤p<∞, the Lp-Wasserstein distance Wp between Borel probability measures μ and ν on G is defined by the Monge–Kantorovich transport problem
where Π(μ,ν) denotes the set of transport plans; i.e., all Borel probability measures σ on G×G with respective marginals μ and ν. Since G is compact, the Wasserstein distance Wp metrises the topology of weak convergence of Borel probability measures on G, for any p≥1. The resulting metric space of probability measures is the Lp-Wasserstein space over G.
Metric graphs (G,d) are prototypical examples of metric spaces that exhibit branching of geodesics: it is (typically) possible to find two distinct constant speed geodesics (γt)t∈[0,1] and (˜γt)t∈[0,1], taking the same values for all times t∈[0,t0] up to some t0∈(0,1). For this reason, several key results from optimal transport are not directly applicable to metric graphs. This paper contains two of these results.
The Benamou–Brenier formula. On Euclidean space Rd, a dynamical characterisation of the Wasserstein distance has been obtained in celebrated works of Benamou and Brenier [4,5]. The Benamou–Brenier formula asserts that
where the infimum runs over all 2-absolutely continuous curves (μt)t∈[0,1] in the L2-Wasserstein space over Rd, satisfying the continuity equation
with boundary conditions μ0=μ and μ1=ν.
Here we are interested in obtaining an analogous result in the setting of metric graphs. However, such an extension is not straightforward, since standard proofs in the Euclidean setting (see [2,26]) make use of the flow map T:[0,1]×Rd→Rd associated to a (sufficiently regular) vector field (vt)t∈[0,1], which satisfies
see, e.g., [2,Proposition 8.1.8]. On a metric graph such a flow map T typically fails to exist, since solutions (μt)t∈[0,1] to the continuity equation (2) are usually not uniquely determined by an initial condition μ0 and a given vector field (vt)t∈[0,1].
Circumventing this difficulty, Gigli and Han obtained a version of the Benamou–Brenier formula in very general metric measure spaces [13]. However, this paper requires a strong assumption on the measures (namely, a uniform bound on the density with respect to the reference measure). While this assumption is natural in the general setting of [13], it is unnecessarily restrictive in the particular setting of metric graphs.
In this paper, we prove a Benamou–Brenier formula that applies to arbitrary Borel probability measures on metric graphs. The key ingredient in the proof is a careful regularisation step for solutions to the continuity equation.
Gradient flow structure of diffusion equations. As an application of the Benamou–Brenier formula, we prove another central result from optimal transport: the identification of diffusion equations as gradient flow of the free energy in the Wasserstein space (P(G),W2). In the Euclidean setting, results of this type go back to the seminal work of Jordan, Kinderlehrer, and Otto [14].
Here we consider diffusion equations of the form
for suitable potentials V:G→R and W:G×G→R. In analogy with the Euclidean setting, we show that this equation arises as the gradient flow equation for a free energy F:P(G)→(−∞,+∞] composed as the sum of entropy, potential, and interaction energies:
if μ=ηλ, where λ denotes the Lebesgue measure on G.
A key difference compared to the Euclidean setting is that the entropy is not semi-convex along W2-geodesics; see Section 4. This prevents us from applying "off-the-shelf" results from the theory of metric measure spaces. Instead, we present a self-contained proof of the gradient flow result. Its main ingredient is a chain rule for the entropy along absolutely continuous curves, which we prove using a regularisation argument.
Interestingly, we do not need to assume continuity of the potential V at the vertices. Therefore our setting includes diffusion with possibly singular drift at the vertices.
The Wasserstein distance over metric graphs for p=1 has been studied in [20]. The focus is on the approximation of Kantorovich potentials using p-Laplacian type problems.
The recent paper [10] deals with dynamical optimal transport metrics on metric graphs. The authors start with the dynamical definition à la Benamou–Brenier and consider links to several other interesting dynamical transport distances. The current paper is complementary, as it shows the equivalence of static and dynamical optimal transport, and a gradient flow formulation for diffusion equations.
Various different research directions involving optimal transport and graphs exist. In particular, dynamical optimal transport on discrete graphs have been intensively studied in recent years following the papers [11,19,21]. The underlying state space in these papers is a discrete set of nodes rather than a gluing of one-dimensional intervals.
Another line of research deals with branched optimal transport, which is used to model phenomena such as road systems, communication networks, river basins, and blood flow. Here one starts with atomic measures in the continuum, and graphs emerge to describe paths of optimal transport [28,7].
Organisation of the paper. In Section 2 we collect preliminaries on optimal transport and metric graphs. Section 3 is devoted to the continuity equation and the Benamou–Brenier formula on metric graphs. In particular, we present a careful regularisation procedure for solutions to the continuity equation. Section 4 contains an example which demonstrates the lack convexity of the entropy along W2-geodesics in the setting of metric graphs. Section 5 deals with the gradient flow formulation of diffusion equations.
2.
Preliminaries
In this section we collect some basic definitions and results from optimal transport and metric graphs.
2.1. Optimal transport
In this section we collect some basic facts on the family of Lp-Wasserstein distances on spaces of probability measures. We refer to [26,Chapter 5], [1,Chapter 2] or [27,Chapter 6] for more details.
Let (X,d) be a compact metric space. The space of Borel probability measures on X is denoted by P(X). The pushforward measure T#μ induced by a Borel map T:X→Y between two Polish spaces is defined by (T#μ)(A):=μ(T−1(A)) for all Borel sets A⊆Y.
Definition 2.1 (Transport plans and maps).
1. A (transport) plan between probability measures μ,ν∈P(X) is a probability measure σ∈P(X×X) with respective marginals μ and ν, i.e.,
where proji(x1,x2):=xi for i=1,2. The set of all transport plans between μ and ν is denoted by Π(μ,ν).
2. A transport plan σ∈Π(μ,ν) is said to be induced by a Borel measurable transport map T:X→X if σ=(Id,T)#μ, where (Id,T):X→X×X denotes the mapping x↦(x,T(x)).
Definition 2.2 (Kantorovich–Rubinstein–Wasserstein distance). For p≥1, the Lp-Kantorovich–Rubinstein–Wasserstein distance between probability measures μ,ν∈P(X) is defined by
For any μ,ν∈P(X) the infimum above is attained by some σmin∈Π(μ,ν); any such σmin is called an optimal (transport) plan between μ and ν. If a transport map T induces an optimal transport plan, we call T optimal as well.
By compactness of (X,d), the Lp-Wasserstein distance metrises the weak convergence in P(X) for any p≥1; see, e.g., [27,Corollary 6.13]. Moreover, (P(X),Wp) is a compact metric space as well.
We conclude this section with a dual formula for the Wasserstein distance (see, e.g., [26,Section 1.6.2]). To this aim, we recall that for c:X×X→R, the c-transform of a function φ:X→R∪{+∞} is defined by φc(y):=infx∈Xc(x,y)−φ(x). A function ψ:X→R∪{+∞} is called c-concave if there exists a function φ:X→R∪{+∞} such that ψ=φc.
Proposition 2.3 (Kantorovich duality). For any μ,ν∈P(X) we have
Moreover, the supremum is attained by a maximising pair of the form (φ,ψ)=(φ,φc), where φ is a c-concave function, for c(x,y):=dp(x,y).
Any maximiser φ is called a Kantorovich potential.
2.2. Metric graphs
In this subsection we state some basic facts on metric graphs; see e.g., [20], [6] or [16] for more details.
Definition 2.4 (Metric graph). Let G=(V,E,ℓ) be an oriented, weighted graph, which is finite and connected. We identify each edge e=(einit,eterm)∈E with an interval (0,ℓe) and the corresponding nodes einit,eterm∈V with the endpoints of the interval, (0 and ℓe respectively). The spaces of open and closed metric edges over G are defined as the respective topological disjoint unions
The metric graph over G is the topological quotient space
where points in ¯E corresponding to the same vertex are identified.
Note that the orientation of e determines the parametrisation of the edges, but does not otherwise play a role. To distinguish ingoing and outgoing edges at a given node, we introduce the signed incidence matrix I=(ιev) whose entries are given by
As a quotient space, any metric graph naturally inherits the structure of a metric space from the Euclidean distance on its metric edges [9,Chapter 3]: indeed, under our standing assumption that G is connected the quotient semi-metric d becomes a metric.
The distance d:G×G→[0,∞) on G can be more explicitly described as follows: For x,y∈G, let ˜G=(˜V,˜E) be the underlying discrete graph obtained by adding new vertices at x and y, and let d(x,y) be the weighted graph distance between x and y in ˜G, i.e.,
where the minimum is taken over all sequences of vertices x=x0,…,xn=y in the extended graph ˜G such that xi−1 and xi are vertices of an edge ei∈˜E for all i=1,…,n. In particular, if x and y are vertices in G, no new nodes are added, and we recover the graph distance in the original graph. We refer to [23,Chapter 3] for details.
By construction, the distance function d metrises the topology of G. It is readily checked that d is a geodesic distance, i.e., each pair of points x,y∈G can be joined by a curve of minimal length d(x,y). Consequently, also the Wasserstein space (P(G),W2) is a geodesic space.
In a metric space (X,d), recall that the local Lipschitz constant of a function f:X→R is defined by
whenever x∈X is not isolated, and 0 otherwise. The (global) Lipschitz constant is defined by
If the underlying space X is a geodesic space, we have Lip(f)=supxlip(f)(x).
At the risk of being redundant, we explicitly introduce a few relevant function spaces, although they are actually already fully determined by the metric measure structure of the metric graph G.
(ⅰ) C(G) denotes the space of continuous real-valued functions on G, endowed with the uniform norm ‖⋅‖∞.
(ⅱ) Ck(¯E) is the space of all functions φ on ¯E such that the restriction to each closed edge has continuous derivatives up to order k∈N.
(ⅲ) Lp(G), for p∈[1,∞], is the p-Lebesgue space over the measure space (G,λ), where λ denotes the image of the 1-dimensional Lebesgue measure on ¯E under the quotient map.
(ⅳ) Likewise, we consider the Sobolev spaces W1,p(¯E) of Lp(G)-functions whose restriction on each edge is weakly differentiable with weak derivatives in Lp(G).
3.
The continuity equation on a metric graph
In this section we fix a metric graph G and perform a study of the continuity equation
in this context.
3.1. The continuity equation
In this work we mainly deal with weak solutions to the continuity equation, which will be introduced in Definition 3.2. To motivate this definition, we first introduce the following notion of strong solution.
Definition 3.1 (Strong solutions to the continuity equation). A pair of measurable functions (ρ,U) with ρ:(0,T)×G→R+ and U:(0,T)ׯE→R is said to be a strong solution to (5) if
(i) t↦ρ(t,x) is continuously differentiable for every x∈G;
(ii) x↦Ut(x) belongs to C1(¯E) for every t∈(0,T);
(iii) the continuity equation ddtρt(x)+∇⋅Ut(x)=0 holds for every t∈(0,T) and x∈E;
(iv) for every t∈(0,T) and w∈V we have ∑e∈EwιewUt(we)=0.
Here, we write ρt:=ρ(t,⋅) and Ut:=U(t,⋅) and denote by ∇ the spatial derivative. Moreover, Ew denotes the set of all edges adjacent to the node w∈V, and we∈¯E denotes the corresponding endpoint of the metric edge e which corresponds to w∈G.
To motivate the definition of a weak solution, suppose that we have a strong solution (ρt,Ut)t∈(0,T) to the continuity equation (5). Let ψ∈C1(¯E)∩C(G) be a test function. Integration by parts on every metric edge e gives
and summation over e∈E yields
where we use the continuity of ψ on G as well as the node condition (ⅳ) above in the last step. This ensures that the net ingoing momentum vanishes at every node in V. In particular, choosing ψ≡1 yields
for all s,t∈(0,T), i.e. solutions to the continuity equation are mass-preserving. Here condition (ⅳ) is crucial to ensure that no creation or annihilation of mass occurs at the nodes.
Definition 3.2 (Weak solution). A pair (μt,Jt)t∈(0,T) consisting of probability measures μt on G and signed measures Jt on ¯E, such that t↦(μt(A),Jt(A)) is measurable for all Borel sets A, is said to be a weak solution to (5) if
(i) t↦∫Gψdμt is absolutely continuous for every ψ∈C1(¯E)∩C(G);
(ii) ∫T0|Jt|(¯E)dt<∞;
(iii) for every ψ∈C1(¯E)∩C(G) and a.e. t∈(0,T), we have
Remark 1. Proposition 3.9 below shows that continuous functions on G can be uniformly approximated by C1 functions. By standard arguments it then follows that (μt,Jt)t∈(0,T) is a weak solution if and only if the following conditions hold: (i′) t↦μt is weakly continuous; (ii) from Definition 3.2 holds; and
(iii′) for every φ∈C1c((0,T)ׯE) such that φt,∂tφt∈C1(¯E)∩C(G) for all t∈(0,T) we have
See Lemma 3.10 below for the weak continuity in (i′).
The next result asserts that the momentum field does not give mass to vertices for a.e. time point. Hence, we can equivalently restrict the integrals over ¯E in (6) and (7) to the space of open edges E.
Lemma 3.3. Let B:=¯E∖E denote the set of all boundary points of edges. For any weak solution to the continuity equation (μt,Jt)t∈(0,T), we have
Proof. Fix a metric edge e in ¯E and take w∈{einit,eterm}. Without loss of generality we take w=einit. Then we can construct a family of functions φε for ε∈(0,ℓe) with the following properties:
(a) φε∈C1(¯E)∩C(G);
(b) φε≡0 on ¯E∖e and φε→0 uniformly on G as ε→0;
(c) |∇φε|=1 on (0,ε)⊂e and |∇φε|→0 uniformly on compact subsets of ¯E∖{w}.
For instance, we could set φε(x):=1e(x)ηε(x), where ηε:R→R is a C1 approximation of the function x↦(x∧ε(ℓe−x)ℓe−ε)∨0. Choosing φ(t,x)=φε(x) in (7), we obtain by passing to the limit ε→0 that ∫T0|Jt|({w})dt=0.
Lemma 3.4 (Weak and strong solutions). The following assertions hold:
(i) If (ρt,Ut)t∈(0,T) is a strong solution to the continuity equation, then the pair (μt,Jt)t∈(0,T) defined by μt=ρtλ and Jt=Utλ is a weak solution to the continuity equation.
(ii) If (μt,Jt)t∈(0,T) is a weak solution to the continuity equation (6) such that the densities ρt:=dμtdλ and Ut:=dJtdλ exist for all times t∈(0,T) and satisfy the regularity conditions (ⅰ) and (ⅱ) of Definition 3.1, then (ρt,Ut)t∈(0,T) is a strong solution to the continuity equation.
Proof. Both claims are straightforward consequences of integration by parts on each metric edge in E.
3.2. Characterisation of absolutely continuous curves
Let (X,d) be a metric space and let T>0.
Definition 3.5. For p≥1, we say that a curve γ:(0,T)→X is p-absolutely continuous if there exists a function g∈Lp(0,T) such that
The class of p-absolutely continuous curves is denoted by ACp((a,b);(X,d)). For p=1 we simply drop p in the notation. The notion of locally p-absolutely continuous curve is defined analogously.
Proposition 3.6. Let p≥1.For every p-absolutely continuous curve γ:(0,T)→X, the metric derivative defined by
exists for a.e. t∈(0,T) and t↦|˙γ|(t) belongs to Lp(0,T).The metric derivative |˙γ|(t) is an admissible integrand in the right-hand side of (8). Moreover, any other admissible integrand g∈Lp(0,T) satisfies|˙γ|(t)≤g(t) for a.e. t∈(0,T).
Proof. See, e.g., [2,Theorem 1.1.2].
The next result relates the metric derivative of t↦μt to the L2(μt)-norm of the corresponding vector fields vt.
Theorem 3.7 (Absolutely continuity curves). The following statements hold:
(i) If (μt)t∈(0,T) is absolutely continuous in (P(G),W2), then there exists, for a.e. t∈(0,T), a vector field vt∈L2(μt) such that ‖vt‖L2(μt)≤|˙μ|(t) and (μt,vtμt)t∈(0,T) is a weak solution to the continuity equation (6).
(ii) Conversely, if (μt,vtμt)t∈(0,T) is a weak solution to the continuity equation (6) satisfying ∫10‖vt‖L2(μt)dt<+∞, then (μt)t∈(0,T) is an absolutely continuous curve in (P(G),W2) and |˙μ|(t)≤‖vt‖L2(μt) for a.e. t∈(0,T).
Proof of (ⅰ). We adapt the proof of [2,Theorem 8.3.1] to the setting of metric graphs.
The idea of the proof is as follows: On the space-time domain Q:=(0,T)×G we consider the Borel measure μ:=∫T0δt⊗μtdt whose disintegration with respect to the Lebesgue measure on (0,T) is given by (μt)t∈(0,T). To deal with the fact that gradients of smooth functions are multi-valued at the nodes, we define ¯μt∈M+(¯E) by ¯μt(A):=∑e∈Eμt(A∩¯e) for every Borel set A⊆¯E. We then set ¯Q:=(0,T)ׯE and define ¯μ∈M+(¯Q) by ¯μ:=∫T0δt⊗¯μtdt. (Note that mass at the nodes is counted multiple times). Consider the linear spaces of functions T and V given by
The strategy is to show that the linear functional L:V→R given by
is well-defined and L2(¯Q,¯μ)-bounded with ‖L‖2≤∫T0|˙μ|2(t)dt. Once this is proved, the Riesz Representation Theorem yields the existence of a vector field vv in ¯V⊆L2(¯Q,¯μ) such that ‖vv‖2L2(¯μ)≤∫T0|˙μ|2(t)dt and
for vt:=vv(t,⋅) and all a∈C1c(0,T) and φ∈C1(¯E)∩C(G).
Once this is done, we show that the momentum vector field JJ:=vv⋅μ does not assign mass to boundary points in ¯E, so that vv can be interpreted as an element in L2(Q,μ) and the integration over vector fields can be restricted to E.
Step 1. Fix a test function φ∈C1(¯E)∩C(G) and consider the bounded and upper semicontinuous function H:G×G→R given by
for x,y∈G. For s,t∈(0,T), let σs→t∈Π(μs,μt) be an optimal plan. The Cauchy–Schwarz inequality yields
As φ is globally Lipschitz on G, we obtain
and infer that the mapping t↦∫Gφdμt is absolutely continuous, hence, differentiable outside of a null set Nφ⊆(0,T).
Fix t∈(0,T) and take a sequence {sn}n∈N converging to t. Since {μsn} is weakly convergent, this sequence is tight. Consequently, {σsn→t}n∈N is tight as well, and we may extract a subsequence converging weakly to some ˆσ∈P(G×G). It readily follows that ˆσ∈Π(μt,μt). Moreover, along the convergent subsequence, we have
which implies that ˆσ=(Id,Id)#μt.
Using this result and the upper-semicontinuity of H, it follows from (10) that
Step 2. Take Φ∈T. Using dominated convergence, Fatou's Lemma, and (11), we obtain
Since ∫Q|lipx(Φ)(x,t)|2dμ(x,t)≤∫¯Q|∇Φ(x,t)|2d¯μ(x,t), we infer that L is well-defined and extends to a bounded linear functional on the closure of V in L2(¯Q,¯μ) with ‖L‖2≤∫T0|˙μ|2(t)dt, which allows us to apply the Riesz Representation Theorem, as announced above.
In particular, (9) implies that t↦∫E∇φ⋅vtdμt is a distributional derivative for t↦∫Gφdμt. Since the latter function is absolutely continuous and, therefore, belongs to the Sobolev space W1,1(0,T), we obtain
We conclude that (μt,vt)t∈(0,T) solves the continuity equation in the weak sense. Lemma 3.3 implies that for a.e. t the momentum field Jt:=vt⋅¯μt does not give mass to any boundary point in ¯E. Consequently, the spatial domain of integration on the right-hand side of (9) may be restricted to E.
Step 3. It remains to verify (by a standard argument) the inequality relating the L2(μt)-norm of the vector field vt to the metric derivative of μt.
For this purpose, fix a sequence (ϖϖi)i∈N of functions ϖϖi∈V converging to vv in L2(¯μ) as i→∞. For every compact interval I⊆(0,T) and a∈C1(0,T) satisfying 0≤a≤1 and suppa=I, we then obtain
Letting ‖a−1I‖∞→0, this inequality implies
Since I⊆(0,T) is arbitrary, this implies that ‖vt‖L2(μt)≤|˙μ|(t) for a.e. t∈(0,T).
3.3. Regularisation of solutions to the continuity equation
Next we introduce a suitable spatial regularisation procedure for solutions to the continuity equation. This will be crucial in the proof of the second part of Theorem 3.7.
Let ε>0 be sufficiently small, i.e., such that 2ε is strictly smaller than the length of every edge in E. We then consider the supergraph Gext⊇G defined by adjoining an auxiliary edge eextv of length 2ε to each node v∈V (see Figure 1). The corresponding set of metric edges will be denoted by Eext⊃E.
We next define a regularisation procedure for functions based on averaging. The crucial feature here is that non-centred averages are used, to ensure that the regularised function is continuous.
In the definition below, we parametrise each edge e=(einit,eterm)∈E using the interval (−ℓe2,ℓe2) instead of (0,ℓe). The auxiliary edges eexteinit and eexteterm will then be identified with the intervals (−ℓe2−2ε,−ℓe2) and (ℓe2,ℓe2+2ε), respectively. We stress that for each vertex v, there is only one additional edge, but we use several different parametrisations for it – one for each edge incident in v.
Definition 3.8 (Regularisation of functions). For φ∈L1(Gext), we define φε:G→R by
where αεe:=(ℓe+2ε)/ℓe. We write αε(x):=αεe, whenever x∈E is a point on the metric edge e.
Note that the value of φε in each of the nodes depends only on data on the corresponding auxiliary edge. In particular, the value at the nodes does not depend on the choice of the edge, so that φε indeed defines a function on G. We collect some basic properties of this regularisation in the following result.
Proposition 3.9. The following properties hold for every ε>0 sufficiently small:
(i) Regularising effect: For any φ∈C(Gext) we have φε∈C1(¯E)∩C(G) and
for y∈[−ℓe/2,ℓe/2].
(ii) If φ belongs to C(Gext), then φε converges uniformly to φ|G as ε↘0.
Proof. (ⅰ) follows by direct computation; (ⅱ) follows using the uniform continuity of φ on Gext.
Lemma 3.10 (Weak continuity). Let (ρt,Jt)t∈(0,T) be a weak solution to the continuity equation on G.Then t↦∫Gφdμt is continuous for every φ∈C(G).
Proof. Fix φ∈C(G), take a continuous extension to Gext, and define φε accordingly. Proposition 3.9.ii then implies that φε converges uniformly to φ on G as ε↘0. As a result, the function t↦∫Gφεdμt converges uniformly to t↦∫Gφdμt. Since φε belongs to C(G)∩C1(¯E) by Proposition 3.9.i, we conclude that the mapping t↦∫Gφdμt is continuous, being a uniform limit of continuous functions.
By duality, we obtain a natural regularisation for measures.
Definition 3.11 (Regularisation of measures). For μ∈M(G) we define με∈M(Gext) by
for all φ∈C(Gext).
Analogously, for J∈M(E) we define Jε∈M(Eext) as follows: first we extend J to a measure on G giving no mass to G∖E. Then we define ˜Jε∈M(Gext) by the formula above. Finally, we define Jε∈M(Eext) by restriction of ˜Jε to Eext.
It is readily checked that the right-hand side defines a positive linear functional on C(Gext), so that με is indeed a well-defined measure.
Proposition 3.12. The following properties hold for any ε>0:
(i) Mass preservation: με(Gext)=μ(G) for any μ∈M(G).
(ii) Regularising effect:For any μ∈P(G), the measure με is absolutely continuous with respect to λ with density
where
In particular, ρε(x)≤12ε for all x∈Gext.
(iii) Kinetic energy bound: For μ∈P(G) and v∈L2(μ), define J=vμ|E∈M(E).Consider the regularised measures με∈P(Gext)and Jε∈M(Eext).Then we have Jε=vεμε for some vε∈L2(με) and
(iv) For any μ∈P(G) we have weak convergence με⇀μ in P(Gext) as ε→0.
(v) Let (μt,Jt)t∈(0,T) be a weak solution to the to the continuity equation (6). Then the regularised pair (μεt,Jεt)t∈(0,T) is a weak solution to a modified continuity equation on Gext in the following sense:
For every absolutely continuous function φ on Gext, the function t↦∫Gextφdμεt is absolutely continuous and for a.e. t∈(0,T) we have
with αε as in Definition 3.8.
In order to prove (ⅲ), we will make use of the so-called Benamou–Brenier functional (see, e.g., [26,Section 5.3.1] for corresponding results in the Euclidean setting).
Define K2:={(a,b)∈R×R:a+b2/2≤0}. By a slight abuse of notation, C(G,K2) (resp. L∞(G,K2)) denotes the set of all continuous (resp. bounded and measurable) functions a,b:G→R such that a+b2/2≤0.
Definition 3.13. The Benamou–Brenier functional B2:M(G)×M(E)→R∪{+∞} is defined by
Some basic properties of this functional are collected in the following lemma.
Lemma 3.14. The following statements hold:
(i) For y,z∈R we have
(ii) For μ∈M(G)and J∈M(E)we have
(iii) The functional B2 is convex and lower semicontinuous with respect to the topology of weak convergence on M(G)×M(E).
(iv) If μ∈M(G)is nonnegative and J∈M(E)satisfies J≪μ|Ewith J=vμ|E, then we have
Otherwise, we have B2(μ,J)=+∞.
Proof. (ⅰ): see [26,Lemma 5.17].
(ⅱ): Clearly, the right-hand side of (20) is bounded from below by B2(μ,J). To prove the reverse inequality, let a,b:G→R be measurable functions satisfying a+b2/2≤0. By Lusin's theorem (see, e.g., [8,Theorem 7.1.13]) there exist functions aδ,bδ∈C(G,K2) satisfying
Define ˜aδ:=min{aδ,−|bδ|2/2}, so that the inequality ˜aδ+b2δ/2≤0 is satisfied. Hence, the pair (˜aδ,bδ) is admissible for the supremum on the right-hand side of (20). Since ∫G˜aδdμ+∫EbδdJ converges to ∫Gadμ+∫EbdJ as δ↘0, we obtain (20).
(ⅲ): This follows from the definition of B2 as a supremum of linear functionals.
(ⅳ): Let μ be nonnegative and J≪μ with J=vμ. Setting v=0 on G∖E, (20) and (19) yield
To prove the converse, suppose first that there exists a Borel set A⊆G with μ(A)<0. Pick a=−k1A and b≡0 with k≥0, so that B2(μ,J)≥−kμ(A). Since k can be taken arbitrarily large, we infer that B2(μ,J)=+∞. Now suppose μ is non-negative, but the signed measure J is not absolutely continuous with respect to μ|E, i.e., there exists a μ-null set A⊆E such that J(A)≠0. For a=−k221A and b=k1A with k∈R, we have B2(μ,J)≥kJ(A), which implies the result.
Proof of Proposition 3.12. (ⅰ): The claim follows readily from the definitions.
(ⅱ): For φ∈C(Gext) we have
For w∈V, we note that φε(w) is obtained by averaging φ on a subset of the auxiliary edge eextw:
For e∈E we obtain, interchanging the order of integration,
Combining these three identities, the desired result follows.
(ⅲ): Take bounded measurable functions satisfying , extend them by to , and define regularised functions as done for in (14). By Jensen's inequality and the fact that the regularisation is linear and positivity-preserving, we obtain
i.e., and are admissible for the supremum in (20). Therefore,
The result follows by taking the supremum over all admissible functions and .
(ⅳ): This follows from the uniform convergence of to ; see Proposition 3.9(ⅱ).
(ⅴ): The function is absolutely continuous, hence a.e. differentiable on . Proposition 3.9(ⅰ) yields and
Since is constant on each metric edge in , we obtain, using the continuity equation and the definition of the regularisation,
Now we are ready to prove the second part of Theorem 3.7: we adapt the proof of [13], where (much) more general metric measure spaces are treated, but stronger assumptions on the measures are imposed (namely, uniform bounds on the density with respect to the reference measure). Here we consider more general measures using the above regularisation procedure.
3.4. Conclusion of the characterisation of absolutely continuous curves
In the proof of the second part of Theorem 3.7, we make use of the Hopf–Lax formula in metric spaces and its relation to the dual problem of optimal transport.
Definition 3.15 (Hopf–Lax formula). For a real-valued function on a geodesic Polish space , we define by
for all , and .
The operators form a semigroup of nonlinear operators with the following well-known properties; see [3] for a systematic study.
Proposition 3.16 (Hopf–Lax semigroup). Let be a geodesic Polish space. For any Lipschitz function the following statements hold:
(i) For every we have .
(ii) For every , the map is continuous on , locally semi-concave on , and the inequality
holds for all up to a countable number of exceptions.
(iii) The mapping is upper semicontinuous on .
Proof. (i): This statement can be derived from [3,Proposition 3.4]. For the convenience of the reader we provide the complete argument here.
Fix and . For we write . We claim that
where denotes the Lipschitz constant of . Indeed, if , we have
which implies the claim.
Fix now and . Using the claim, we may pick such that and . Then:
Reversing the roles of and , this estimate readily yields
Since is assumed to be a geodesic space, we have and the result follows.
(ii): See [3,Theorem 3.5].
(iii): See [3,Propositions 3.2 and 3.6].
We can now conclude the proof of Theorem 3.7 on the characterisation of absolutely continuous curves in the Wasserstein space over a metric graph.
Proof of (ⅱ) in Theorem 3.7. Without loss of generality, we set . The main step of the proof is to show that
From there, a simple reparametrisation argument (see also [2,Lemma 1.1.4 & 8.1.3]) yields
for all , which implies the absolute continuity of the curve in as well as the desired bound for every Lebesgue point of the map .
Thus, we have to show (25). To this aim, we will work on the supergraph .
By Kantorovich duality (Proposition 2.3), there exists satisfying
Moreover, is Lipschitz, which follows from the fact that is -concave with and is compact. We consider Lipschitz continuous extensions of and to , both constant on each auxiliary metric edge in . In particular, and are Lipschitz on .
Set and consider a regularised pair as defined by (16). We write
and bound the two terms on the right-hand side separately.
Bound 1. To estimate the first term on the right-hand side of (27), we use (24) to obtain
where the measures are defined on .
To show weak convergence of the sequence , we take . Note that is weakly continuous by Lemma 3.10, hence is weakly continuous as well. Consequently, for every . Integrating in time over , we infer, using dominated convergence, that converges weakly to as .
As is not necessarily continuous, an additional argument is required to pass to the limit in (28). For this purpose, we observe that Proposition 3.12.ii yields with a density for and . In particular, the family is uniformly integrable with respect to . Consequently, the Dunford–Pettis Theorem (see, e.g., [8,Theorem 4.7.18]) implies that has weakly-compact closure in . Since is bounded, we may pass to the limit in (28) and infer that
where we use that on to remove the set of nodes from the domain of integration.
Bound 2. We now treat the second term in (27). As belongs to a weak solution to the continuity equation and we know from Proposition 3.9.(ⅰ)) that belongs to , we infer that the mapping is absolutely continuous. Therefore,
where and .
Proposition 3.16.iiiyields the bound . As is Lipschitz continuous, (ⅰ) in the same proposition shows that . Thus, we may invoke Fatou's lemma to obtain
Using this estimate together with Proposition 3.12.iii, we obtain
Combination of both bounds. Recalling (27), we use (29) and (31) to obtain
Using Proposition 3.12.iv, the fact that , and the bound , we may pass to the limit to obtain
In view of (26), this yields the result.
Corollary 1 (Benamou–Brenier formula).For any , we have
where the minimum is taken among all weak solutions to the continuity equation satisfying and .
Proof. As is a geodesic space, we may write
where the minimum runs over all absolutely continuous curves connecting and . Therefore, the result follows from Theorem 3.7.
4.
Lack of geodesic convexity of the entropy
In this section we consider the entropy functional defined by
As is well known, this functional is lower semicontinuous on ; see, e.g., [17,Corollary 2.9].
A celebrated result by McCann asserts that is geodesically convex on , the Wasserstein space over the Euclidean space . More generally, on a Riemannian manifold , the relative entropy (with respect to the volume measure) is geodesically -convex on for , if and only if the Ricci curvature is bounded below by , everywhere on [24,12,25].
Metric graphs are prototypical examples in which such bounds fail to hold. Here we present an explicit example, which shows that the functional on the metric space over a metric graph induced by a graph with maximum degree larger than is not geodesically -convex for any .
Example 4.1. Consider a metric graph induced by a graph with 3 leaves as shown in Figure 2.
We impose an edge weight on each of the edges . Consider the probability measures with respective densities given by
Lemma 4.2. The unique optimal coupling of and is given by monotone rearrangement from each of the edges and to , i.e., by the map with
Proof. Let be an optimal coupling of and and decompose it as , where , , are couplings of , the restriction of to , and some a measure on such that . Necessarily are also optimal. By standard optimal transport theory on the interval , is given by a map , the monotone rearrangement from to . If we show that , then with as above and the claim is proven. To this end, assume by contradiction that and hence . We consider the coupling where (here, we identify and in the obvious way). Since , the support of is not contained in the graph of a function. Since is optimal, the couplings are also optimal between their marginals and , and thus have to be induced by the monotone rearrangement map , a contradiction.
Consequently, the constant speed-geodesic from to is given by where is the linear interpolation of and the identity on and respectively, more precisely
for , .
Set and . The relative entropy of is given by:
Thus, is piecewise affine; see Figure 3. It follows that is not -convex for any .
5.
Gradient flows in the Wasserstein space over a metric graph
In this section we study gradient flows in the Wasserstein space over a metric graph. Namely, we consider diffusion equations on metric graphs arising as the gradient flow of free energy functionals composed as the sum of entropy, potential, and interaction energies. We give a variational characterisation of these diffusion equations via energy-dissipation identities and we discuss the approximation of solutions via the Jordan–Kinderlehrer–Otto scheme (minimizing movement scheme). This provides natural analogues on metric graphs of the corresponding classical results in Euclidean space. We follow the approach from the Euclidean case; see in particular [2,Section 10.4], and adapt it to the current setting.
Let be Lipschitz continuous and define the weighted volume measure on (since gives no mass to vertices, the potential ambiguity of there does not matter). We consider the following functionals on :
the relative entropy defined by
the interaction energy defined by
where is symmetric and Lipschitz continuous.
Moreover, we define as the sum of the previous quantities:
Note that is bounded from below (by Jensen's inequality and finiteness of ) and lower semicontinuous with respect to weak convergence. Moreover, is bounded and continuous with respect to weak convergence.
Further note that for with we can write
where is the Boltzmann entropy on and defined by
is the potential energy. The latter is well defined for despite the potential discontinuity of at the vertices.
5.1. Diffusion equation and energy dissipation
We consider the following diffusion equation on the metric graph given by
Here is a probability on and we set . In analogy with the classical Euclidean setting we will show below that this PDE is the Wasserstein gradient flow equation of the free energy . Though the setting of metric graphs is one-dimensional, we prefer to use multi-dimensional notation such as and for the sake of clarity.
We consider the following notion of weak solution for (36).
Definition 5.1. We say that a curve of probability densities w.r.t. on is a weak solution to (36) if for we have for a.e. , and the pair given by
is a weak solution to the continuity equation in the sense of Definition 3.2, i.e., we ask that is weakly continuous and for every such that for all , we have
Remark 2. Let us briefly consider the special case where . Then (36) is simply the heat equation on a metric graph, which has been introduced already in [18]. It is known since [15,22] that the Laplacian with natural vertex conditions (continuity across the vertices along with a Kirchhoff-type condition on the fluxes) is associated with a Dirichlet form, hence the Cauchy problem for this PDE is well-posed on : more precisely, it is governed by an ultracontractive, Markovian -semigroup that extrapolates to for all , as well as to and, by duality, to the space of Radon measures on . For any initial value , defines a classical solution to the Cauchy problem for the heat equation with initial value . It is easy to see that this solution is a weak solution in the sense of Definition 5.1 as well.
The dissipation of the free energy along solutions to (36) at is formally given by
This motivates the following definition.
Definition 5.2 (Energy dissipation functional). The energy dissipation functional is defined as follows. If with and for some we set
Otherwise, we set .
Remark 3. We emphasize that continuity of on is required for finiteness of in this definition. It is not sufficient that belongs to . The continuity is important in the proof of Theorem 5.5 below, as it ensures spatial continuity for gradient flow curves (i.e., curves of maximal slope with respect to the upper gradient ) and it allows us to identify them with weak solutions to the diffusion equation (36). The requirement of spatial continuity for weak solutions to (36) in Definition 5.1 is essential in order to couple the dynamics on different edges across common vertices.
We collect the following properties of the dissipation functional.
Lemma 5.3. Let be a sequence weakly converging to such that
Then we have
Proof. First note that we can rewrite as the integral functional
where is the lower semicontinuous and convex function defined by
and . Here is any measure such that ; its choice is irrelevant by -homogeneity of .
Now, let . Lower semicontinuity of and (38) imply that . Therefore we can write for a suitable density . Superlinearity of implies that converges weakly to in .
Recall that with . Hölder's inequality and the bound (38) yield that the measures have uniformly bounded total variation. Hence up to extracting a subsequence, we have that converges weakly* to a measure on . Lower semicontinuity of the integral functional yields
This allows us to write for a and . Since , we have for any function by integration by parts,
The convergence of to weakly and to in together with the fact that gives no mass to and the boundedness of allows us to pass to the limit and obtain
We infer that and that . In particular, by the Sobolev embedding theorem.
Let us now show that . For this purpose, note that each pair of adjacent edges can be identified with the interval . Consider such that for all and on for some . Repeating the argument above, we infer that and in particular that is continuous at . We thus obtain that . Hence from (42) we obtain the claim.
Let us denote by the energy dissipation functional with , more precisely, if with with for some we set
Otherwise, we set . As already observed for the functional , we can write , with where is the function in (41). Then is a convex and lower semi-continuous functional by the previous lemma. We note that under the assumption that is Lipschitz, is finite if and only if is finite.
Next, we observe that finiteness of the implies a quantitative bound on the density.
Lemma 5.4. For with and we have with
where the constant depends on .
Proof. The continuity of follows from the definition of . Using Hölder's we obtain with ,
The claim then follows immediately from the Sobolev embedding theorem applied to each of the finitely many edges.
5.2. Energy-dissipation equality
The main result of this section is the following gradient flow characterisation of the diffusion equation (36).
Theorem 5.5. For any 2-absolutely continuous curve in with we have
Moreover, we have if and only if is a weak solution to (36) in the sense of Definition 5.1.
The main step in proving this result is to establish a chain rule for the free energy along absolutely continuous curves in . Recall the definition of from (41).
Proposition 5.6 (Chain rule). Let be a -absolutely continuous curve in satisfying .Write and let be an optimal family of momentum vector fields, i.e., solves the continuity equation and.Let as in Definition 5.2. Then, is absolutely continuous and we have
Proof. We first note that the assumptions ensure that also . Hence, we have
We proceed by a twofold regularisation. Using a family of even and smooth approximation kernels with compact support in , we regularise in time via
and set . Here we extend to a curve on the time-interval which is constant on and . Similarly defining as the time-regularisation of we obtain that is a solution to the continuity equation. Convexity of and the Benamou–Brenier functional yield that and . Hence, (45) also holds with in place of . Moreover, we must have on .
Further, we regularise the logarithm and define for the function by setting
Then we define the regularized free energy of via
Let us set . Note that by Lemma 5.4, is bounded and thus also is bounded. We will first show that
Then passing to the limit will yield the claim.
While establishing (46) we write instead of for simplicity. We have
Differentiation under the integral sign is justified by boundedness of and and the regularity in time of . Note that is an admissible test function in the continuity equation for . Thus, to establish (46) it remains to show that the continuity equation can also be used on the function . Recall that is bounded with . We can approximate uniformly by functions as follows. First extend to with constant values on the additional edge incident to vertex equal to the value of at . Then we apply the regularising procedure (14). The continuity equation for yields
Passing to the limit as then will finish the proof of (46). Convergence of the left hand side is immediate since is bounded and so is bounded uniformly in . For the right hand side we estimate with :
The second factor is finite as noted above. To estimate the first factor, we recall that is bounded. Hence for a suitable constant
Using (15) and dominated convergence, the latter term goes to zero as if we show that
But using we can estimate
Thus, (46) is established.
We will now pass to the limits in (46), starting with the right-hand side.
For the limit , note that a.e. and a.e. as with . Dominated convergence then yields that as :
Indeed, we have the majorant
for a suitable constant , using the fact is uniformly bounded. Here, denotes the time-regularisation of the function in brackets and we have used Jensen's inequality in the last step to interchange the regularisation operation with the convex function . Since by assumption and are integrable on , the majorant above indeed converges in .
To further pass to the limit , we note that a.e. on the set . Since a.e. on the set , we conclude that a.e. Using similar as before the majorant , we conclude by dominated convergence.
It remains to pass to the limit on the left-hand side in (46). The weak continuity of implies that weakly as . This is sufficient to conclude that . Note that for any we have
with and . Convexity of and Jensen's inequality yield that . Thus, lower semicontinuity of under weak convergence shows that and hence as . The limit is then easily achieved by monotone convergence. From the convergence of the right-hand side of (46) and the assumption that we finally conclude that for all and that
Hence is absolutely continuous and (44) follows.
We can now prove Theorem 5.5.
Proof of Theorem 5.5. Note that the right-hand side of (44) may be estimated by means of Hölder's and Young's inequality as
Hence, by integrating both sides of (44) from to we obtain that . Moreover, we have equality if and only if for a.e. and -a.e. we have . Now the continuity equation with becomes the weak formulation of (36)
5.3. Metric gradient flows
Here, we recast the variational characterisation of McKean–Vlasov equations on metric graphs from the previous section in the language of the theory of gradient flows in metric spaces. Let us briefly recall the basic objects. For a detailed account we refer the reader to [2].
Let be a complete metric space and let be a function with proper domain, i.e., the set is non-empty.
The following notion plays the role of the modulus of the gradient in a metric setting.
Definition 5.7 (Strong upper gradient). A function is called a strong upper gradient of if for any the function is Borel and
Note that by the definition of strong upper gradient, and Young's inequality , we have that for all :
Definition 5.8 (Curve of maximal slope). A locally -absolutely continuous curve is called a curve of maximal slope of with respect to its strong upper gradient if is non-increasing and
We say that a curve of maximal slope starts from if .
Equivalently, we can require equality in (48). If a strong upper gradient of is fixed we also call a curve of maximal slope of (relative to ) a gradient flow curve.
Finally, we define the (descending) metric slope of as the function given by
The metric slope is in general only a weak upper gradient for , see [2,Theorem 1.2.5]. In our application to metric graphs we will show that the square root of the dissipation provides a strong upper gradient for the free energy .
Corollary 2. The functional is a strong upper gradient of on . The curves of maximal slope for with respect to this strong upper gradient coincide with weak solutions to (36) satisfying .
Proof. For a -absolutely continuous curve with with , the chain rule (44) together with the estimate (47) yields
i.e., is a strong upper gradient. Theorem 5.5 yields the identification of curves of maximal slope.
The dissipation functional can be related to the metric slope of the free energy under suitable conditions on .
Lemma 5.9. For any we have
Proof. We assume that satisfies , since otherwise there is nothing to prove.
Step 1. We show first that and with .
For this purpose, take any which vanishes in a neighbourhood of every node and put . As a consequence, the mapping defined by maps each edge into itself for sufficiently small. We claim that:
To show this, note that is injective for every small enough. Therefore, the change of variables formula yields that the density of with respect to satisfies
Consequently, with we have
Note that . Dividing by and letting and noting that and we deduce (51).
For sufficiently small, we then have
This estimate, together with (51), implies
Hence, the left-hand side defines an -bounded linear functional on a subspace of . Using the Hahn–Banach theorem we may extend this functional to . The Riesz representation theorem then yields a unique element such that and
for all as above.
Considering in particular functions supported on a single edge, we infer that and with . In particular, we have by Sobolev embedding.
Step 2. Next we show that belongs to . For this purpose we repeat the argument above, for a different class of functions .
Consider a pair of adjacent edges with common vertex . For the moment, we identify the concatenation of the two edges with the interval such that corresponds to . Let be a -function vanishing in a neighbourhood of and . In particular, maps into itself for sufficiently small, where . Let the maps and be defined on by and through the identification with and by setting and on all other edges. Repeating the argument from Step 1, we infer that with the identification of with as above. In particular, is continuous at . Since the choice of the pair is arbitrary, we conclude that .
Combining both steps, we infer that and
From Lemma 5.9 and the lower semicontinuity of we immediately infer that lies below the relaxed metric slope, i.e., the lower semicontinuous relaxation of given by
Corollary 3. For all we have .
5.4. Approximation via the JKO scheme
In this section, we consider the time-discrete variational approximation scheme of Jordan–Kinderlehrer–Otto for the gradient flow [14].
Given a time step and an initial datum with , we consider a sequence in defined recursively via
Then we build a discrete gradient flow trajectory as the piecewise constant interpolation given by
Then we have the following result.
Theorem 5.10. For any and with the variational scheme (53) admits a solution . As , for any family of discrete solutions there exists a sequence and a locally -absolutely continuous curve such that
Moreover, any such limit curve is a gradient flow curve of , i.e., a weak solution to the diffusion equation (36).
Proof of Theorem 5.10. The result basically follows from general results for metric gradient flows, where the scheme is known as the minimizing movement scheme; see [2,Section 2.3]. We consider the metric space and endow it with the weak topology . It follows that [2,Assumptions 2.1 (a,b,c)] are satisfied. Existence of a solution to the variational scheme (53) and of a subsequential limit curve now follows from [2,Corollary 2.2.2,Proposition 2.2.3]. Moreover, [2,Theorem 2.3.2] gives that the limit curve is a curve of maximal slope for the strong upper gradient , i.e.,
Thus, by Corollary 3, it is also a curve of maximal slope for the strong upper gradient . Theorem 5.5 yields the identification with weak solutions to (36).
Acknowledgments
ME acknowledges funding by the Deutsche Forschungsgemeinschaft (DFG), Grant SFB 1283/2 2021 – 317210226. DF and JM were supported by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 716117). JM also acknowledges support by the Austrian Science Fund (FWF), Project SFB F65. The work of DM was partially supported by the Deutsche Forschungsgemeinschaft (DFG), Grant 397230547. This article is based upon work from COST Action 18232 MAT-DYN-NET, supported by COST (European Cooperation in Science and Technology), www.cost.eu. We wish to thank Martin Burger and Jan-Frederik Pietschmann for useful discussions. We are grateful to the anonymous referees for their careful reading and useful suggestions.