We consider mean-field models for data–clustering problems starting from a generalization of the bounded confidence model for opinion dynamics. The microscopic model includes information on the position as well as on additional features of the particles in order to develop specific clustering effects. The corresponding mean–field limit is derived and properties of the model are investigated analytically. In particular, the mean–field formulation allows the use of a random subsets algorithm for efficient computations of the clusters. Applications to shape detection and image segmentation on standard test images are presented and discussed.
Citation: Michael Herty, Lorenzo Pareschi, Giuseppe Visconti. Mean field models for large data–clustering problems[J]. Networks and Heterogeneous Media, 2020, 15(3): 463-487. doi: 10.3934/nhm.2020027
Particle and kinetic models for consensus and cluster formation have appeared in the recent literature on self–organized socio–economic dynamical systems, such as opinion formation, flocking of birds or fish, elections and referendums under the influence of mass media, etc. See e.g. [5,11,18,19,20,23,25,34,16], the review articles [35,4,1] and the book [40]. A related research direction uses the consensus features of these models in an artificial way to solve problems of optimization or segmentation of data in large dimensions [28,41,15,29].
In this paper, we aim at formulating suitable models on the microscopic (or particle) level as well as on the mean–field (or kinetic) level to describe the partition of a large set of data into clusters. This problem is also known as the data clustering problem and is widely studied in applications like pattern recognition, shape detection and image segmentation. The proposed methods do not require the number of clusters to be fixed a priori, and clusters are characterized by small in–group and large out–group distances.
Since we are interested in models characterizing distance and qualitative features without additional physical or socio–economic modeling background, the proposed model generalizes the Hegselmann–Krause (HK) opinion dynamics model [25]. Originally, the HK model was proposed in a microscopic, one–dimensional, time–discrete setting. Several extensions exist; in the sequel we briefly review the basic model before discussing its extension towards image clustering problems.
To this aim, let us consider a group of $n$ interacting agents, each characterized by a time–dependent scalar state $x_i(t)$ which evolves according to
$$\frac{d}{dt}x_i(t) = \sum_{j=1}^{n} A_{ij}(t,\epsilon)\,\big(x_j(t)-x_i(t)\big), \qquad i=1,\dots,n, \tag{1}$$
where the interaction kernel is defined by
$$A_{ij}(t,\epsilon) := \begin{cases} \dfrac{1}{\sigma_i}, & \text{if } j\in N_i(t,\epsilon) \\[2pt] 0, & \text{otherwise} \end{cases} \tag{2}$$
with
$$N_i(t,\epsilon) := \big\{ j\in\{1,\dots,n\} : |x_i(t)-x_j(t)|\le\epsilon \big\}, \qquad i=1,\dots,n, \tag{3}$$
defining the neighborhood of the $i$–th agent for a given confidence level $\epsilon>0$, and with
$$\sigma_i := \begin{cases} |N_i(t,\epsilon)|, \\ n, \end{cases} \tag{4}$$
determining the type of interactions. Precisely, when $\sigma_i = |N_i(t,\epsilon)|$ the interactions are non–symmetric, since agent $j$ may influence agent $i$ without the converse being true, whereas the choice $\sigma_i = n$ yields symmetric interactions.
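To make the dynamics concrete, the following minimal Python sketch integrates (1)–(4) with an explicit Euler step; the function name, number of agents, time step, confidence level and iteration count are illustrative choices, not values prescribed in the text.

```python
import numpy as np

def hk_step(x, eps, dt, symmetric=False):
    """One explicit Euler step of the 1D Hegselmann-Krause dynamics (1)-(4)."""
    n = len(x)
    # bounded-confidence neighborhoods N_i, eq. (3); each agent is its own neighbor
    close = np.abs(x[:, None] - x[None, :]) <= eps
    # scaling sigma_i, eq. (4): |N_i| (non-symmetric) or n (symmetric)
    sigma = np.full(n, n) if symmetric else close.sum(axis=1)
    # right-hand side of (1) with kernel (2)
    dx = (close * (x[None, :] - x[:, None])).sum(axis=1) / sigma
    return x + dt * dx

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 200)          # illustrative initial states
for _ in range(2000):                    # illustrative time horizon
    x = hk_step(x, eps=0.05, dt=0.05)
print(np.unique(np.round(x, 3)))         # approximate cluster locations
```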
Several works have been proposed in the literature which analyze the properties of the HK model. For instance, for the analysis in the time discrete setting we refer to [27]. In [7] it is proven that, during the evolution of the system (1), the order of the states is preserved. Thanks to the definition of the interaction kernel (2)–(4), if $C(t)\subseteq\{1,\dots,n\}$ denotes a cluster at time $t$, i.e. a maximal set of agents with pairwise distances within the confidence level, then
$$A_{ij}(t,\epsilon)\neq 0 \ \text{ for all } i,j\in C(t), \qquad A_{ij}(t,\epsilon)=0 \ \text{ whenever } i\in C(t),\ j\notin C(t).$$
In [7,22,30] the stability of the dynamical model is investigated. In particular, the fact that the system converges to a steady profile in finite time is proved in [7]. For further results on the one–dimensional local and symmetric model we refer also to [8]. We point out that the behavior of cluster formation in the transient regime is also of interest in the mathematical literature [21].
In [36,10], the one–dimensional Hegselmann–Krause model is generalized to the case of a multi–dimensional data–set. Subsequently, the multi–dimensional HK model has been used as a technique to cluster large amounts of data into a small number of subsets with common features [38], and its performance has been compared with that of the $k$–means algorithm.
Here, we introduce a generalization of the multi–dimensional formulation of the HK model to solve data clustering problems. This amounts to taking into account clustering with respect to different features. We deal with data having both time dependent and static features. The latter describe intrinsic properties of a datum, such as a measure of the trustworthiness of information or the color intensity of pixels in images. We derive the corresponding mean–field limit and investigate analytically the properties of the model. In particular, following [2], the mean–field formulation allows the use of a random subset algorithm for efficient computation of the clusters.
The rest of the manuscript is organized as follows. The microscopic model is introduced at the beginning of Section 2 and briefly discussed in Section 2.1. The proposed model is still a microscopic model, and we describe the case of large data sets using a mean–field equation in Section 3. Analytical properties of the kinetic equation, such as a priori estimates on the evolution of the moments and a characterization of the limit distribution, are discussed in Section 3.1. Numerical evidence of the theoretical results is provided in Section 4.1. Further, we propose applications to detection and compression of data, such as shape detection in Section 4.2 and image segmentation in Section 4.3. We finally conclude with some remarks and future research directions in Section 5.
Each particle $i$ is characterized by a time–dependent state $x_i(t)\in\mathbb{R}^{d_1}$ and by a vector of static features $c_i\in\mathbb{R}^{d_2}$.
As in the HK model we define the neighborhood of the particles by
$$N_i(t,\epsilon_1,\epsilon_2) := \big\{ j\in\{1,\dots,n\} : \|x_i(t)-x_j(t)\|_{\mathbb{R}^{d_1}}\le\epsilon_1,\ \|c_i-c_j\|_{\mathbb{R}^{d_2}}\le\epsilon_2 \big\}, \tag{5}$$
for given confidence levels $\epsilon_1,\epsilon_2>0$, and the corresponding interaction kernel
$$A_{ij}(t,\epsilon_1,\epsilon_2) := \begin{cases} \dfrac{1}{\sigma_i}, & \text{if } j\in N_i(t,\epsilon_1,\epsilon_2) \\[2pt] 0, & \text{otherwise} \end{cases} \tag{6}$$
with $\sigma_i$ as in (4). The microscopic model then reads
$$c_{i,k}(t) = c_{i,k}(0), \qquad i=1,\dots,n, \quad k=1,\dots,d_2, \tag{7}$$
$$\frac{d}{dt}x_{i,k}(t) = \sum_{j=1}^{n} A_{ij}(t,\epsilon_1,\epsilon_2)\,\big(x_{j,k}(t)-x_{i,k}(t)\big), \qquad i=1,\dots,n, \quad k=1,\dots,d_1, \tag{8}$$
and initial conditions $x_i(0)=x_i^0\in\mathbb{R}^{d_1}$, $c_i(0)=c_i^0\in\mathbb{R}^{d_2}$.
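A sketch of the generalized dynamics under the same explicit Euler discretization used above (the function name and all parameter choices are illustrative): positions $x_i\in\mathbb{R}^{d_1}$ evolve according to (8), the features $c_i\in\mathbb{R}^{d_2}$ remain frozen as required by (7), and the boolean mask implements the double bounded–confidence neighborhood (5).

```python
import numpy as np

def step(x, c, eps1, eps2, dt, symmetric=False):
    """One explicit Euler step of (7)-(8); x: (n, d1) dynamic, c: (n, d2) static."""
    n = x.shape[0]
    near_x = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2) <= eps1
    near_c = np.linalg.norm(c[:, None, :] - c[None, :, :], axis=2) <= eps2
    close = near_x & near_c                                    # neighborhood (5)
    sigma = np.full(n, n) if symmetric else close.sum(axis=1)  # scaling (4)
    dx = (close[:, :, None] * (x[None, :, :] - x[:, None, :])).sum(axis=1)
    return x + dt * dx / sigma[:, None]                        # c never changes, eq. (7)
```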
Existence and convergence of solutions to system (8) can be established by using the same techniques as in [8], where the original HK model was analyzed in the one–dimensional case. In fact, system (7)–(8) belongs to the same class of state–switched systems.
In the case of symmetric interactions we can recover from the previous model moment results similar to those presented for the HK model in [8,26]. We only record the results here, since the proofs are slight variations of existing ones (see for example [8]).
Define the moments
$$m_1(t) := \sum_{i=1}^{n} x_i(t), \qquad m_2(t) := \sum_{i=1}^{n} x_i(t)\otimes x_i(t), \tag{9}$$
then the following result holds true.
Lemma 2.1. Let the interactions be symmetric, i.e. $\sigma_i = n$ in (6), and let $x_i(t)$ solve (7)–(8). Then
$$m_1(t) = m_1(0), \qquad \frac{d}{dt}\big(m_2(t)\big)_{kk}\le 0, \qquad k=1,\dots,d_1.$$
Corollary 1. Let the interactions be symmetric and global, i.e. $\sigma_i = n$ and $\epsilon_1,\epsilon_2$ large enough that $N_i(t,\epsilon_1,\epsilon_2)=\{1,\dots,n\}$ for all $i$. Then
$$\frac{d}{dt}\big(m_2(t)\big)_{k\ell} = \frac{2}{n}\big(m_1(0)\big)_k\big(m_1(0)\big)_\ell - 2\big(m_2(t)\big)_{k\ell}, \qquad \lim_{t\to\infty}\big(m_2(t)\big)_{k\ell} = \frac{\big(m_1(0)\big)_k\big(m_1(0)\big)_\ell}{n}.$$
Later, we will show that similar results hold for the continuous model. Some remarks on further properties concerning the particle model (8) are in order.
Remark 1. ● Formation of clusters in the large time behavior is extensively investigated in the review article [35] for general systems of the form (8). However, the number of clusters cannot be predicted a priori starting from a given initial configuration.
● In the non–symmetric case, conservation of the first moment does not hold true. Also, in general, it is not possible to show the decay of the second moment although clustering still appears, see [35].
● Extensions of the model to take into account non–static features can be obtained by including an interaction term on the right hand side of (7). The determination of such an interaction term, however, would be rather problem dependent, and in this paper we do not explore this direction further.
In the case of many particles we derive the formal mean–field equation. Let $f_n$ be the empirical measure associated with the particle system (7)–(8), namely
$$f_n(t,x,c) := \frac{1}{n}\sum_{i=1}^{n}\delta\big(x-x_i(t)\big)\,\delta\big(c-c_i(t)\big).$$
Let us consider a smooth, compactly supported test function $\varphi=\varphi(x,c)$ on $\Omega$. Then
$$\begin{aligned} \frac{d}{dt}\langle f_n(t),\varphi\rangle &= \frac{1}{n}\sum_{i=1}^{n}\frac{d}{dt}\varphi\big(x_i(t),c_i(t)\big) = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{\sigma_i}\sum_{j\in N_i(t,\epsilon_1,\epsilon_2)}\nabla_x\varphi\big(x_i(t),c_i(t)\big)\cdot\big(x_j(t)-x_i(t)\big) \\ &= \Big\langle f_n(t),\ \frac{1}{n\,\sigma(t,x,c)}\sum_{j=1}^{n}\chi_{\epsilon_1}(\|x_j(t)-x\|)\,\chi_{\epsilon_2}(\|c_j(t)-c\|)\,\big(x_j(t)-x\big)\cdot\nabla_x\varphi \Big\rangle \end{aligned}$$
where we defined
$$\chi_{\epsilon}(x) = \begin{cases} 1, & x\le\epsilon \\ 0, & \text{else} \end{cases}$$
and we used the fact that equation (8) can be re-written as
$$\frac{d}{dt}x_i(t) = \frac{1}{n\,\sigma(t,x_i(t),c_i(t))}\sum_{j=1}^{n}\chi_{\epsilon_1}(\|x_j(t)-x_i(t)\|)\,\chi_{\epsilon_2}(\|c_j(t)-c_i(t)\|)\,\big(x_j(t)-x_i(t)\big),$$
with
$$\sigma\big(t,x_i(t),c_i(t)\big) = \frac{1}{n}\sum_{j=1}^{n}\chi_{\epsilon_1}(\|x_j(t)-x_i(t)\|)\,\chi_{\epsilon_2}(\|c_j(t)-c_i(t)\|).$$
We can easily compute
$$\sigma(t,x,c) = \frac{1}{n}\sum_{j=1}^{n}\chi_{\epsilon_1}(\|x_j(t)-x\|)\,\chi_{\epsilon_2}(\|c_j(t)-c\|) = \int_{\Omega}\chi_{\epsilon_1}(\|y-x\|)\,\chi_{\epsilon_2}(\|z-c\|)\,f_n(t,y,z)\,dz\,dy,$$
and similarly
$$\frac{1}{n\,\sigma(t,x,c)}\sum_{j=1}^{n}\chi_{\epsilon_1}(\|x_j(t)-x\|)\,\chi_{\epsilon_2}(\|c_j(t)-c\|)\,\big(x_j(t)-x\big) = \frac{1}{\sigma(t,x,c)}\int_{\Omega}\chi_{\epsilon_1}(\|y-x\|)\,\chi_{\epsilon_2}(\|z-c\|)\,(y-x)\,f_n(t,y,z)\,dz\,dy.$$
Collecting these formal computations, after integration by parts in $x$ we obtain
$$\frac{d}{dt}\langle f_n,\varphi\rangle + \Big\langle \nabla_x\cdot\Big(f_n(t,x,c)\int_{\Omega}\frac{\chi_{\epsilon_1}(\|y-x\|)\,\chi_{\epsilon_2}(\|z-c\|)}{\sigma(t,x,c)}\,(y-x)\,f_n(t,y,z)\,dz\,dy\Big),\,\varphi\Big\rangle = 0.$$
If we now define a kernel
$$A_{\epsilon_1,\epsilon_2}(t,x,c,y,z) = \frac{\chi_{\epsilon_1}(\|y-x\|)\,\chi_{\epsilon_2}(\|z-c\|)}{\sigma(t,x,c)}, \tag{10}$$
and
$$V_{\epsilon_1,\epsilon_2}(t,x,c) = \int_{\Omega} A_{\epsilon_1,\epsilon_2}(t,x,c,y,z)\,(y-x)\,f(t,y,z)\,dz\,dy, \tag{11}$$
in the limit $n\to\infty$ we formally obtain the mean–field kinetic equation
$$\partial_t f(t,x,c) + \nabla_x\cdot\big(V_{\epsilon_1,\epsilon_2}(t,x,c)\,f(t,x,c)\big) = 0. \tag{12}$$
Rigorous analytical results on convergence in the limit $n\to\infty$ are not addressed here, and the previous derivation is intended in the formal sense.
We briefly discuss the relation to similar kinetic models.
In [9] a homogeneous kinetic model for opinion dynamics under bounded confidence is analyzed. Therein, the model is derived by using a Boltzmann–like approach with instantaneous binary interactions describing compromise. An analogous derivation of the kinetic equation (12) is also possible using a binary interaction model based on an explicit Euler discretization of the underlying particle dynamics (8) with a small time step playing the role of the interaction strength.
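For instance, a binary compromise rule of the following form would be consistent with such a discretization (a sketch under our own scaling assumption, with $\gamma>0$ playing the role of the Euler time step; the precise rule is not reported in this section):

$$x' = x + \gamma\,\chi_{\epsilon_1}(\|y-x\|)\,\chi_{\epsilon_2}(\|z-c\|)\,(y-x), \qquad y' = y + \gamma\,\chi_{\epsilon_1}(\|y-x\|)\,\chi_{\epsilon_2}(\|z-c\|)\,(x-y),$$

where the features $c$ and $z$ remain unchanged. In the quasi–invariant limit $\gamma\to 0$, with time suitably rescaled, the resulting Boltzmann–like dynamics formally relaxes to the mean–field equation (12).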
In [9] the authors prove the weak convergence of the solution to a convex combination of Dirac delta functions. In [13] a similar asymptotic distribution is found for the mean–field limit of the classical Hegselmann–Krause model.
A further symmetric clustering model, with interactions weighted with respect to a fixed number of closest neighbors, and the corresponding mean–field limit have been introduced in [3]. Moments and long–time behavior have also been studied therein. Finally, models for other applications are also able to cluster information, e.g. in traffic flow modeling [42], where the physical acceleration plays the role of the bounded confidence.
As a preliminary remark we observe that the marginal distribution $\tilde{f}_c(t,c) := \int f(t,x,c)\,dx$ is conserved in time, since (12) is a transport equation acting on the $x$ variable only.
As in the discrete case, we define the moments of order $p$ of the kinetic density as
$$\langle x^{\alpha}\rangle(t) = \int_{\Omega} x^{\alpha}\,f(t,x,c)\,dx\,dc, \qquad |\alpha| = p\in\mathbb{N},\ \alpha_i\in\mathbb{N}.$$
Here, we used the multi–index notation $x^{\alpha} = \prod_{i=1}^{d_1} x_i^{\alpha_i}$ with $|\alpha| = \sum_i \alpha_i$. In particular, we call
$$u_k(t) = \langle x^{\alpha}\rangle(t), \qquad |\alpha|=1,\ \alpha_i=\delta_{ik},\ i=1,\dots,d_1,$$
$$E_{kj}(t) = \langle x^{\alpha}\rangle(t), \qquad |\alpha|=2,\ \alpha_i=\delta_{ik}+\delta_{ij},\ i=1,\dots,d_1,$$
the first moment and the second moment, respectively. Then, for a symmetric kernel $A_{\epsilon_1,\epsilon_2}$, the following result holds true.
Lemma 3.1. Let $f$ be a weak solution of (12) with a symmetric kernel (10). Then
$$\frac{d}{dt}u_k(t) = 0, \qquad \frac{d}{dt}E_{kk}(t)\le 0, \qquad |E_{kj}(t)|\le\frac{1}{2}\big(E_{kk}+E_{jj}\big)(0)$$
for each $k,j=1,\dots,d_1$.
Proof. For each fixed $k$, multiplying (12) by $x_k$, integrating by parts and symmetrizing, we compute
$$\frac{d}{dt}u_k(t) = -\int_{\Omega} x_k\,\nabla_x\cdot\big(V_{\epsilon_1,\epsilon_2}(t,x,c)\,f(t,x,c)\big)\,dx\,dc = \int_{\Omega}\int_{\Omega}\chi_{\epsilon_1}(\|y-x\|)\,\chi_{\epsilon_2}(\|z-c\|)\,(y_k-x_k)\,f(t,y,z)\,f(t,x,c)\,dy\,dz\,dx\,dc = 0.$$
The last term vanishes due to the anti–symmetry of the integrand under the interchange of variables $(x,c)\leftrightarrow(y,z)$. Similarly, for the diagonal second moments,
$$\frac{d}{dt}E_{kk}(t) = -\int_{\Omega} x_k^2\,\nabla_x\cdot\big(V_{\epsilon_1,\epsilon_2}(t,x,c)\,f(t,x,c)\big)\,dx\,dc = -\int_{\Omega}\int_{\Omega}\chi_{\epsilon_1}(\|y-x\|)\,\chi_{\epsilon_2}(\|z-c\|)\,(y_k-x_k)^2\,f(t,y,z)\,f(t,x,c)\,dy\,dz\,dx\,dc \le 0.$$
Hence, we obtain
$$0\le|E_{kj}(t)|\le\int_{\Omega}|x_k x_j|\,f(t,x,c)\,dx\,dc\le\frac{1}{2}\big(E_{kk}+E_{jj}\big)(t)\le\frac{1}{2}\big(E_{kk}+E_{jj}\big)(0).$$
Instead, in the case of global interactions, i.e. $\chi_{\epsilon_1}\equiv\chi_{\epsilon_2}\equiv 1$ on $\Omega$, and for a general kernel, the moments can be characterized explicitly.
Corollary 2. Assume that the interactions are global, i.e. $\chi_{\epsilon_1}\equiv\chi_{\epsilon_2}\equiv 1$, and let $f$ be a weak solution of (12). Then
$$u_k(t) = u_k(0) \quad \forall\, t\ge 0 \qquad\text{and}\qquad E_{kj}(t)\ \xrightarrow{\ t\to\infty\ }\ u_k(0)\,u_j(0).$$
Proof. Multiplying (12) by $x^{\alpha}$ and integrating by parts, using that in the global case $V_{\epsilon_1,\epsilon_2}(t,x,c) = u(t)-x$, yields
$$\frac{d}{dt}\langle x^{\alpha}\rangle(t) = -p\,\langle x^{\alpha}\rangle(t) + \sum_{\ell=1}^{d_1}\alpha_\ell\,u_\ell(t)\,\langle x^{\alpha(\ell)}\rangle(t),$$
where $\alpha(\ell)$ denotes the multi–index obtained from $\alpha$ by decreasing its $\ell$–th component by one. For $p=1$ this gives
$$\frac{d}{dt}u_k(t) = -u_k(t) + u_k(t) = 0, \qquad k=1,\dots,d_1.$$
For $p=2$ we obtain
$$\frac{d}{dt}E_{kj}(t) = -2E_{kj}(t) + 2u_k(t)\,u_j(t) = -2E_{kj}(t) + 2u_k(0)\,u_j(0), \qquad k,j=1,\dots,d_1,$$
and therefore
$$E_{kj}(t) = E_{kj}(0)\,e^{-2t} + u_k(0)\,u_j(0)\,\big(1-e^{-2t}\big)\ \xrightarrow{\ t\to\infty\ }\ u_k(0)\,u_j(0), \qquad k,j=1,\dots,d_1.$$
Observe that, compared to the local interaction case analyzed in Lemma 3.1, where in general the decay of all second order moments is not guaranteed, in the case of global interactions the mixed second order moments also decay in time. In other words, while in the local interaction model only the variances go to zero, in the global interaction model both variances and covariances go to zero in the large time behavior.
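The explicit relaxation of Corollary 2 is easy to verify numerically: for global interactions the characteristics of (12) reduce to $\dot{x} = u(0)-x$, so a particle ensemble should reproduce the closed–form decay of $E_{kj}$. A minimal Python sketch follows; the sample size, time step and final time are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, dt, steps = 5000, 1e-3, 4000                 # illustrative sizes
x = rng.uniform(-1.0, 1.0, (n, 2))              # samples from an initial density f_0
u0 = x.mean(axis=0)                             # first moment u(0)
E0 = (x[:, 0] * x[:, 1]).mean()                 # mixed second moment E_12(0)

for _ in range(steps):
    x += dt * (x.mean(axis=0) - x)              # global interactions: V = u(t) - x

t = steps * dt
predicted = E0 * np.exp(-2 * t) + u0[0] * u0[1] * (1 - np.exp(-2 * t))
print(predicted, (x[:, 0] * x[:, 1]).mean())    # agree up to O(dt) and sampling error
```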
Lemma 3.1 and Corollary 2 suggest that for large times the solution of (12) concentrates in the space variable. The following result characterizes the steady states as convex combinations of Dirac distributions.
Lemma 3.2. Let $f^{\infty}$ be of the form
$$f^{\infty}(x,c) = \sum_{k=1}^{n_1} f_k\,\delta(x-x_k)\sum_{\ell=1}^{n_2(k)}\delta(c-c_{k\ell}), \tag{13}$$
with $n_1$ spatial clusters located at $x_k$, each carrying $n_2(k)$ feature values $c_{k\ell}$, and weights $f_k>0$. Then $f^{\infty}$ is a weak steady state of (12) if and only if, for each pair of distinct clusters, $\|x_k-x_i\|>\epsilon_1$ or $\|c_{k\ell}-c_{ij}\|>\epsilon_2$.
Proof. Let us assume that, for each pair of distinct clusters, at least one of the two separation conditions holds. Then, for any test function $\varphi$, we need to show that
$$\int_{\Omega}\varphi(x,c)\,\nabla_x\cdot\big(V_{\epsilon_1,\epsilon_2}(t,x,c)\,f^{\infty}(x,c)\big)\,dx\,dc = 0.$$
We obtain
$$\begin{aligned}\int_{\Omega}\varphi(x,c)\,\nabla_x\cdot\big(V_{\epsilon_1,\epsilon_2}\,f^{\infty}\big)\,dx\,dc &= -\sum_{k=1}^{n_1} f_k\sum_{\ell=1}^{n_2(k)} V_{\epsilon_1,\epsilon_2}(t,x_k,c_{k\ell})\cdot\nabla_x\varphi(x,c)\big|_{x=x_k} \\ &= -\sum_{k=1}^{n_1} f_k^2\sum_{\ell=1}^{n_2(k)}\frac{x_k-x_k}{\sigma(x_k,c_{k\ell})}\cdot\nabla_x\varphi(x)\big|_{x=x_k} = 0.\end{aligned}$$
Conversely, assume that there exist two distinct clusters with $\|x_k-x_i\|\le\epsilon_1$ and $\|c_{k\ell}-c_{ij}\|\le\epsilon_2$. Then we can choose a test function $\varphi$ such that
$$\int_{\Omega}\varphi(x,c)\,\nabla_x\cdot\big(V_{\epsilon_1,\epsilon_2}\,f^{\infty}\big)\,dx\,dc = -f_k f_i\,\frac{x_k-x_i}{\sigma(x_k,c_{k\ell})}\cdot\nabla_x\varphi(x,c)\big|_{x=x_k} - f_i f_k\,\frac{x_i-x_k}{\sigma(x_i,c_{ij})}\cdot\nabla_x\varphi(x,c)\big|_{x=x_i} \neq 0,$$
which contradicts the hypothesis.
Lemma 3.3. Assume that the interactions are global, i.e. $\chi_{\epsilon_1}\equiv\chi_{\epsilon_2}\equiv 1$. Then the weak steady state of (12) is given by
$$f^{\infty}(x,c) = \delta(x-\bar{x})\,\tilde{f}_c(0,c) = \prod_{k=1}^{d_1}\delta(x_k-\bar{x}_k)\,\tilde{f}_c(0,c),$$
with $\bar{x} = \int_{\Omega} y\,f_0(y,z)\,dz\,dy$ the (conserved) first moment of the initial distribution $f_0$.
Proof. Substituting the expression of $f^{\infty}$ into (11), which in the global case reduces to
$$V_{\epsilon_1,\epsilon_2}(t,x,c) = \int_{\Omega} y\,f(t,y,z)\,dz\,dy - x = \int_{\Omega} y\,f_0(y,z)\,dz\,dy - x,$$
we obtain
$$\int_{\Omega}\varphi(x,c)\,\nabla_x\cdot\big(V_{\epsilon_1,\epsilon_2}\,f^{\infty}\big)\,dx\,dc = \int \tilde{f}_c(0,c)\,\Big(\bar{x}-\int_{\Omega} y\,f_0(y,z)\,dz\,dy\Big)\cdot\nabla_x\varphi(x,c)\Big|_{x=\bar{x}}\,dc,$$
which vanishes for all test functions if and only if $\bar{x} = \int_{\Omega} y\,f_0(y,z)\,dz\,dy$.
Remark 2. As a direct consequence of Lemma 3.2 and Lemma 3.3 we have that taking the confidence levels $\epsilon_1,\epsilon_2$ large enough leads to consensus, i.e. to a single cluster located at the mean of the initial data.
Remark 3. In Lemma 3.2 the number of clusters $n_1$, their locations and their weights are not prescribed: they depend on the initial distribution and on the confidence levels $\epsilon_1,\epsilon_2$.
The theoretical results on the asymptotic behavior of the mean–field equation (12) introduced in the previous section are now also investigated numerically. Moreover, we show the efficiency of the model as a technique to solve realistic data–clustering problems; to this end we focus on applications in the fields of shape detection and image segmentation.
In order to efficiently solve the kinetic model (12) we employ the Mean Field Interaction Algorithm introduced in [2], which is based on a random subset evaluation of the kernel term (11).
Algorithm 1 Mean Field Interaction Algorithm for the kinetic equation (12). |
1: Given $N$ samples $(x_i^0, c_i)$, $i=1,\dots,N$, drawn from the initial distribution $f_0$; a random subset size $M\le N$; a time step $\Delta t$ and a number of time steps $m_{tot}$;
2: for $m=0$ to $m_{tot}-1$ do
3: for $i=1$ to $N$ do
4: sample $M$ particles $(x_{j_1}^m,c_{j_1}),\dots,(x_{j_M}^m,c_{j_M})$ uniformly without repetition;
5: compute the velocity $V_i^m$ by evaluating the kernel term (11) on the sampled subset;
6: compute the data change $x_i^{m+1} = x_i^m + \Delta t\,V_i^m$;
7: end for
8: end for
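A possible realization of Algorithm 1 in Python is sketched below. It assumes, as in the random subset strategy of [2], that at each step and for each sample a batch of $M$ particles is drawn uniformly without replacement, and that $\sigma$ in (10) is estimated on the same batch; the function name and all parameter choices are illustrative.

```python
import numpy as np

def mfia(x, c, eps1, eps2, dt, steps, M, rng):
    """Mean Field Interaction Algorithm: random-subset evaluation of (11)."""
    N = x.shape[0]
    for _ in range(steps):
        x_new = x.copy()
        for i in range(N):
            j = rng.choice(N, size=M, replace=False)       # step 4: random subset
            w = (np.linalg.norm(x[j] - x[i], axis=1) <= eps1) \
              & (np.linalg.norm(c[j] - c[i], axis=1) <= eps2)
            sigma = w.mean()                               # Monte Carlo estimate of (10)
            if sigma > 0:                                  # step 5: velocity (11)
                V = (w[:, None] * (x[j] - x[i])).mean(axis=0) / sigma
                x_new[i] = x[i] + dt * V                   # step 6: data change
        x = x_new
    return x
```

The cost per time step is $O(NM)$ rather than the $O(N^2)$ of the full kernel evaluation, which is what makes the mean–field formulation attractive for large data sets.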
We use the Mean Field Interaction Algorithm to numerically investigate the properties of the kinetic model established in the previous section. We analyze two typical situations, which lead to the two applications shown later in this section.
First, we consider an initial distribution which is uniform with respect to both the dynamic variable $x$ and the static feature $c$.
One–dimension. The numerical steady–states of the mean–field equation (12) are first computed in the one–dimensional case and compared to the steady–states of the microscopic model (8). Observe that, under the assumption of an initial distribution which is constant along $c$, the feature variable plays no role and the dynamics reduces to clustering in $x$ only.
In Figure 1 we show the steady–state provided by the mean–field kinetic model (12) for a one–dimensional initial uniform distribution on the interval $[0,1]$.
Figure 1. Trend to the steady–state of the one–dimensional Hegselmann–Krause model (1).
As observed in Figure 2, for both values of the bounded confidence level we have conservation of the first moment and energy dissipation in time. Although we are using a non–symmetric model, the first moment is preserved during the time evolution, up to small deviations due to the stochastic nature of the algorithm.
In Figure 3 we show some results of a simple analysis of Algorithm 1 with respect to the number of samples and the size of the random subset.
Two–dimensions. We also show one example of the numerical steady–states of the mean–field equation (12) in the two–dimensional case. Again we employ Algorithm 1. Figure 4 shows the steady–state for the particle density and the kinetic density at the final computation time.
Next, we consider an initial distribution which is non–constant along the feature variable $c$, namely
$$f_0(x,c) = \chi_{[0,1]}(x)\,\frac{1}{\sqrt{2\pi\sigma^2}}\,\exp\Big(-\frac{(c-\mu)^2}{2\sigma^2}\Big),$$
with mean $\mu$ and variance $\sigma^2$.
Figure 6. Top row: particles and kinetic density at initial time (left plot) and at equilibrium (right plot). Bottom row: at left, analysis of the distances between clusters.
In this experiment, we expect that the behavior at equilibrium is influenced also by the interactions with respect to the static feature variable, due to the corresponding characteristic function in the kernel (10). The results are shown in Figure 6.
In Figure 7 we show the behavior at equilibrium obtained with two different pairs of confidence levels.
The property of equation (12), analytically studied in Section 3.1 and numerically investigated in Section 4.1, of having quantized steady–states makes the model suitable for solving data clustering problems. In particular, in this section we use the kinetic model as a technique for shape detection.
Shape detection can be considered as a branch of pattern recognition that focuses on the discovery of previously unknown patterns in a big set of data. Machine learning and many clustering algorithms, such as the $k$–means algorithm, have already been applied to this class of problems.
More specifically, in this section we apply the models introduced in Section 2 to an example of character recognition. We consider a letter "A" described by a set $L$ of exact points, perturbed as
$$\tilde{x}_i = x_i + \alpha\,\theta_i, \qquad \alpha>0, \qquad \theta_i\sim\mathcal{U}(-1,1)\ \text{ or }\ \theta_i\sim\mathcal{N}(0,1),$$
which define the noisy pattern. Here, the value of $\alpha$ measures the strength of the noise.
The goal of this example is to cluster the noisy data into a set of points which gives information on the shape of the exact pattern to be detected. This can also be considered as a dimensionality reduction problem, which could help other algorithms to recognize the unknown pattern efficiently by using information on the position of the clusters. Here, the quality of the results is studied by means of the following measure:
$$E(\epsilon_1,\epsilon_2,\alpha) = \frac{1}{\tilde{n}}\sum_{k=1}^{\tilde{n}}\,\min_{x\in L}\,\|C_k-x\|_2,$$
where $\tilde{n}$ is the number of clusters at the steady state, $C_k$ denotes the position of the $k$–th cluster, and $L$ is the set of points of the exact pattern.
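Assuming the clusters are extracted from the numerically collapsed steady state by merging points up to a small tolerance (an illustrative post–processing choice, not specified in the text), the noisy data and the error measure can be produced as follows; all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def noisy_pattern(L, alpha, dist="uniform"):
    """Perturb the exact points of the letter: x~ = x + alpha * theta."""
    theta = rng.uniform(-1, 1, L.shape) if dist == "uniform" else rng.normal(0, 1, L.shape)
    return L + alpha * theta

def detection_error(x_steady, L, tol=1e-3):
    """E: mean distance of the cluster centers C_k to the closest exact point in L."""
    centers = np.unique(np.round(x_steady / tol) * tol, axis=0)   # merge collapsed points
    d = np.linalg.norm(centers[:, None, :] - L[None, :, :], axis=2).min(axis=1)
    return d.mean(), len(centers)                                 # error E and cluster count
```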
In the following, all the results are given for an initial set of noisy data points generated from the exact pattern as described above.
In Table 1 and Table 2 we show a sensitivity study of the error $E$ and of the number of clusters with respect to the bounded confidence value and the noise strength $\alpha$, for uniformly and normally distributed noise, respectively.
Table 1. Number of clusters and error $E$ depending on the bounded confidence value $\epsilon_1$, for three increasing percentages of uniformly distributed noise (one column group per noise level).

| $\epsilon_1$ | $E$ | clusters | $\epsilon_1$ | $E$ | clusters | $\epsilon_1$ | $E$ | clusters |
|---|---|---|---|---|---|---|---|---|
| 0.03 | 1.25e-02 | 30 | 0.03 | 3.47e-02 | 51 | 0.06 | 2.64e-02 | 11 |
| 0.04 | 4.10e-03 | 16 | 0.05 | 1.21e-02 | 14 | 0.07 | 1.48e-02 | 8 |
| 0.05 | 4.00e-03 | 12 | 0.07 | 7.70e-03 | 8 | 0.08 | 1.12e-02 | 8 |
| 0.06 | 4.60e-03 | 9 | 0.09 | 7.90e-03 | 8 | 0.09 | 1.63e-02 | 5 |
| 0.07 | 5.40e-03 | 8 | 0.11 | 1.66e-02 | 3 | 0.10 | 1.63e-02 | 5 |
Table 2. Number of clusters and error $E$ depending on the bounded confidence value $\epsilon_1$, for three increasing percentages of normally distributed noise (one column group per noise level).

| $\epsilon_1$ | $E$ | clusters | $\epsilon_1$ | $E$ | clusters | $\epsilon_1$ | $E$ | clusters |
|---|---|---|---|---|---|---|---|---|
| 0.05 | 4.44e-02 | 23 | 0.05 | 4.73e-02 | 24 | 0.05 | 6.37e-02 | 30 |
| 0.06 | 1.36e-02 | 11 | 0.06 | 2.62e-02 | 13 | 0.06 | 4.16e-02 | 16 |
| 0.065 | 6.40e-03 | 9 | 0.065 | 1.63e-02 | 11 | 0.07 | 2.12e-02 | 9 |
| 0.07 | 6.70e-03 | 7 | 0.0675 | 7.40e-03 | 10 | 0.075 | 9.70e-03 | 7 |
| 0.08 | 8.50e-03 | 7 | 0.07 | 8.00e-03 | 9 | 0.08 | 9.20e-03 | 7 |
| 0.09 | 1.00e-02 | 6 | 0.08 | 9.80e-03 | 7 | 0.085 | 1.10e-02 | 5 |
Some of these cases are shown in Figure 8 and Figure 9. Precisely, for the uniformly distributed initial noisy data (Figure 8) we consider the largest percentage of noise reported in Table 1.
Similar considerations hold for Figure 9, where we show the noisy initial data obtained with normally distributed noise.
We point out that the goal of this application is to provide a preliminary step for shape feature extraction. This technique should be coupled with a further algorithm which is able to select the correct letter from a given alphabet. The dimensionality reduction provided by the data clustering approach allows the subsequent recognition analysis to be employed efficiently.
Remark 4 (Shape detection with non–constant static feature). In the previous examples the static feature is taken constant for all the data. In the non–constant case, one could think of this feature as an additional given input which measures the quality of the information carried by each single point, e.g. modeled by the distance to its exact value in the pattern $L$.
We now turn to the second application of the data clustering model (12), which is the segmentation of gray scale images. Image segmentation is widely used in medical and astronomical image processing, face recognition, etc. In fact, this technique makes it possible to partition an image into sets of significant regions of pixels sharing the same characteristics, such as closeness and similar color intensity. As in shape detection, the aim of such an approach is to transform the description of an image into a structure which is easier to analyze for subsequent computer vision algorithms.
Several mathematical techniques have been applied to image segmentation. For instance, we mention level set methods [39,43], methods based on the Kuramoto model [29,37] and supervised convolutional neural networks [17,46]. A further procedure for the segmentation of images is the clustering approach, such as the $k$–means algorithm.
Here, we study the efficiency of the kinetic model (12) in solving image segmentation problems, and therefore we still rely on the clustering technique. Thanks to the model introduced in this paper, segmentation can be performed by using two levels of clustering in order to determine regions of pixels with similar characteristics. More precisely, we cluster based on the Euclidean distance between pixels and on the distance of their gray intensities. Taking into account spatial coordinates is needed in order to avoid selecting homogeneous regions of pixels which are however distinct in the original image. The clustering with respect to the intensity of the gray colors is performed by using the static feature variable $c$.
The application of (12) works as follows. The number of pixels is the number of particles, which are assumed to be equally spaced samples. The gray intensity of each pixel is computed at the initial time and defines the one–dimensional static feature $c_i$.
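A sketch of this pipeline is reported below, reusing the `step` function of Section 2 (Algorithm 1 would replace it for large images, since the full evaluation costs $O(n^2)$ per step). Pixel coordinates are normalized to $[0,1]^2$ and gray intensities are assumed in $[0,1]$; the recoloring of each cluster by its mean intensity and the merging tolerance are illustrative choices.

```python
import numpy as np

def segment(img, eps1, eps2, dt=0.1, steps=500, tol=1e-3):
    """Segment a gray scale image (float array in [0,1]) via the dynamics (7)-(8)."""
    h, w = img.shape
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    x = np.stack([yy.ravel() / h, xx.ravel() / w], axis=1)   # pixel positions
    c = img.ravel()[:, None].astype(float)                   # static feature: intensity
    for _ in range(steps):
        x = step(x, c, eps1, eps2, dt)                       # clustering dynamics
    # pixels collapsing onto the same cluster get the cluster's mean intensity
    _, labels = np.unique(np.round(x / tol) * tol, axis=0, return_inverse=True)
    seg = np.array([c[labels == k, 0].mean() for k in range(labels.max() + 1)])
    return seg[labels].reshape(h, w)
```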
In Figure 10 we first apply the segmentation to a benchmark test in order to show how the model works in a simple setting. We initialize a gray scale image, shown in the left panel of Figure 10.
We now apply the segmentation process to real images taken from different data–sets. In Figure 11 we consider a first test image.
Figure 11. Image segmentation of the first test image.
The second image, in Figure 12, is also characterized by two backgrounds: the sky, which is very homogeneous, and the field, which is less homogeneous. The goal of the segmentation process is to identify the two cows, separating them from the background. We report the results obtained with suitable choices of the confidence levels.
Figure 12. Image segmentation of the image with the two cows.
In the case of Figure 13 the confidence levels are chosen analogously.
Figure 13. Image segmentation.
In Figure 14 we employ the kinetic clustering model for the segmentation of a real image taken from [32].
Figure 14. Image segmentation of a real image taken from [32].
Finally, in Figure 15 we present an additional experiment aimed at providing a comparison between our segmentation process and the one published in [29], based on the Kuramoto model. The image is selected from the data–set [32]. The right panel of Figure 15 should be compared with Figure 7c in [29]; in particular, we observe that our result better detects some structures of the image that are lost when using the Kuramoto model.
Figure 15. Image segmentation of an image from the data–set [32], for comparison with [29].
In this paper we have proposed a new method for the clustering of large data sets, based on a suitable generalization of the Hegselmann–Krause opinion dynamics model. The extension accounts for clustering with respect to two different sets of features of each single datum: one set describes time dependent characteristics, the other one represents static characteristics.
The mean–field limit of the particle model has been formally derived, and the resulting kinetic equation allows for the study of mathematical properties. The time evolution of moments and the asymptotic behavior of the kinetic equation have been analytically characterized. Moreover, the derivation of the mean–field model allows for a reduction of the computational complexity thanks to a suitable random subset Monte Carlo algorithm.
Several applications to digital imaging have been proposed. In particular, we have focused on shape detection and image segmentation, showing that the model provides satisfactory results. Finally, we emphasize that the present model is intended as a starting point towards more realistic applications, for example based on the use of non–static features combined with suitable machine learning approaches.
This work was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy – EXC-2023 Internet of Production – 390621612.
Lorenzo Pareschi and Giuseppe Visconti acknowledge the support of the "National Group for Scientific Computation (GNCS-INDAM)", project "Numerical approximation of hyperbolic problems and applications".
Figure 1. Trend to the steady–state of the one–dimensional Hegselmann–Krause model (1).
Figure 2. Evolution in time of the first moment (left) and of the second moment (right) for the two values of the bounded confidence level.
Figure 3. Left: trend to the steady–state of the mean–field model (12) computed with Algorithm 1.
Figure 4. Particle solution (left plots) and kinetic density in the two–dimensional case.
Figure 5. Evolution in time of the two–dimensional first moments (left) and second moments (right) for the bounded confidence level.
Figure 6. Top row: particles and kinetic density at initial time (left plot) and at equilibrium (right plot). Bottom row: at left, analysis of the distances between clusters.
Figure 7. Particle and kinetic density at equilibrium with two different pairs of confidence levels.
Figure 8. Shape detection of the letter "A" initialized with uniformly distributed noise.
Figure 9. Shape detection of the letter "A" initialized with normally distributed noise.
Figure 10. Left panel: initial image of the benchmark test.
Figure 11. Image segmentation of the first test image.
Figure 12. Image segmentation of the image with the two cows.
Figure 13. Image segmentation.
Figure 14. Image segmentation of a real image taken from [32].
Figure 15. Image segmentation of an image from the data–set [32], for comparison with [29].