### Applied Computing and Intelligence

2023, Issue 1: 93-115. doi: 10.3934/aci.2023006
Research article

# All-pairwise squared distances lead to more balanced clustering

• Received: 09 December 2022; Revised: 14 March 2023; Accepted: 19 April 2023; Published: 15 May 2023
• Abstract: A commonly used cost function in clustering involves calculating all-pairwise squared distances. In this paper, we formulate the cost function using mean squared error and show that this leads to more balanced clustering than centroid-based cost functions, such as the sum of squared distances in $k$-means. The clustering method is formulated as a cut-based approach, more intuitively called Squared cut (Scut). We introduce an algorithm for the problem that is faster than the existing one based on the Stirling approximation. Our algorithm is a sequential variant of a local search algorithm. We show by experiments that the proposed approach provides better overall optimization of both mean squared error and cluster balance than existing methods.
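
To make the link between all-pairwise squared distances and cluster balance concrete, the sketch below relies on a standard identity: within one cluster of $n$ points, the sum of squared distances over all unordered pairs equals $n$ times the sum of squared distances to the centroid. An all-pairwise cost therefore weights each cluster's squared error by its size, which penalizes large clusters more than the plain $k$-means sum does. The function names (`pairwise_cost`, `size_weighted_sse`, `kmeans_sse`) and the toy data are illustrative assumptions, not the authors' code or their Scut algorithm.

```python
import numpy as np

def pairwise_cost(points, labels):
    """Sum, over clusters, of squared distances between all unordered pairs."""
    total = 0.0
    for j in np.unique(labels):
        cluster = points[labels == j]
        diffs = cluster[:, None, :] - cluster[None, :, :]
        # The full matrix counts every pair twice, hence the factor 0.5.
        total += 0.5 * np.sum(diffs ** 2)
    return total

def kmeans_sse(points, labels):
    """Centroid-based cost: sum of squared distances to cluster centroids."""
    total = 0.0
    for j in np.unique(labels):
        cluster = points[labels == j]
        total += np.sum((cluster - cluster.mean(axis=0)) ** 2)
    return total

def size_weighted_sse(points, labels):
    """Each cluster's squared error multiplied by the cluster size."""
    total = 0.0
    for j in np.unique(labels):
        cluster = points[labels == j]
        sq_err = np.sum((cluster - cluster.mean(axis=0)) ** 2)
        total += len(cluster) * sq_err
    return total

# Toy 1-D data (an assumption for illustration): one cluster of 2, one of 3.
points = np.array([[0.0], [2.0], [10.0], [12.0], [14.0]])
labels = np.array([0, 0, 1, 1, 1])

print("all-pairwise cost:", pairwise_cost(points, labels))
print("size-weighted SSE:", size_weighted_sse(points, labels))
print("k-means SSE:      ", kmeans_sse(points, labels))
```

The first two printed values agree for any labeling, while the $k$-means value lacks the size weighting. This illustrates only the cost function, not the optimization procedure described in the paper.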

Citation: Mikko I. Malinen, Pasi Fränti. All-pairwise squared distances lead to more balanced clustering[J]. Applied Computing and Intelligence, 2023, 3(1): 93-115. doi: 10.3934/aci.2023006
