Kernel K-means clustering of distributional data
M. Sánchez Signorini, A. Baíllo Moreno, J. R. Berrendero Díaz
We address the problem of clustering a sample of probability distributions drawn from a random distribution on R^p. The maximum mean discrepancy (MMD) provides a distance between two probability distributions. Accordingly, our proposed method considers a symmetric, positive-definite kernel k and its associated reproducing kernel Hilbert space H. Each probability distribution is mapped to its kernel mean embedding in H, and the K-means clustering algorithm is then applied in H, yielding an unsupervised classification of the original sample. This procedure is straightforward and computationally feasible even for dimension p > 1. We present simulation studies that provide insight into the choice of the kernel and its tuning parameter. Furthermore, we illustrate the performance of the proposed clustering method on a collection of Synthetic Aperture Radar (SAR) images.
Keywords: Functional data, maximum mean discrepancy, nonparametric, unsupervised classification
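The procedure outlined in the abstract (embed each distribution in H, then run K-means in H) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each distribution is observed through an i.i.d. sample of points, that k is a Gaussian kernel with a hypothetical bandwidth sigma, and that inner products of kernel mean embeddings are estimated by averaging k over all pairs of sample points. The function names and the farthest-first initialization are illustrative choices, and the data are simulated.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma):
    """Kernel matrix k(x, y) = exp(-||x - y||^2 / (2 sigma^2)) (assumed kernel)."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def embedding_gram(samples, sigma):
    """G[i, j] estimates <mu_i, mu_j>_H by averaging k over all pairs of points."""
    n = len(samples)
    G = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            G[i, j] = G[j, i] = gaussian_kernel(samples[i], samples[j], sigma).mean()
    return G

def kernel_kmeans(G, n_clusters, n_iter=100):
    """Lloyd iterations in H, using only the Gram matrix of the embeddings."""
    n = G.shape[0]
    diag = np.diag(G)
    # Squared MMD estimate between every pair of embedded distributions.
    d2 = diag[:, None] + diag[None, :] - 2.0 * G
    # Farthest-first initialization (an illustrative choice, not from the abstract).
    centers = [0]
    for _ in range(n_clusters - 1):
        centers.append(int(d2[:, centers].min(axis=1).argmax()))
    labels = d2[:, centers].argmin(axis=1)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            members = labels == c
            if members.any():
                # ||mu_i - centroid_c||^2 expanded in terms of inner products.
                dist[:, c] = (diag - 2.0 * G[:, members].mean(axis=1)
                              + G[np.ix_(members, members)].mean())
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Simulated example: 20 distributions on R^2, each observed through 50 points.
rng = np.random.default_rng(0)
samples = [rng.normal(loc=0.0, scale=1.0, size=(50, 2)) for _ in range(10)]
samples += [rng.normal(loc=3.0, scale=1.0, size=(50, 2)) for _ in range(10)]

G = embedding_gram(samples, sigma=1.0)
labels = kernel_kmeans(G, n_clusters=2)
print(labels)
```

Working only with the Gram matrix G avoids representing the embeddings explicitly, so the dimension p enters only through the kernel evaluations. The bandwidth sigma = 1.0 is arbitrary here; the abstract indicates that the choice of kernel and tuning parameter is studied by simulation.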
Scheduled
GT AMyC II: Advances in Clustering and Regression Analysis
September 4, 2026 11:10 AM
Aula 28
Other papers in the same session
M. Cuesta Santa Teresa, C. D'Ambrosio, M. Durban, V. Guerrero
L. A. García-Escudero, A. Mayo-Iscar, L. Trapote Reglero
M. Romero Madroñal, M. R. Sillero Denamiel, M. D. Jiménez Gamero
L. Vicente González, F. J. del Río Olvera, J. L. Vicente Villardón