Kernel K-means clustering of distributional data

M. Sánchez Signorini, A. Baíllo Moreno, J. R. Berrendero Díaz

We address the problem of clustering a sample of probability distributions drawn from a random distribution on R^p. The maximum mean discrepancy (MMD) measures the distance between two probability distributions. Accordingly, our proposed method considers a symmetric, positive-definite kernel k and its associated reproducing kernel Hilbert space H. Each probability distribution is mapped to its kernel mean embedding in H, and the K-means clustering algorithm is then applied in H, providing an unsupervised classification of the original sample. This procedure is straightforward and computationally feasible even for dimension p > 1. We present simulation studies that provide insight into the choice of the kernel and its tuning parameter. Furthermore, we illustrate the performance of the proposed clustering method on a collection of Synthetic Aperture Radar (SAR) images.
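The procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it rests on several assumptions that do not come from the abstract: each distribution is observed only through a finite sample of points in R^p, the kernel is Gaussian with a hypothetical bandwidth `sigma`, inner products between kernel mean embeddings are estimated by averaging the kernel over all pairs of points (a biased plug-in estimate), and K-means in H is started from a simple farthest-point initialization. The function names `gaussian_gram`, `embedding_gram` and `kernel_kmeans` are illustrative only.

```python
import numpy as np

def gaussian_gram(X, Y, sigma):
    """Gaussian kernel matrix k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq = (np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def embedding_gram(samples, sigma):
    """G[i, j] estimates <mu_i, mu_j>_H by averaging k over all pairs of points."""
    n = len(samples)
    G = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            G[i, j] = G[j, i] = gaussian_gram(samples[i], samples[j], sigma).mean()
    return G

def kernel_kmeans(G, n_clusters, n_iter=100):
    """Lloyd iterations in H, using only the Gram matrix of the embeddings."""
    n = G.shape[0]
    diag = np.diag(G)
    # Squared distances in H between embeddings (squared MMD estimates).
    D2 = diag[:, None] + diag[None, :] - 2.0 * G
    # Assumed initialization: start at item 0, then add the farthest items.
    centers = [0]
    for _ in range(1, n_clusters):
        centers.append(int(D2[:, centers].min(axis=1).argmax()))
    labels = D2[:, centers].argmin(axis=1)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            members = labels == c
            if members.any():
                # ||mu_i - centroid_c||^2 expanded in terms of inner products.
                dist[:, c] = (diag - 2.0 * G[:, members].mean(axis=1)
                              + G[np.ix_(members, members)].mean())
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Toy data (an assumption for illustration): 20 distributions on R^2,
# ten centred at 0 and ten centred at 4, each seen through 50 points.
rng = np.random.default_rng(0)
samples = ([rng.normal(0.0, 1.0, size=(50, 2)) for _ in range(10)]
           + [rng.normal(4.0, 1.0, size=(50, 2)) for _ in range(10)])
G = embedding_gram(samples, sigma=1.0)
labels = kernel_kmeans(G, n_clusters=2)
print(labels)
```

Because the centroid of a cluster in H is the average of its members' embeddings, every distance needed by K-means can be written through the Gram matrix `G`, so the embeddings never have to be computed explicitly. The bandwidth used here is arbitrary; the abstract indicates that the choice of kernel and tuning parameter is studied by simulation.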

Keywords: Functional data, maximum mean discrepancy, nonparametric, unsupervised classification

Scheduled

GT AMyC II: Advances in Clustering and Regression Analysis

September 4, 2026 11:10 AM

Aula 28

Other papers in the same session

Sparse smooth additive regression models with shape constraints

M. Cuesta Santa Teresa, C. D'Ambrosio, M. Durban, V. Guerrero

Diagnostic tools for outlier detection based on robust clustering

L. A. García-Escudero, A. Mayo-Iscar, L. Trapote Reglero

Testing the equality of estimable parameters

M. Romero Madroñal, M. R. Sillero Denamiel, M. D. Jiménez Gamero

Representaciones biplot para el análisis de la coinercia

L. Vicente González, F. J. del Río Olvera, J. L. Vicente Villardón
