Hybrid Inferential Cluster Analysis by Statistical Modeling of Dissimilarity Data
M. Vichi, M. Bottazzi Schenone, G. D’Andrea
Cluster Analysis groups statistical units into homogeneous clusters
using multivariate statistics and machine learning. Classical methods
include hierarchical approaches that build dendrograms without
optimizing objective functions (OFs), and non-hierarchical methods that
partition data into k clusters via iterative OF optimization.
Hierarchical methods are often costly and lack explicit statistical
modeling, while model-based clustering requires raw data instead of
dissimilarity matrices.
Hybrid Inferential Cluster Analysis (HICA) is a new methodology that
models dissimilarity matrices, exploiting ultrametricity and
Least-Squares Estimation to optimize an OF for both partitions and
hierarchies. It is "hybrid" because it combines partitioning,
agglomerative, and divisive clustering, starting from a k-cluster
partition and proceeding bottom-up and top-down. It is "inferential"
because resampling-based tests determine partition quality, number of
clusters, and clustering type. Despite quadratic complexity,
parsimonious hierarchies and efficient updates make HICA feasible for
relatively large datasets.
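The two ingredients named above, ultrametricity and least-squares fitting of a dissimilarity matrix, can be illustrated with a minimal sketch. This is not the HICA algorithm, which the abstract does not spell out. It assumes a fixed partition and fits only the simplest ultrametric compatible with it: one within-cluster level w and one between-cluster level b, whose least-squares estimates are the means of the corresponding dissimilarities. The function names (is_ultrametric, two_level_ultrametric, ls_loss) and the toy matrix are hypothetical, introduced only for illustration.

```python
import numpy as np

def is_ultrametric(D, tol=1e-9):
    """Check d(i,j) <= max(d(i,k), d(k,j)) for every triple (i, j, k)."""
    D = np.asarray(D, dtype=float)
    # M[i, j, k] = max(D[i, k], D[k, j])
    M = np.maximum(D[:, None, :], D.T[None, :, :])
    return bool(np.all(D[:, :, None] <= M + tol))

def two_level_ultrametric(D, labels):
    """Least-squares fit of a two-level matrix for a FIXED partition.

    Assumes at least two clusters and at least one cluster with two
    or more units. The result is ultrametric whenever w <= b.
    """
    D = np.asarray(D, dtype=float)
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    w = D[same & off_diag].mean()   # LS estimate of the within-cluster level
    b = D[~same].mean()             # LS estimate of the between-cluster level
    U = np.where(same, w, b)
    np.fill_diagonal(U, 0.0)
    return U, w, b

def ls_loss(D, U):
    """Sum of squared differences over the pairs i < j."""
    D = np.asarray(D, dtype=float)
    U = np.asarray(U, dtype=float)
    iu = np.triu_indices_from(D, k=1)
    return float(np.sum((D[iu] - U[iu]) ** 2))

# Toy dissimilarity matrix (made up for illustration) and a 2-cluster partition.
D = np.array([[0, 1, 4, 5],
              [1, 0, 5, 4],
              [4, 5, 0, 2],
              [5, 4, 2, 0]], dtype=float)
labels = [0, 0, 1, 1]

U, w, b = two_level_ultrametric(D, labels)
print(is_ultrametric(D), is_ultrametric(U), w, b, ls_loss(D, U))
```

In this toy case the observed matrix violates the ultrametric inequality while the fitted one satisfies it, and the loss measures how much the data had to be distorted to get there. A full hierarchy would add further levels within and between clusters, which this sketch does not attempt.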
Keywords: clustering, hierarchies
Scheduled
SI Spanish-Italian Session
3 September 2026, 11:10
Room 20
Other papers in the same session
G. Loffredo, E. Romano, A. M. Aguilera del Pino, F. Maturo, M. Vidal
F. Porro, M. Restaino, J. E. Ruiz Castro, M. Zenga
A. Sarra, T. Di Battista, A. Evangelista, E. Nissi, N. Di Deo