A High-Dimensional Extension of TCLUST
Outliers distort traditional clustering, leading to unreliable partitions. Robust methods such as TCLUST address this by extending trimming to multi-cluster settings. While effective in low dimensions, TCLUST struggles in high-dimensional spaces because the number of parameters to estimate grows rapidly with the dimension. Robust Linear Grouping (RLG) offers an alternative by assuming that clusters lie near lower-dimensional subspaces, yet it fails when the subspaces intersect or the errors are non-isotropic.
We propose a robust method that extends TCLUST by integrating the High Dimensional Data Clustering (HDDC) framework together with trimming and eigenvalue constraints. This approach bridges TCLUST and RLG through a careful adaptation of the implementation steps. We present its theoretical properties, a feasible algorithm, and a strategy for selecting the input parameters. The performance of the methodology is assessed through a simulation study and a real-data example, which illustrate its effectiveness in complex, high-dimensional scenarios.
Keywords: Robust clustering; Trimming; High-dimensional data; TCLUST; Eigenvalue constraints