Outliers by Design: A Framework for the Controlled Construction of Multivariate Outliers
F. J. Arteaga Moreno, A. González Cebrián, A. J. Ferrer Riquelme
We present Designed Outlier Construction (DOC), a general framework for generating multivariate outliers with explicit control over their statistical deviation from a reference data structure. Principal component analysis (PCA) is used here as a geometric scaffold to describe that structure and to quantify outlyingness through Hotelling’s T² and the Squared Prediction Error (SPE). Starting from a seed observation, DOC defines tailored displacement directions in the model and residual subspaces, with closed-form expressions for the magnitudes required to generate outliers with prescribed T² and SPE values. This gives direct control over both the type of outlier produced and its degree of outlyingness. The framework also supports structural constraints on the generated outliers, such as preserving selected variables or linear combinations of variables. The method enables the construction of reproducible and interpretable outlier profiles for simulation, benchmarking, sensitivity analysis, and teaching.
Keywords: Designed Outlier Construction, multivariate outliers, controlled outlier generation, Hotelling's T², Squared Prediction Error, PCA, structural constraints, linear constraints
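To make the idea of prescribed T² and SPE values concrete, the following is a minimal sketch and not the authors' DOC algorithm. It assumes the simplest possible displacement directions: the seed's own score vector in the model subspace and the seed's own residual vector in the residual subspace. Under that assumption the required magnitudes reduce to square-root scaling factors. The function names `fit_pca`, `t2_spe`, and `shift_seed` are hypothetical, and the structural constraints mentioned in the abstract are not handled.

```python
import numpy as np


def fit_pca(X, n_components):
    """Fit a PCA model by SVD; return the mean, loadings P, and score variances."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                          # loadings, shape (p, A)
    eigvals = s[:n_components] ** 2 / (X.shape[0] - 1)
    return mean, P, eigvals


def t2_spe(x, mean, P, eigvals):
    """Hotelling's T^2 and SPE of one observation under the PCA model."""
    xc = x - mean
    t = P.T @ xc                                     # scores (model subspace)
    e = xc - P @ t                                   # residual (residual subspace)
    return float(np.sum(t ** 2 / eigvals)), float(e @ e)


def shift_seed(x0, mean, P, eigvals, t2_target, spe_target):
    """Move a seed so that its T^2 and SPE equal the targets.

    Assumption (illustrative only): the displacement directions are the
    seed's own score and residual vectors, so both must be non-zero.
    """
    xc = x0 - mean
    t = P.T @ xc
    e = xc - P @ t
    t2_0 = np.sum(t ** 2 / eigvals)
    spe_0 = e @ e
    if t2_0 == 0 or spe_0 == 0:
        raise ValueError("seed needs non-zero T^2 and SPE for this sketch")
    t_new = t * np.sqrt(t2_target / t2_0)            # T^2 scales quadratically
    e_new = e * np.sqrt(spe_target / spe_0)          # SPE scales quadratically
    return mean + P @ t_new + e_new
```

Because the rescaled residual stays orthogonal to the loadings, recomputing `t2_spe` on the output of `shift_seed` returns the requested targets up to floating-point error. Setting only one target above its control limit yields an outlier that is extreme in the model subspace or in the residual subspace, but not both.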
Scheduled
Multivariate Analysis
September 2, 2026 3:30 PM
Aula 24
Other papers in the same session
C. Gandia Tortosa, M. J. Nueda Roldán, M. D. Molina Vila, S. García Ponsoda
J. Martín Arevalillo, H. Navarro Veguillas
J. Saperas-Riera, G. Mateu-Figueras
L. Trapote Reglero, L. Á. García-Escudero, A. Mayo Íscar