P. García Arce, R. Naveiro, D. Ríos Insua

The vulnerability of machine learning models to adversarial attacks remains a critical security challenge. Traditional defenses, such as adversarial training, typically robustify models by minimizing a worst-case loss. However, these deterministic approaches do not account for uncertainty in the adversary’s attack. While stochastic defenses placing a probability distribution on the adversary exist, they often lack statistical rigor and fail to make explicit their underlying assumptions. We introduce a formal Bayesian framework that models adversarial uncertainty through a stochastic channel, articulating all probabilistic assumptions. This yields two strategies: a proactive defense
enacted during training, aligned with adversarial training, and a reactive defense enacted during operations, aligned with adversarial purification. Several previous defenses can be recovered as limiting cases of our model. We validate our methodology, showcasing the benefits of modeling uncertainty.

Keywords: Adversarial, Bayesian, Machine Learning

Scheduled

GT Inferencia Bayesiana: Sesión de Jóvenes Bayesianos en honor a Mª Eugenia Castellanos
September 5, 2026  10:00 AM
Aula 20


Other papers in the same session


Cookie policy

We use cookies in order to be able to identify and authenticate you on the website. They are necessary for the correct functioning of it, and therefore they can not be disabled. If you continue browsing the website, you are agreeing with their acceptance, as well as our Privacy Policy.

Additionally, we use Google Analytics in order to analyze the website traffic. They also use cookies and you can accept or refuse them with the buttons below.

You can read more details about our Cookie Policy and our Privacy Policy.