Skip to Main content Skip to Navigation
Theses

Statistical and Computational Complexities of Robust and High-Dimensional Estimation Problems

Abstract : Statistical learning theory aims at providing a better understanding of the statistical properties of learning algorithms. These properties are often derived assuming the underlying data are gathered by sampling independent and identically distributed gaussian (or subgaussian) random variables. These properties can thus be drastically affected by the presence of gross errors (also called "outliers") in the data, and by data being heavy-tailed. We are interested in procedures that have good properties even when part of the data is corrupted and heavy-tailed, procedures that we call extit{robusts}, that we often get in this thesis by using the Median-Of-Mean heuristic.We are especially interested in procedures that are robust in high-dimensional set-ups, and we study (i) how dimensionality affects the statistical properties of robust procedures, and (ii) how dimensionality affects the computational complexity of the associated algorithms. In the study of the statistical properties (i), we find that for a large range of problems, the statistical complexity of the problems and its "robustness" can be in a sense "decoupled", leading to bounds where the dimension-dependent term is added to the term that depends on the corruption, rather than multiplied by it. We propose ways of measuring the statistical complexities of some problems in that corrupted framework, using for instance VC-dimension. We also provide lower bounds for some of those problems.In the study of computational complexity of the associated algorithm (ii), we show that in two special cases, namely robust mean-estimation with respect to the euclidean norm and robust regression, one can relax the associated optimization problems that becomes exponentially hard with the dimension to get tractable algorithm that behaves polynomially in the dimension.
Complete list of metadata

https://tel.archives-ouvertes.fr/tel-03544727
Contributor : ABES STAR :  Contact
Submitted on : Wednesday, January 26, 2022 - 7:17:14 PM
Last modification on : Saturday, June 25, 2022 - 7:27:45 PM
Long-term archiving on: : Wednesday, April 27, 2022 - 7:33:12 PM

File

101496_DEPERSIN_2021_archivage...
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-03544727, version 1

Collections

Citation

Jules Depersin. Statistical and Computational Complexities of Robust and High-Dimensional Estimation Problems. Statistics [math.ST]. Institut Polytechnique de Paris, 2021. English. ⟨NNT : 2021IPPAG009⟩. ⟨tel-03544727⟩

Share

Metrics

Record views

104

Files downloads

55