A new look at the statistical model identification, IEEE Transactions on Automatic Control, vol.19, pp.716-723, 1974. ,
A convergence theory for deep learning via over-parameterization, Proceedings of the 36th International Conference on Machine Learning, vol.97, pp.242-252, 2019. ,
The space complexity of approximating the frequency moments, Journal of Computer and System Sciences, vol.58, issue.1, pp.137-147, 1999. ,
Density estimation with quadratic loss: a confidence intervals method, ESAIM: Probability and Statistics, vol.12, pp.438-463, 2008. ,
URL : https://hal.archives-ouvertes.fr/hal-00020740
Pac-bayesian bounds for randomized empirical risk minimizers, Mathematical Methods of Statistics, vol.17, issue.4, pp.279-304, 2008. ,
URL : https://hal.archives-ouvertes.fr/hal-00354922
Concentration of tempered posteriors and of their variational approximations, 2017. ,
On the properties of variational approximations of Gibbs posteriors, The Journal of Machine Learning Research, vol.17, issue.1, pp.8374-8414, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-02403354
Tighter PAC-Bayes bounds, Advances in Neural Information Processing Systems, pp.9-16, 2007. ,
Regression depth and center points, Discrete and Computational Geometry, vol.23, issue.3, pp.305-323, 2000. ,
Non-strong mixing autoregressive processes, Journal of Applied Probability, vol.21, issue.4, pp.930-934, 1984. ,
An introduction to MCMC for machine learning, Machine Learning, vol.50, issue.1-2, pp.5-43, 2003. ,
A kernel multiple change-point algorithm via model selection, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00671174
Fast learning rates in statistical inference through aggregation, The Annals of Statistics, vol.37, issue.4, pp.1591-1646, 2009. ,
URL : https://hal.archives-ouvertes.fr/hal-00139030
Layered representation of motion video using robust maximum-likelihood estimation of mixture models and mdl encoding, International Conference on Computer Vision, 1995. ,
Approximation of probability distributions by convex mixtures of Gaussian measures, Proceedings of the American Mathematical Society, vol.138, pp.2619-2628, 2010. ,
Neural networks and principal component analysis: Learning from examples without local minima, Neural Networks, vol.2, pp.53-58, 1989. ,
On Bayesian bounds, Proceedings of ICML, pp.81-88, 2006. ,
Bayesian inference in high-dimensional models, 2020. ,
Rho-estimators revisited: General theory and applications, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01314781
A new method for estimation and model selection: rho-estimation, Inventiones mathematicae, vol.207, issue.2, pp.425-517, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-00966808
Universal approximation bounds for superpositions of a sigmoidal function, Information Theory, IEEE Transactions on, vol.39, pp.930-945, 1993. ,
The consistency of posterior distributions in nonparametric problems, The Annals of Statistics, vol.27, issue.2, pp.536-561, 1999. ,
Approximation and estimation bounds for artificial neural networks, Machine Learning, vol.14, pp.115-133, 1994. ,
Spectrally-normalized margin bounds for neural networks, Advances in Neural Information Processing Systems, vol.30, pp.6240-6249, 2017. ,
Statistical inference for probabilistic functions of finite state markov chains, Ann. Math. Statist, vol.37, issue.6, pp.1554-1563, 1966. ,
An essay towards solving a problem in the doctrine of chances, Philosophical Transactions of the Royal Society of London, vol.53, pp.370-418, 1763. ,
Tuning tempered transitions, Statistics and computing, vol.22, issue.1, pp.65-78, 2012. ,
Deep rewiring: Training very sparse deep networks, International Conference on Learning Representations, 2018. ,
On the expressive power of deep architectures, Proceedings of the 22nd International Conference on Algorithmic Learning Theory, ALT'11, pp.18-36, 2011. ,
Minimum hellinger distance estimates for parametric models, The annals of Statistics, vol.5, issue.3, pp.445-463, 1977. ,
Theory of probability, 1917. ,
, Inference in generative models using the wasserstein distance, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01517550
, Bayesian fractional posteriors, 2016.
, On statistical optimality of variational Bayes, PMLR: Proceedings of AISTATS, vol.84, 2018.
, Some theoretical properties of GANs, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01737975
Another look at robustness: A review of reviews and some new developments, Scand. J. Statist, vol.3, pp.145-168, 1976. ,
An improvement of the NEC criterion for assessing the number of clusters in a mixture model, Pattern Recognition Letters, vol.20, issue.3, pp.267-272, 1999. ,
Approximation dans les espaces métriques et théorie de l'estimation. Annales de l'Institut Henri Poincare (B) Probability and Statistics, vol.65, pp.181-237, 1983. ,
Model selection via testing: an alternative to (penalized) maximum likelihood estimators, 2006. ,
Variational principal components, Proceedings Ninth International Conference on Artificial Neural Networks, ICANN'99, vol.1, pp.509-514, 1999. ,
, Pattern Recognition and Machine Learning, 2006.
A general framework for updating belief distributions, Journal of the Royal Statistical Society: Series B, vol.78, issue.5, pp.1103-1130, 2016. ,
Variational inference: A review for statisticians, Journal of the American Statistical Association, vol.112, issue.518, pp.859-877, 2017. ,
Dynamic topic models, Proceedings of the 23rd International Conference on Machine Learning, pp.113-120, 2006. ,
Latent Dirichlet allocation, The Journal of Machine Learning Research, vol.3, pp.993-1022, 2003. ,
Weight uncertainty in neural networks, Proceedings of the 32nd International Conference on Machine Learning, vol.37, pp.1613-1622, 2015. ,
Concentration inequalities using the entropy method, Ann. Probab, vol.31, issue.3, pp.1583-1614, 2003. ,
, Concentration inequalities, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00751496
Model-based clustering of high-dimensional data: a review, Computational Statistics and Data Analysis, vol.71, pp.52-78, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-00750909
Statistical modeling: The two cultures (with comments and a rejoinder by the author), Statistical science, vol.16, issue.3, pp.199-231, 2001. ,
Consistent estimates and zero-one sets, The Annals of Mathematical Statistics, vol.35, issue.1, pp.157-161, 1964. ,
Statistical inference for generative models via maximum mean discrepancy, 2019. ,
Streaming variational Bayes, NIPS, pp.1727-1735, 2013. ,
Introduction to online optimization. Lecture notes (Princeton University), 2011. ,
Quasi-Monte Carlo variational inference, Proceedings of the 35th International Conference on Machine Learning, vol.80, pp.668-677, 2018. ,
Sparse density estimation with ℓ1 penalties, Conference on Computational Learning Theory, 2007. ,
URL : https://hal.archives-ouvertes.fr/hal-00160850
Spades and mixture models, The Annals of Statistics, vol.38, issue.4, pp.2525-2558, 2010. ,
URL : https://hal.archives-ouvertes.fr/hal-00514124
Optimal estimation and rank detection for sparse spiked covariance matrices, Probability Theory and Related Fields, vol.161, pp.781-815, 2015. ,
, Universal boosting variational inference, 2019.
Scalable variational inference for Bayesian variable selection in regression, and its accuracy in genetic association studies, Bayesian analysis, vol.7, issue.1, pp.73-108, 2012. ,
Simultaneous dimension reduction and clustering via the nmf-em algorithm, 2017. ,
Bayesian nonparametrics, convergence and limiting shape of posterior distributions. Habilitation à diriger des recherches, 2014. ,
URL : https://hal.archives-ouvertes.fr/tel-01096755
Bayesian linear regression with sparse priors, The Annals of Statistics, vol.43, issue.5, pp.1986-2018, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01226832
Statistical Learning Theory and Stochastic Optimization. Saint-Flour Summer School on Probability Theory, Lecture Notes in Mathematics, 2001. ,
URL : https://hal.archives-ouvertes.fr/hal-00104952
PAC-Bayesian supervised classification: the thermodynamics of statistical learning, Institute of Mathematical Statistics Lecture Notes-Monograph Series, vol.56, 2007. ,
URL : https://hal.archives-ouvertes.fr/hal-00206119
Challenging the empirical mean and empirical variance: a deviation study, Annales de l'IHP Probabilités et statistiques, vol.48, issue.4, pp.1148-1185, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00517206
Dimension free PAC-Bayesian bounds for the estimation of the mean of a random vector, NIPS-2017 Workshop (Almost) 50 Shades of Bayesian Learning: PAC-Bayesian trends and insights, 2017. ,
Handbook of Mixture Analysis, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01928103
Prediction, learning, and games, 2006. ,
Gaussian Kullback-Leibler approximate inference, The Journal of Machine Learning Research, vol.14, issue.1, pp.2239-2286, 2013. ,
A minimum description length approach to hidden markov models with poisson and gaussian emissions. application to order identification, Journal of Statistical Planning and Inference, vol.139, issue.3, pp.962-977, 2009. ,
An optimal randomized algorithm for maximum tukey depth, Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms, 2004. ,
Robust covariance and scatter matrix estimation under huber's contamination model, The Annals of Statistics, vol.46, issue.5, pp.1932-1960, 2018. ,
High-dimensional robust mean estimation in nearly-linear time, Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pp.2755-2771, 2019. ,
, Fast mean estimation with sub-gaussian rates, 2019.
Consistency of ELBO maximization for model selection, Proceedings of The 1st Symposium on Advances in Approximate Bayesian Inference, vol.96, pp.11-31, 2019. ,
, Convergence rates of variational inference in sparse deep learning, 2019.
Consistency of variational Bayes inference for estimation and model selection in mixtures, Electronic Journal of Statistics, vol.12, issue.2, pp.2995-3035, 2018. ,
Finite sample properties of parametric MMD estimation: robustness to misspecification and dependence, 2019. ,
MMD-Bayes: Robust Bayesian estimation via Maximum Mean Discrepancy, Proceedings of The 2nd Symposium on Advances in Approximate Bayesian Inference, vol.118, pp.1-21, 2020. ,
A generalization bound for online variational inference, Proceedings of The Eleventh Asian Conference on Machine Learning, vol.101, pp.662-677, 2019. ,
Robust statistical learning with Lipschitz and convex loss functions, Probability Theory and Related Fields, vol.0, pp.1-44, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-01923033
Minimax estimation of a p-dimensional linear functional in sparse Gaussian models and robust estimation of the mean, 2017. ,
1-bit matrix completion: PAC-Bayesian analysis of a variational approximation, Machine Learning, vol.107, issue.3, pp.579-603, 2018. ,
Generalization and robustness of batched weighted average algorithm with V-geometrically ergodic Markov data, International Conference on Algorithmic Learning Theory, pp.264-278, 2013. ,
Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals, and Systems (MCSS), vol.2, pp.303-314, 1989. ,
Provable Bayesian inference via particle mirror descent, AISTATS, pp.985-994, 2016. ,
On the exponentially weighted aggregate with the laplace prior, The Annals of Statistics, vol.46, issue.5, pp.2452-2478, 2018. ,
Optimal kullback-leibler aggregation in mixture density estimation by maximum likelihood, 2017. ,
Aggregation by exponential weighting and sharp oracle inequalities, Learning Theory, vol.4539, pp.97-111, 2007. ,
URL : https://hal.archives-ouvertes.fr/hal-00160857
The effect of job loss on overweight and drinking, Journal of Health Economics, 2011. ,
Weak dependence: With examples and applications, 2007. ,
URL : https://hal.archives-ouvertes.fr/hal-00141567
Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society. Series B (Methodological), vol.39, issue.1, pp.1-38, 1977. ,
Robust subgaussian estimation of a mean vector in nearly linear time, 2019. ,
Sub-gaussian mean estimators, The Annals of Statistics, vol.44, issue.6, pp.2695-2725, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01204519
Combinatorial methods in density estimation, 2001. ,
Robustly learning a gaussian: Getting optimal error, efficiently, Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pp.2683-2702, 2018. ,
, Robust estimators in high dimensions without the computational intractability. Foundations of Computer Science (FOCS), 2016 IEEE 57th Annual Symposium, 2016.
, Recent advances in algorithmic high-dimensional robust statistics, 2019.
Statistical query lower bounds for robust estimation of high-dimensional gaussians and gaussian mixtures, 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pp.73-84, 2017. ,
List-decodable robust mean estimation and learning mixtures of spherical gaussians, Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, 2018. ,
Efficient algorithms and lower bounds for robust linear regression, 2018. ,
Fast approximation of Kullback-Leibler distance for dependence trees and hidden Markov models, IEEE Signal Processing Letters, vol.10, issue.4, pp.115-118, 2003. ,
Provable gradient variance guarantees for black-box variational inference, Advances in Neural Information Processing Systems, pp.328-337, 2019. ,
The automatic robustness of minimum distance functionals, The Annals of Statistics, pp.552-586, 1988. ,
Pathologies of some minimum distance estimators, The Annals of Statistics, pp.587-608, 1988. ,
Application of the theory of martingales. Le Calcul des Probabilités et ses Applications, Colloques Internationaux du CNRS, issue.13, pp.23-27, 1949. ,
Sequential Monte Carlo Methods in Practice, 2001. ,
A tutorial on particle filtering and smoothing: Fifteen years later, Handbook of Nonlinear Filtering, vol.12, 2009. ,
Mixing: properties and examples, vol.85, 1994. ,
A new weak dependence condition and applications to moment inequalities, Stochastic Processes and their Applications, vol.84, 1999. ,
Gradient descent finds global minima of deep neural networks, Proceedings of the 36th International Conference on Machine Learning, vol.97, pp.1675-1685, 2019. ,
Computationally efficient robust estimation of sparse functionals, 2017. ,
, Training generative neural networks via maximum mean discrepancy optimization, 2015.
Recent Advances in Algorithmic Differentiation, 2014. ,
Kernel bayes' rule: Bayesian inference with positive definite kernels, Journal of Machine Learning Research, vol.14, pp.3002-3048, 2013. ,
Variational inference based on robust divergences, Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, vol.84, pp.813-822, 2018. ,
Uncertainty in Deep Learning, 2016. ,
Robust estimation and generative adversarial nets, 2019. ,
Efficient semiparametric estimation and model selection for multidimensional mixtures, Electronic Journal of Statistics, vol.12, issue.1, pp.703-740, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01345919
Sparsity regret bounds for individual sequences in online linear regression, The Journal of Machine Learning Research, vol.14, issue.1, pp.729-769, 2013. ,
URL : https://hal.archives-ouvertes.fr/inria-00552267
Pac-bayesian theory meets bayesian inference, Advances in Neural Information Processing Systems, pp.1884-1892, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01324072
Consistency issues in bayesian nonparametric asymptotic, Nonparametrics and Time Series: A Tribute to Madan Lal Puri, pp.639-667, 1999. ,
Convergence rates of posterior distributions, Annals of Statistics, pp.500-531, 2000. ,
Fundamentals of nonparametric Bayesian inference, vol.44, 2017. ,
Convergence rates of posterior distributions for noniid observations, The Annals of Statistics, vol.35, issue.1, pp.192-223, 2007. ,
Entropies and rates of convergence for maximum likelihood and bayes estimation for mixtures of normal densities, Annals of Statistics, pp.1233-1263, 2001. ,
Robust Bayes estimation using the density power divergence, The Annals of Statistics, pp.500-531, 2016. ,
Robust pca and pairs of projections in a hilbert space, Electronic Journal of Statistics, vol.11, issue.2, pp.3903-3926, 2017. ,
Robust dimension-free gram operator estimates, Bernoulli, vol.11, issue.2, pp.3864-3923, 2018. ,
, Deep Learning, 2016.
Generative adversarial nets, Advances in Neural Information Processing Systems, vol.27, pp.2672-2680, 2014. ,
Practical variational inference for neural networks, Advances in Neural Information Processing Systems, vol.24, pp.2348-2356, 2011. ,
A kernel two-sample test, Journal of Machine Learning Research, vol.13, pp.723-773, 2012. ,
A fast, consistent kernel two-sample test, Advances in neural information processing systems, pp.673-681, 2009. ,
, Deep neural network approximation theory, 2019.
Model selection based on minimum description length, Journal of Mathematical Psychology, vol.44, issue.1, pp.133-152, 2000. ,
Inconsistency of bayesian inference for misspecified linear models, and a proposal for repairing it, Bayesian Analysis, vol.12, issue.4, pp.1069-1103, 2017. ,
A primer on PAC-Bayesian learning, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-01983732
Approximated Bayesian inference for massive streaming data, 2013. ,
Large sample properties of generalized method of moments estimators, Econometrica: Journal of the Econometric Society, pp.1029-1054, 1982. ,
Model selection and the principle of minimum description length, Journal of the American Statistical Association, vol.96, issue.454, pp.746-774, 2001. ,
On the minimax optimality and superiority of deep neural network learning over sparse parameter spaces, 2019. ,
Introduction to online convex optimization, Foundations and Trends® in Optimization, vol.2, issue.3-4, pp.157-325, 2016. ,
Approximating the Kullback Leibler divergence between Gaussian mixture models, IEEE International Conference on Acoustics, Speech and Signal Processing, p.4, 2007. ,
Keeping the neural networks simple by minimizing the description length of the weights, Proceedings of the Sixth Annual Conference on Computational Learning Theory, COLT '93, pp.5-13, 1993. ,
Online learning for latent Dirichlet allocation, Advances in Neural Information Processing Systems, pp.856-864, 2010. ,
Stochastic variational inference, The Journal of Machine Learning Research, vol.14, issue.1, pp.1303-1347, 2013. ,
Distribution-robust mean estimation via smoothed random perturbations, 2019. ,
, Pac-bayes under potentially heavy tails, 2019.
Bayesian model robustness via disparities, Test, vol.23, issue.3, pp.556-584, 2014. ,
Sub-gaussian mean estimation in polynomial time, 2019. ,
Loss minimization and parameter estimation with heavy tails, JMLR, vol.17, pp.1-40, 2016. ,
Robust estimation of a location parameter, The Annals of Mathematical Statistics, vol.35, pp.73-101, 1964. ,
Practical bounds on the error of Bayesian posterior approximations: A nonasymptotic approach, 2018. ,
Deep neural networks learn non-smooth functions effectively, Proceedings of Machine Learning Research, vol.89, pp.869-878, 2019. ,
, Risk-sensitive variational bayes: Formulations and bounds, 2019.
, Asymptotic consistency of α-Rényi-approximate posteriors, 2019.
An information theoretic interpretation of variational inference based on the mdl principle and the bits-back coding scheme, 2017. ,
Random generation of combinatorial structures from a uniform distribution, Theoretical Computer Science, vol.43, pp.186-188, 1986. ,
Principles of Bayesian inference using general divergence criteria, Advances in Neural Information Processing Systems (NeurIPS), pp.262-271, 2018. ,
A linear-time kernel goodness-of-fit test, Advances in Neural Information Processing Systems, pp.262-271, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01527717
An introduction to variational methods for graphical models, Machine Learning, vol.37, pp.183-233, 1999. ,
Efficient algorithms for online decision problems, Journal of Computer and System Sciences, vol.71, issue.3, pp.291-307, 2005. ,
A new approach to linear filtering and prediction problems, Transactions of the ASME-Journal of Basic Engineering, vol.82, pp.35-45, 1960. ,
Deep learning without poor local minima, Advances in Neural Information Processing Systems, vol.29, pp.586-594, 2016. ,
Effect of depth and width on local minima in deep learning, Neural Computation, vol.31, issue.6, pp.1462-1498, 2019. ,
Fast and scalable Bayesian deep learning by weight-perturbation in Adam, Proceedings of the 35th International Conference on Machine Learning, vol.80, pp.2611-2620, 2018. ,
Conjugate-computation variational inference: Converting variational inference in non-conjugate models to inferences in conjugate models, Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, vol.54, pp.878-887, 2017. ,
Fast yet simple natural-gradient descent for variational inference in complex models, 2018. ,
Auto-encoding variational Bayes, International Conference on Learning Representations, 2013. ,
, Generalized variational inference, 2019.
Robust moment estimation and improved clustering via sum of squares, Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, 2018. ,
, Adaptive Bayesian density estimation with location-scale mixtures, Electronic Journal of Statistics, vol.4, pp.1225-1257, 2010.
On some inequalities for the gamma function, Advances in Dynamical Systems and Applications, vol.8, pp.261-267, 2013. ,
Agnostic estimation of mean and covariance, Foundations of Computer Science (FOCS), 2016 IEEE 57th Annual Symposium, 2016. ,
Mémoire sur les approximations des formules qui sont fonctions de très grands nombres et sur leur applications aux probabilités, 1810. ,
Mémoire sur la probabilité des causes par les évènements. Mémoires de Mathematique et de Physique, Presentés à l'Académie Royale des Sciences, Par Divers Savans & Lus Dans ses Assemblées, pp.621-656, 1774. ,
On the assumptions used to prove asymptotic normality of maximum likelihood estimates, 1970. ,
Convergence of estimates under dimensionality restrictions, The Annals of Statistics, vol.1, pp.38-53, 1973. ,
On local and global properties in the theory of asymptotic normality of experiments, Stochastic processes and related topics (Proc. Summer Res. Inst. Statist. Inference for Stochastic Processes), vol.1, pp.13-54, 1974. ,
Robust classification via mom minimization, 2018. ,
Deep learning, Nature, vol.521, issue.7553, pp.436-444, 2015. ,
Gradient-based learning applied to document recognition, Proceedings of the IEEE, pp.2278-2324, 1998. ,
MONK: Outlier-robust mean embedding estimation by median-of-means, International Conference on Machine Learning, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-01705881
Information theory and mixing least-squares regressions, Information Theory, IEEE Transactions on, vol.52, pp.3396-3410, 2006. ,
Generative moment matching networks, International Conference on Machine Learning, pp.1718-1727, 2015. ,
The weighted majority algorithm, Information and Computation, vol.108, pp.212-261, 1994. ,
Jointly embedding multiple single-cell omics measurements, BioRxiv, p.644310, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-02444746
Théorèmes limites pour des suites positivement ou faiblement dépendantes, 1998. ,
Learning sparse neural networks through L0 regularization, International Conference on Learning Representations, 2018. ,
Risk minimization by median-of-means tournaments, 2016. ,
Mean estimation and regression under heavy-tailed distributions-a survey, 2019. ,
Model selection principles in misspecified models, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.76, issue.1, pp.141-167, 2013. ,
Bayesian methods for adaptive models, 1992. ,
A practical bayesian framework for backpropagation networks, Neural Computation, vol.4, issue.3, pp.448-472, 1992. ,
Lectures from the 33rd Summer School on Probability Theory, Lecture Notes in Mathematics, vol.1896, 2003. ,
Some PAC-Bayesian theorems, Machine Learning, vol.37, pp.355-363, 1999. ,
On the method of bounded differences, Surveys of Combinatorics. Mathematical Society Lecture Notes Series, vol.141, 1989. ,
Model-based clustering, Journal of Classification, vol.33, issue.3, pp.331-373, 2016. ,
Robust estimation via minimum distance methods, vol.55, pp.73-89, 1981. ,
Expectation propagation for approximate bayesian inference, Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence, UAI '01, pp.362-369, 2001. ,
Geometric median and robust estimation in banach spaces, Bernoulli, vol.21, pp.2308-2335, 2015. ,
Slang: Fast structured covariance approximations for bayesian deep learning with natural gradient, Advances in Neural Information Processing Systems, vol.31, pp.6245-6255, 2018. ,
Online linear optimization with the log-determinant regularizer, IEICE Transactions on Information and Systems, vol.101, issue.6, pp.1511-1520, 2018. ,
Kernel mean embedding of distributions: A review and beyond, Foundations and Trends® in Machine Learning, vol.10, pp.1-141, 2017. ,
Robust Bayesian inference via γ-divergence, Communications in Statistics-Theory and Methods, pp.1-18, 2019. ,
Variational learning for Gaussian mixture models, IEEE Transactions on Systems, Man, and Cybernetics, vol.36, pp.849-862, 2006. ,
Bayesian learning for neural networks, 1995. ,
Sampling from multimodal distributions using tempered transitions, Statistics and computing, vol.6, issue.4, pp.353-366, 1996. ,
Robust stochastic approximation approach to stochastic programming, SIAM Journal on optimization, vol.19, issue.4, pp.1574-1609, 2009. ,
URL : https://hal.archives-ouvertes.fr/hal-00976649
Problem complexity and method efficiency in optimization, 1983. ,
A PAC-bayesian approach to spectrally-normalized margin bounds for neural networks, International Conference on Learning Representations, 2018. ,
Online variational Bayesian inference: Algorithms for sparse gaussian processes and theoretical bounds, Time Series Workshop, 2017. ,
On the loss landscape of a class of deep neural networks with no bad local valleys, International Conference on Learning Representations, 2019. ,
Computational aspects of fitting mixture models via the expectation-maximization algorithm, Computational Statistics & Data Analysis, vol.56, issue.12, pp.3843-3864, 2012. ,
The variational Gaussian approximation revisited, Neural Computation, vol.21, pp.786-792, 2008. ,
Practical deep learning with bayesian principles, Advances in Neural Information Processing Systems, pp.4289-4301, 2019. ,
A mixture model approach to detecting differentially expressed genes with microarray data, Functional & Integrative Genomics, vol.3, pp.117-124, 2003. ,
K2-abc: Approximate Bayesian computation with kernel embeddings, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, vol.51, p.51, 2016. ,
Minimum distance estimation: a bibliography, Communications in Statistics: Theory and Methods, vol.10, issue.12, pp.1205-1224, 1981. ,
Minimum distance and robust estimation, Journal of the American Statistical Association, vol.75, issue.371, pp.616-624, 1980. ,
PAC-Bayes bounds with data dependent priors, Journal of Machine Learning Research, vol.13, pp.3507-3531, 2012. ,
Optimal approximation of piecewise smooth functions using deep relu neural networks, Neural Networks, 2017. ,
Computational optimal transport, Foundations and Trends® in Machine Learning, vol.11, pp.355-607, 2019. ,
On model selection, Lecture Notes-Monograph Series, vol.38, pp.1-57, 2001. ,
Probably approximate Bayesian computation: nonasymptotic convergence of abc under misspecification, 2017. ,
PAC-Bayesian AUC classification and scoring, Advances in Neural Information Processing Systems, vol.27, pp.658-666, 2014. ,
Inference and evaluation of the multinomial mixture model for text clustering, Information Processing & Management, vol.43, pp.1260-1280, 2007. ,
URL : https://hal.archives-ouvertes.fr/hal-00080133
On McDiarmid's concentration inequality, Electronic Communications in Probability, vol.18, 2013. ,
Asymptotic theory of weakly dependent random processes, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-02063543
Inégalités de Hoeffding pour les fonctions lipschitziennes de suites dépendantes, Comptes Rendus de l'Académie des Sciences - Series I - Mathematics, vol.330, issue.10, pp.905-908, 2000. ,
Modeling by shortest data description, Automatica, vol.14, issue.5, pp.465-471, 1978. ,
Posterior concentration rates for infinite dimensional exponential families, Bayesian Analysis, vol.7, issue.2, pp.311-334, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00634432
The Bayesian choice: from decision-theoretic foundations to computational implementation, 2007. ,
, Monte Carlo statistical methods, 2013.
Posterior concentration for sparse deep learning, Advances in Neural Information Processing Systems, vol.31, pp.930-941, 2018. ,
The power of deeper networks for expressing natural functions, 6th International Conference on Learning Representations, 2018. ,
A central limit theorem and a strong mixing condition, Proc. Natl. Acad. Sci. USA, vol.42, pp.43-47, 1956. ,
On the frequentist properties of bayesian nonparametric methods, Annual Review of Statistics and Its Application, vol.3, pp.211-231, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01252919
Learning Representations by Back-propagating Errors, Nature, vol.323, issue.6088, pp.533-536, 1986. ,
Optimal aggregation of affine estimators, Proceedings of the 24th Annual Conference on Learning Theory, vol.19, pp.635-660, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00654251
Online model selection based on the variational Bayes, Neural computation, vol.13, issue.7, pp.1649-1681, 2001. ,
, Nonparametric regression using deep neural networks with relu activation function. arXiv, 2017.
On Bayes procedures, Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol.4, pp.10-26, 1965. ,
Estimating the dimension of a model, The Annals of Statistics, vol.6, issue.2, pp.461-464, 1978. ,
Pac-bayesian analysis of contextual bandits, Advances in Neural Information Processing Systems, pp.1683-1691, 2011. ,
PAC-Bayesian analysis of co-clustering and beyond, Journal of Machine Learning Research, vol.11, pp.3595-3646, 2010. ,
Online learning and online convex optimization, Foundations and Trends® in Machine Learning, vol.4, pp.107-194, 2012. ,
A pac analysis of a bayesian estimator, Proceedings of the Tenth Annual Conference on Computational Learning Theory, COLT '97, pp.2-9, 1997. ,
Asymptotic normality of semiparametric and nonparametric posterior distributions, Journal of the American Statistical Association, vol.97, issue.457, pp.222-235, 2002. ,
Excess risk bounds for the bayes risk using variational inference in latent gaussian models, Advances in Neural Information Processing Systems, vol.30, pp.5151-5161, 2017. ,
Mastering the game of go without human knowledge, Nature, vol.550, p.354, 2017. ,
Batch and on-line parameter estimation of Gaussian mixtures based on the joint entropy, Advances in Neural Information Processing Systems 11, 1999. ,
Learning via Hilbert Space Embedding of Distributions, 2008. ,
Kernel belief propagation, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pp.707-715, 2011. ,
No bad local minima: Data independent training error guarantees for multilayer neural networks, 2016. ,
Dropout: A simple way to prevent neural networks from overfitting, Journal of Machine Learning Research, vol.15, pp.1929-1958, 2014. ,
Local minima and plateaus in hierarchical structures of multilayer perceptrons, Neural Networks, vol.13, 2000. ,
Bayesian inference of Gaussian mixture models with noninformative priors, 2014. ,
Shared segmentation of natural scenes using dependent Pitman-Yor processes, Advances in Neural Information Processing Systems, pp.1585-1592, 2009. ,
PAC-Bayesian bound for Gaussian process regression and multiple kernel additive model, Conference on Learning Theory, pp.8-9, 2012. ,
Fast generalization error bound of deep learning from a kernel perspective, Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, vol.84, pp.1397-1406, 2018. ,
Adaptivity of deep ReLU network for learning in Besov and mixed smooth Besov spaces: optimal rate and curse of dimensionality, International Conference on Learning Representations, 2019. ,
Calibrating general posterior credible regions, Biometrika, vol.106, issue.2, pp.479-486, 2018. ,
Algorithmic Theory of ODEs and Sampling from Well-conditioned Logconcave Densities, arXiv e-prints, 2018. ,
Spike and slab variational inference for multi-task and multiple kernel learning, Advances in Neural Information Processing Systems, vol.24, pp.2339-2347, 2011. ,
Minimax estimation of kernel mean embeddings, Journal of Machine Learning Research, vol.18, issue.1, pp.3002-3048, 2017. ,
Variational sparse coding, Conference on Uncertainty in Artificial Intelligence, 2019. ,
Normalized flat minima: Exploring scale invariant definition of flat minima for neural networks using PAC-Bayesian analysis, 2019. ,
Introduction to Nonparametric Estimation, 2008. ,
Mathematics and the picturing of data, Proceedings of the International Congress of Mathematicians, 1975. ,
Asymptotic Statistics, Cambridge Series in Statistical and Probabilistic Mathematics, 2000. ,
Rényi divergence and Kullback-Leibler divergence, IEEE Transactions on Information Theory, vol.60, issue.7, pp.3797-3820, 2014. ,
Principles of risk minimization for learning theory, Advances in Neural Information Processing Systems, pp.831-838, 1992. ,
Bayesian leave-one-out cross validation approximations for Gaussian latent variable models, Journal of Machine Learning Research, vol.17, 2014. ,
Understanding priors in Bayesian neural networks at the unit level, Proceedings of the 36th International Conference on Machine Learning, vol.97, pp.6458-6467, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-02177151
Wahrscheinlichkeitsrechnung, Vienna: Deuticke, 1931. ,
Aggregating strategies, Proceedings of the Third Annual Workshop on Computational Learning Theory, 1990. ,
On Bayesian consistency, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.63, issue.4, pp.811-821, 2001. ,
Online variational inference for the hierarchical Dirichlet process, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pp.752-760, 2011. ,
Frequentist consistency of variational Bayes, Journal of the American Statistical Association, pp.1-85, 2018. ,
Using Gaussian mixtures with unknown number of components for mixed model estimation, 14th International Workshop on Statistical Modeling, 1999. ,
The minimum distance method, The Annals of Mathematical Statistics, vol.28, issue.1, pp.75-88, 1957. ,
Optimal estimation of Gaussian mixtures via denoised method of moments, 2018. ,
Can the strengths of AIC and BIC be shared? A conflict between model identification and regression estimation, Biometrika, vol.92, issue.4, pp.937-950, 2005. ,
Error bounds for approximations with deep ReLU networks, Neural Networks, vol.94, 2016. ,
Rates of convergence of minimum distance estimators and Kolmogorov's entropy, The Annals of Statistics, pp.768-774, 1985. ,
Bayesian gradient descent: Online variational Bayes learning with increased robustness to catastrophic forgetting and weight pruning, 2018. ,
Understanding deep learning requires rethinking generalization, International Conference on Learning Representations, 2017. ,
Convergence rates of variational posterior distributions, 2017. ,
From ε-entropy to KL-entropy: Analysis of minimum information complexity density estimation, The Annals of Statistics, vol.34, issue.5, pp.2180-2210, 2006. ,
InfoVAE: Information maximizing variational autoencoders, arXiv:1706.02262, 2017. ,
Online convex programming and generalized infinitesimal gradient ascent, Proceedings of the Twentieth International Conference on Machine Learning, ICML'03, pp.928-935, 2003. ,