
MaLGa Seminar Series

We are involved in the organization of the MaLGa Seminar Series, in particular the seminars on Statistical Learning and Optimization. The MaLGa seminars are divided into four main threads: Statistical Learning and Optimization, Analysis and Learning, Machine Learning and Vision, and Machine Learning for Data Science.

An up-to-date list of ongoing seminars is available on the MaLGa webpage.

Seminars will be streamed on our YouTube channel.

Optimal transport and gradient flows

Speaker: Giuseppe Savaré
Speaker Affiliation: University of Pavia
Host: Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2019-03-20
Time: 3:00 pm (subject to variability)
Location: DIBRIS- Room 704, VII floor, via Dodecaneso 35, Genova, IT.

Abstract
Monge's Optimal Transport problem (1781) was reformulated by Kantorovich as a distinguished example of Linear Programming: his contributions “to the theory of optimum allocation of resources” were awarded the Nobel Prize in Economics in 1975. More recently, after the pioneering papers by Brenier and by Ambrosio, Evans, McCann, Otto, and Villani, Optimal Transport theory has attracted a lot of attention and has been developed in many directions, with beautiful applications to probability, statistics, kinetic models, measure theory, functional analysis, partial differential equations, and Riemannian geometry. After a brief introduction to the main aspects of the theory, the talk aims to discuss its dynamical formulation and its connection with evolution problems and gradient flows.
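
For reference (standard formulations, not taken verbatim from the talk), the Monge problem, its Kantorovich linear-programming relaxation, and the dynamical formulation the talk builds on can be written, for probability measures μ on X and ν on Y and a ground cost c, as:

```latex
% Monge (1781): optimize over transport maps T pushing \mu forward to \nu
\inf_{T:\,T_{\#}\mu=\nu}\ \int_X c\bigl(x,T(x)\bigr)\,d\mu(x)

% Kantorovich relaxation: a linear program over couplings (transport plans)
\inf_{\pi\in\Pi(\mu,\nu)}\ \int_{X\times Y} c(x,y)\,d\pi(x,y),
\qquad
\Pi(\mu,\nu)=\bigl\{\pi:\ \pi(\cdot\times Y)=\mu,\ \pi(X\times\cdot)=\nu\bigr\}

% Dynamical (Benamou--Brenier) formulation for the quadratic cost, the bridge
% towards evolution problems and gradient flows
W_2^2(\mu,\nu)=\inf\Bigl\{\int_0^1\!\!\int |v_t|^2\,d\rho_t\,dt\ :\
\partial_t\rho_t+\nabla\!\cdot(\rho_t v_t)=0,\ \rho_0=\mu,\ \rho_1=\nu\Bigr\}
```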

Bio
Giuseppe Savaré has been professor of Mathematical Analysis at the University of Pavia since 2000. His current research interests involve Optimal Entropy-Transport problems and variational methods for gradient flows and rate-independent evolutions.

Overview on speech processing: fundamentals, techniques and applications

Speaker: Zied Mnasri
Speaker Affiliation: DIBRIS
Host: Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2019-02-19
Time: 3:00 pm (subject to variability)
Location: DIBRIS- Conference Hall, III floor, via Dodecaneso 35, Genova, IT.

Abstract
Nowadays applications are more and more interactive, which means that optimal human-machine interaction is required, and speech is an obvious and natural way to achieve this goal. However, many challenges remain, such as speech/speaker variability, noise and reverberation reduction, and emotion recognition/synthesis. Therefore, speech processing applications have been taking advantage of developments in many related fields, such as digital signal processing (DSP), computational linguistics, and, more recently, machine learning. In this talk, an introduction to speech processing will be presented, including a) fundamental aspects, such as speech signal modeling and representation, b) an overview of speech signal analysis methods, c) some well-known applications, like speech recognition and text-to-speech synthesis, and d) current and future trends of speech processing using machine learning. Though the topic is too broad to be covered in depth here, the basic notions presented can be deepened in every direction, to explore the techniques and possibilities of speech communication.
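
As a minimal illustration of the frame-based representations most speech analysis starts from (a generic sketch with parameter choices of my own, not material from the talk), the following computes a short-time magnitude spectrum with NumPy:

```python
# Minimal sketch: split a signal into overlapping windowed frames and take the
# magnitude FFT of each frame, the standard starting point of speech analysis.
import numpy as np

def spectrogram(signal, sample_rate=16000, frame_ms=25, hop_ms=10):
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop_len)
    window = np.hanning(frame_len)
    frames = np.stack([
        signal[i * hop_len : i * hop_len + frame_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))   # shape: (n_frames, frame_len // 2 + 1)

# Example on a synthetic 440 Hz tone sampled at 16 kHz
t = np.arange(16000) / 16000.0
print(spectrogram(np.sin(2 * np.pi * 440 * t)).shape)
```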

Bio
Dr. Zied Mnasri obtained his Ph.D. in electrical engineering in February 2011. Since September 2011, he has been working as an assistant professor at the University of Tunis El Manar, Tunisia, teaching electronics, microcontrollers and digital signal processing. Since September 2018, he has been a research fellow at DIBRIS, working with Prof. Francesco Masulli and Prof. Stefano Rovetta. His main research interests include speech processing using machine learning.

The multi-agent approach to artificial general intelligence

Speaker: Andrea Tacchetti
Speaker Affiliation: DeepMind
Host: Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2019-02-18
Time: 3:00 pm (subject to variability)
Location: DIBRIS- Conference Hall, III floor, via Dodecaneso 35, Genova, IT.

Abstract
Human-inspired AI aims at constructing agents that are capable of completing complex tasks in diverse environments and exhibit human-like cognitive flexibility. However, the human capacity for learning is not nearly as adaptive as we usually assume: individual humans cannot even figure out how to survive in our own ancestral ecological niche (hunting and gathering). We owe our success to our uniquely developed ability to learn from others. In this talk I will argue that this observation can be leveraged to advance artificial intelligence research through multi-agent reinforcement learning (MARL). I will outline the main advantages and key challenges of this approach, and I will survey recent results from the MARL research group at DeepMind. The technical part of my talk will focus on multi-agent learning dynamics, social dilemmas, emergent coordination and social influence.

Bio
Andrea Tacchetti is a Research Scientist at DeepMind in London, UK. His research spans Multi-agent Reinforcement Learning, Social Perception and Relational Reasoning. Before joining DeepMind he obtained his PhD from MIT under the supervision of Prof. Tomaso Poggio. In 2018 Dr. Tacchetti received the APS-select award for distinction in scholarship from the American Physiological Society.

On the Existence and on the Role of Wide Flat Minima in Deep Learning

Speaker: Riccardo Zecchina
Speaker Affiliation: Bocconi University
Host: Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2019-02-05
Time: 2:30 pm
Location: DIBRIS - room 705, VII floor, via Dodecaneso 35, Genova, IT.

Abstract
In this talk we will try to answer the following question: where does the propensity to learn efficiently and to generalize well come from in large-scale artificial neural networks (ANNs)? These two properties, which in principle are independent and often in competition, are known to coexist in deep learning, and yet a unifying theoretical framework is missing. We discuss two fundamental features which could represent the building blocks of such a theory: we first show that ANNs possess the peculiar structural property of having very wide flat minima in their weight space, which are crucial to achieve good generalization performance and avoid overfitting. These regions are rare and coexist with narrow minima and saddles. Second, we show that the “deep learning” algorithms which have been developed in the last decade do in fact target those rare regions.
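
As a toy illustration of the difference between wide flat and narrow minima (a sketch under my own assumptions, not the speaker's method), one can probe a minimum by averaging the loss over random perturbations of the weights at increasing radii; a wide flat minimum keeps the loss low over large radii, a narrow one does not:

```python
# Probe the "flatness" of a minimum: average the loss over random weight
# perturbations of a given radius (a crude local-energy estimate).
import numpy as np

def perturbed_loss(loss_fn, weights, radius, n_samples=50, seed=0):
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        direction = rng.standard_normal(weights.shape)
        direction *= radius / np.linalg.norm(direction)
        total += loss_fn(weights + direction)
    return total / n_samples

# Toy 1-d landscape with a wide minimum at w = 3 and a narrow one at w = -3
loss = lambda w: min((w[0] - 3.0) ** 2, 50.0 * (w[0] + 3.0) ** 2)
for center in (np.array([3.0]), np.array([-3.0])):
    print(center, [round(perturbed_loss(loss, center, r), 2) for r in (0.1, 0.5, 1.0)])
```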

Bio
The research interests of Riccardo Zecchina (RZ) lie at the interface between statistical physics, computer science and information theory; his current research is focused on machine learning and optimization. RZ obtained a master's degree in Electronic Engineering from the Politecnico di Torino in 1989 and a PhD in theoretical Physics from the University of Torino in 1993. He was research scientist and head of the Statistical Physics Group at the International Centre for Theoretical Physics in Trieste (Italy) between 1997 and 2007. In 2007 he moved as full professor to the Politecnico di Torino. He has been a visiting scientist at Microsoft Research (Redmond and Boston) and a visiting professor at the University of Orsay. Since 2017 he has been full professor at Bocconi University in Milan, with a chair in Machine Learning; a new lab on Artificial Intelligence was created in 2019 (www.artlab.unibocconi.it). RZ's papers can be found on the arXiv or on Scholar; they have been published in multidisciplinary scientific journals such as Nature, Science, PNAS and Physical Review Letters, and in specialized journals in theoretical physics, computer science and applied mathematics. International awards: the 2016 Lars Onsager Prize in Theoretical Statistical Physics of the American Physical Society, for the design of new classes of efficient algorithms and for the study of phase transitions in optimization problems; and a 2011 Advanced Grant of the European Research Council (ERC) for the project “Optimization and inference algorithms from the theory of disordered systems” (2010-2015).

Weak interactions

Speaker: Andreas Maurer
Host: Ernesto De Vito and Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2018-12-11
Time: 3:00 pm
Location: DIBRIS, University of Genova, Room 705, 7th floor, Via Dodecaneso 35, Genova.

Abstract
The talk is about a class of statistics with the following second-order stability property: their variation in any variable does not change much when another variable is modified. Besides linear statistics, this class contains U- and V-statistics, Lipschitz L-statistics and some extremal estimators. Respective examples relevant to machine learning are the area under the ROC curve, smoothed quantiles and various error functionals of l_2-regularized algorithms. These statistics behave in many ways like linear statistics: they obey a version of Bernstein's inequality, their variances can be tightly estimated from iid samples, and there is a Berry-Esseen type bound on normal approximation. There is even a uniform concentration result, which generalizes the popular generalization bounds obtained with Gaussian and Rademacher complexities.
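
One natural way to write the stability property described above (my own notational sketch, not necessarily the speaker's exact definition): for a statistic $f:\mathcal{X}^n\to\mathbb{R}$, let $x^{k\leftarrow y}$ denote $x$ with its $k$-th coordinate replaced by $y$, and let $D_k^{y}f(x)=f(x)-f(x^{k\leftarrow y})$ measure the variation in the $k$-th variable. Second-order stability then asks that the mixed second-order variations be an order of magnitude smaller than the first-order ones, for instance

```latex
\bigl|D_k^{y} f(x)\bigr| \;\le\; \frac{b}{n}
\qquad\text{and}\qquad
\bigl|D_l^{y'}\, D_k^{y} f(x)\bigr| \;\le\; \frac{\beta}{n^{2}}
\qquad\text{for all } k\neq l \text{ and all } x,\,y,\,y',
```

which holds with $\beta=0$ for linear statistics $f(x)=\tfrac1n\sum_i g(x_i)$, and with the stated $1/n^2$ scaling, for example, for U-statistics of order two with a bounded kernel.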

Advances on first order algorithms for constrained optimization problems in Machine Learning

Speaker: Francesco Rinaldi
Speaker Affiliation: University of Padova
Host: Lorenzo Rosasco and Silvia Villa
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2018-10-19
Time: 2:00 pm
Location: DIBRIS, University of Genova, Room 326, 3rd floor, Via Dodecaneso 35, Genova.

Abstract
Thanks to the advent of the Big Data era, simple iterative first-order optimization approaches for constrained optimization have regained popularity in the last few years. In the talk, we first review a few classic methods (namely, the conditional and projected gradient methods) in the context of Big Data applications. Then, we discuss both theoretical and computational aspects of some variants of those classic methods. Finally, we examine current challenges and future research perspectives.
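
To make the two classic methods mentioned above concrete, here is a toy sketch (my own example, not code from the talk) of projected gradient and conditional gradient (Frank-Wolfe) iterations for least squares over the probability simplex:

```python
# Toy comparison of projected gradient vs. conditional gradient (Frank-Wolfe)
# for min_{x in simplex} 0.5 * ||A x - b||^2.
import numpy as np

def project_simplex(v):
    """Euclidean projection onto the probability simplex (sorting-based)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def projected_gradient(A, b, steps=200):
    x = np.ones(A.shape[1]) / A.shape[1]
    lr = 1.0 / np.linalg.norm(A, 2) ** 2           # 1 / Lipschitz constant of the gradient
    for _ in range(steps):
        x = project_simplex(x - lr * A.T @ (A @ x - b))
    return x

def frank_wolfe(A, b, steps=200):
    x = np.ones(A.shape[1]) / A.shape[1]
    for k in range(steps):
        grad = A.T @ (A @ x - b)
        s = np.zeros_like(x)
        s[np.argmin(grad)] = 1.0                   # linear minimization oracle over the simplex
        x += 2.0 / (k + 2) * (s - x)               # classic diminishing step size
    return x

rng = np.random.default_rng(0)
A, b = rng.standard_normal((50, 10)), rng.standard_normal(50)
for method in (projected_gradient, frank_wolfe):
    x = method(A, b)
    print(method.__name__, round(0.5 * np.linalg.norm(A @ x - b) ** 2, 4))
```

The point of the comparison is structural: the projected method needs a projection onto the feasible set at every step, while Frank-Wolfe only needs a linear minimization oracle, which is what makes it attractive for the structured constraint sets common in Big Data applications.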

Bio
Francesco Rinaldi is currently an Associate Professor with the Department of Mathematics, University of Padova, Padova, Italy. He received the M.S. degree in computer engineering and the Ph.D. degree in operations research from the Sapienza University of Rome Italy, in 2005 and 2009, respectively. His current research interests include nonlinear optimization, data mining, and machine learning. He has published over 35 papers in top academic journals including SIAM Journal on Optimization, Mathematical Programming Computation, Mathematics of Operations Research, Bioinformatics, IEEE Transactions, Computational Optimization and Applications, Optimization Methods and Software.

Ranking Median Regression: Learning to Order through Local Consensus

Speaker: Anna Korba
Speaker Affiliation: Télécom ParisTech
Host: Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2018-06-27
Time: 3:00 pm (subject to variability)
Location: DIBRIS- Conference Hall, III floor, via Dodecaneso 35, Genova, IT.

Abstract
In this talk, I will present our recent work, which is devoted to the problem of predicting the value taken by a random ranking/permutation Sigma (e.g. describing the preferences of an individual over a set of items), based on the observation of an explanatory random variable X (e.g. characteristics of the individual). In the probabilistic formulation of the Learning to Order problem we propose, which extends the framework we previously developed for Kemeny ranking aggregation, this boils down to recovering conditional Kemeny medians of Sigma given X. For this reason, this statistical learning problem is referred to as ranking median regression here. Our contribution is twofold. We first propose a probabilistic theory of ranking median regression: the set of optimal elements is characterized, the performance of empirical risk minimizers is investigated in this context, and situations where fast learning rates can be achieved are also exhibited. Next we introduce the concept of local consensus/median, in order to derive efficient methods for ranking median regression. The major advantage of this local learning approach lies in its close connection with the ranking aggregation problem. From an algorithmic perspective, this makes it possible to build predictive rules for ranking median regression by implementing efficient techniques for Kemeny median computations at a local level in a tractable manner. In particular, versions of k-nearest neighbor and tree-based methods, tailored to ranking median regression, are investigated.
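
A minimal sketch of the local-consensus idea (my own toy implementation, not the authors' code): the prediction at a query point is the Kemeny median, i.e. the permutation minimizing the sum of Kendall-tau distances, computed over the rankings of the k nearest neighbors (brute force over permutations, so only for a handful of items):

```python
# k-NN ranking median regression on toy data: predict at x by the Kemeny median
# (brute force) of the rankings observed at the k nearest neighbors.
import numpy as np
from itertools import permutations

def kendall_tau(sigma, tau):
    """Number of item pairs ordered differently by the two rankings."""
    n = len(sigma)
    return sum((sigma.index(i) < sigma.index(j)) != (tau.index(i) < tau.index(j))
               for i in range(n) for j in range(i + 1, n))

def kemeny_median(rankings, n_items):
    return min(permutations(range(n_items)),
               key=lambda pi: sum(kendall_tau(list(pi), r) for r in rankings))

def knn_ranking_median(X, rankings, x_query, k, n_items):
    idx = np.argsort(np.linalg.norm(X - x_query, axis=1))[:k]
    return kemeny_median([rankings[i] for i in idx], n_items)

# Toy data: small x prefers the order [0, 1, 2], large x prefers [2, 1, 0]
X = np.array([[0.1], [0.2], [0.3], [0.8], [0.9], [1.0]])
rankings = [[0, 1, 2], [0, 2, 1], [0, 1, 2], [2, 1, 0], [2, 0, 1], [2, 1, 0]]
print(knn_ranking_median(X, rankings, np.array([0.15]), k=3, n_items=3))  # -> (0, 1, 2)
```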

Bio
I am currently a third-year PhD student at Télécom ParisTech in Paris, France, under the supervision of Stephan Clémençon in the S2A (Signal, Statistics and Learning) team. In 2015, I graduated from the MVA Master's program (Machine Learning and Computer Vision) at ENS Cachan and obtained the engineering degree of ENSAE ParisTech. My main line of research is in statistical machine learning. In my PhD, I study how to analyze preference data in the form of total orders/permutations or pairwise comparisons, and how to handle statistical problems related to such data, including ranking aggregation, distribution estimation, and prediction.

Multi-penalty regularization

Speaker: Rastogi Abhishake
Speaker Affiliation: University of Potsdam
Host: Lorenzo Rosasco and Nicole Muecke
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2018-06-01
Time: 3:00 pm (subject to variability)
Location: DIBRIS- Conference Hall, III floor, via Dodecaneso 35, Genova, IT.

Abstract
In this talk, I will discuss convergence issues for regularization algorithms in learning theory. It is well known that Tikhonov regularization can be profitably used in the context of supervised learning. In the semi-supervised framework, manifold regularization is an approach which exploits the geometry of the marginal distribution. We propose a more general multi-penalty regularization in the semi-supervised framework and establish optimal convergence rates in the vector-valued function setting. A theoretical analysis of the performance of one-parameter regularization and of multi-penalty regularization over the reproducing kernel Hilbert space will be discussed under a general smoothness assumption. The optimal rates of the regularization schemes are established under some prior assumptions on the joint probability measure on the sample space. Finally, I will discuss an aggregation approach based on the linear functional strategy to combine various estimators. This is joint work with Prof. Dr. S. Sivananthan, IIT Delhi.
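
For orientation, a standard form of manifold-type multi-penalty regularization (written in my own notation rather than taken from the talk) combines an empirical risk over the n labelled pairs (x_i, y_i) with an RKHS penalty and a graph-Laplacian penalty built from labelled and unlabelled inputs:

```latex
\hat f \;=\; \operatorname*{arg\,min}_{f\in\mathcal H_K}\;
\frac{1}{n}\sum_{i=1}^{n}\bigl(f(x_i)-y_i\bigr)^{2}
\;+\;\lambda_{1}\,\|f\|_{\mathcal H_K}^{2}
\;+\;\lambda_{2}\,\mathbf f^{\top}\! L\,\mathbf f ,
```

where $\mathbf f=(f(x_1),\dots,f(x_{n+u}))$ collects the values of $f$ on the $n$ labelled and $u$ unlabelled points, $L$ is the graph Laplacian, and $\lambda_1,\lambda_2\ge 0$ are the two regularization parameters whose joint choice drives the convergence rates.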

Bio
Rastogi Abhishake received his M.Sc. (2013) and Ph.D. (2017) from the Indian Institute of Technology Delhi, India; his PhD thesis was titled "Convergence analysis of regularization algorithms in learning theory". Since March 2018 he has been a postdoctoral researcher at the University of Potsdam, Germany. His research interests include learning theory, regularization algorithms and kernel methods. He is working on the SFB project "Nonlinear statistical inverse problems with random observations" and is the author of international publications in refereed journals and conferences.

Fit without fear: an interpolation perspective on modern deep and shallow learning

Speaker: Mikhail Belkin
Speaker Affiliation: Ohio State University
Host: Lorenzo Rosasco and Gian Maria Marconi
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2018-05-24
Time: 3:00 pm (subject to variability)
Location: DIBRIS- Conference Hall, III floor, via Dodecaneso 35, Genova, IT.

Abstract
A striking feature of modern supervised machine learning is its pervasive over-parametrization. Deep networks contain millions of parameters, often exceeding the number of data points by orders of magnitude. These networks are trained to nearly interpolate the data by driving the training error to zero. Yet, at odds with most theory, they show excellent test performance. It has become accepted wisdom that these properties are special to deep networks and require non-convex analysis to understand. In this talk I will show that classical (convex) kernel methods do, in fact, exhibit these unusual properties. Moreover, kernel methods provide a competitive practical alternative to deep learning, after we address the non-trivial challenges of scaling to modern big data. I will also present theoretical and empirical results indicating that we are unlikely to make progress on understanding deep learning until we develop a fundamental understanding of classical shallow kernel classifiers in the modern interpolated setting. Finally, I will show that the ubiquitously used stochastic gradient descent (SGD) is very effective at driving the training error to zero in the interpolated regime, a finding that sheds light on the effectiveness of modern methods and provides specific guidance for parameter selection. These results present a perspective and a challenge. Much of the success of modern learning comes into focus when considered from an over-parametrization and interpolation point of view. The next step is to address the basic question of why classifiers in the modern interpolated setting generalize so well to unseen data. Kernel methods provide both a compelling set of practical algorithms and an analytical platform for resolving this fundamental issue. Based on joint work with Siyuan Ma, Raef Bassily, Daniel Hsu, Partha Mitra and Soumik Mandal.
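
A minimal sketch of the interpolation phenomenon the abstract refers to (my own toy example): kernel "ridgeless" regression, i.e. kernel regression with the regularization parameter driven to essentially zero, fits noisy training labels exactly yet can still predict reasonably on test points:

```python
# Kernel "ridgeless" regression on toy 1-d data: interpolate noisy labels with
# a Laplacian kernel and check the error on held-out points.
import numpy as np

def laplacian_kernel(A, B, bandwidth=0.5):
    d = np.abs(A[:, None, :] - B[None, :, :]).sum(-1)
    return np.exp(-d / bandwidth)

rng = np.random.default_rng(0)
X_train = rng.uniform(-3, 3, size=(80, 1))
y_train = np.sin(X_train[:, 0]) + 0.3 * rng.standard_normal(80)     # noisy labels
X_test = rng.uniform(-3, 3, size=(500, 1))
y_test = np.sin(X_test[:, 0])

K = laplacian_kernel(X_train, X_train)
alpha = np.linalg.solve(K + 1e-12 * np.eye(80), y_train)            # ~zero ridge: interpolation
test_pred = laplacian_kernel(X_test, X_train) @ alpha
print("train MSE:", round(float(np.mean((K @ alpha - y_train) ** 2)), 8))  # ~0: interpolates
print("test  MSE:", round(float(np.mean((test_pred - y_test) ** 2)), 4))   # moderate despite fitting the noise
```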

Bio
Mikhail Belkin is a Professor in the Department of Computer Science and Engineering and the Department of Statistics at the Ohio State University. He received a PhD in Mathematics from the University of Chicago in 2003. His research focuses on understanding the fundamental structure in data, the principles of recovering these structures, and their computational, mathematical and statistical properties. This understanding, in turn, leads to algorithms for dealing with real-world data. A number of his algorithms, including Laplacian Eigenmaps, have been widely used in applications. Prof. Belkin is a recipient of an NSF CAREER Award and a number of best paper and other awards. He has served on the editorial boards of the Journal of Machine Learning Research and IEEE PAMI.

Learning from random moments

Speaker: Remi Gribonval
Speaker Affiliation: INRIA - Institut National de Recherche en Informatique et en Automatique
Host: Lorenzo Rosasco and Gian Maria Marconi
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2018-04-08
Time: 3:00 pm (subject to variability)
Location: DIBRIS- Conference Hall, III floor, via Dodecaneso 35, Genova, IT.

Abstract
The talk will outline the main features of a recent framework for large-scale learning called compressive statistical learning. Inspired by compressive sensing, the framework allows drastic volume and dimension reduction to learn from large/distributed/streamed data collections. Its principle is to compute a low-dimensional (nonlinear) sketch (a vector of random empirical generalized moments), in essentially one pass over the training collection. For certain learning problems, small sketches have been shown to capture the information relevant to the considered learning task, and empirical learning algorithms have been proposed to learn from such sketches. As a proof of concept, more than a thousand hours of speech recordings can be distilled into a sketch of only a few kilobytes, while capturing enough information to estimate a Gaussian Mixture Model for speaker verification. The framework, which is endowed with statistical guarantees in terms of learning error, will be illustrated on sketched clustering and sketched PCA, using empirical algorithms inspired by sparse recovery algorithms used in compressive sensing. Finally, we will discuss the promises of the framework in terms of privacy-aware learning, and its connections with information preservation along pooling layers of certain convolutional neural networks. Joint work with Nicolas Keriven (ENS Paris, France), Yann Traonmilin (Univ. Bordeaux, France) and Gilles Blanchard (Universität Potsdam, Germany).
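
A minimal sketch of the sketching step itself (my own illustration, using random Fourier moments as the generalized moments; the actual framework and its recovery algorithms are of course richer): the whole dataset is compressed, in one pass, into a single fixed-size vector of empirical averages:

```python
# Compress a dataset into a fixed-size vector of random empirical moments:
# sketch_j = (1/N) * sum_i exp(i <w_j, x_i>), i.e. the empirical characteristic
# function sampled at random frequencies w_j.
import numpy as np

def compute_sketch(X, frequencies):
    """One pass over the data; output size depends only on len(frequencies)."""
    return np.exp(1j * X @ frequencies.T).mean(axis=0)

rng = np.random.default_rng(0)
dim, m = 5, 64                                   # data dimension, sketch size
W = rng.standard_normal((m, dim))                # random frequencies, drawn once and kept fixed

# Two datasets from the same distribution give nearly the same 64-number sketch,
# even though each contains 100000 x 5 numbers.
X1 = rng.standard_normal((100_000, dim)) + 1.0
X2 = rng.standard_normal((100_000, dim)) + 1.0
s1, s2 = compute_sketch(X1, W), compute_sketch(X2, W)
print("sketch size:", s1.size,
      " distance between sketches:", round(float(np.linalg.norm(s1 - s2)), 4),
      " sketch norm:", round(float(np.linalg.norm(s1)), 4))
```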

Decentralized Dictionary Learning over Time-Varying Digraphs

Speaker: Francisco Facchinei
Speaker Affiliation: Università La Sapienza
Host: Lorenzo Rosasco and Gian Maria Marconi
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2018-03-26
Time: 3:00 pm (subject to variability)
Location: DIBRIS- Conference Hall, III floor, via Dodecaneso 35, Genova, IT.

Abstract
This paper studies Dictionary Learning problems wherein the learning task is distributed over a multi-agent network, modeled as a time-varying directed graph. This formulation is relevant, for instance, in Big Data scenarios where massive amounts of data are collected/stored in different locations (e.g., sensors, clouds) and aggregating and/or processing all data in a fusion center might be inefficient or unfeasible, due to resource limitations, communication overheads or privacy issues. We develop a unified decentralized algorithmic framework for this class of nonconvex problems, and we establish its asymptotic convergence to stationary solutions. The new method hinges on Successive Convex Approximation techniques, coupled with a decentralized tracking mechanism aiming at locally estimating the gradient of the smooth part of the sum-utility. To the best of our knowledge, this is the first provably convergent decentralized algorithm for Dictionary Learning and, more generally, bi-convex problems over (time-varying di)graphs.
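
For orientation, a generic decentralized gradient-tracking update of the kind the abstract alludes to reads as follows (my own schematic notation, for plain gradients and a fixed doubly stochastic mixing matrix; the algorithm of the talk couples such tracking with Successive Convex Approximation and handles time-varying directed graphs): each agent i keeps a local copy x_i of the shared variable and a tracker y_i of the average gradient, and at every round mixes them with its neighbors through weights w_{ij},

```latex
x_i^{k+1} \;=\; \sum_{j} w_{ij}\,\bigl(x_j^{k} - \alpha\, y_j^{k}\bigr),
\qquad
y_i^{k+1} \;=\; \sum_{j} w_{ij}\, y_j^{k}
\;+\;\nabla f_i\bigl(x_i^{k+1}\bigr)-\nabla f_i\bigl(x_i^{k}\bigr),
```

so that, with the initialization $y_i^{0}=\nabla f_i(x_i^{0})$, the network average of the trackers $y_i^{k}$ always equals the network average of the local gradients $\nabla f_i(x_i^{k})$, which is exactly the quantity each agent is trying to estimate locally.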

Iterate averaging as regularization for stochastic gradient descent

Speaker: Gergely Neu
Speaker Affiliation: Universitat Pompeu Fabra
Host: Lorenzo Rosasco and Gian Maria Marconi
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2018-03-19
Time: 3:00 pm (subject to variability)
Location: DIBRIS- Conference Hall, III floor, via Dodecaneso 35, Genova, IT.

Learning Hierarchical Representations of Relational Data

Speaker: Maximilian Nickel
Speaker Affiliation: Facebook AI Research New York
Host: Lorenzo Rosasco and Gian Maria Marconi
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2018-03-02
Time: 3:00 pm (subject to variability)
Location: DIBRIS- Conference Hall, III floor, via Dodecaneso 35, Genova, IT.

Random Feature Expansions for Deep Gaussian Processes

Speaker: Maurizio Filippone
Speaker Affiliation: Eurecom University - AXA
Host: Lorenzo Rosasco and Gian Maria Marconi
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2018-02-02
Time: 11:30 am
Location: DIBRIS- Conference Hall, III floor, via Dodecaneso 35, Genova, IT.

Abstract
Drawing meaningful conclusions on the way complex real-life phenomena work, and being able to predict the behavior of systems of interest, requires developing accurate and highly interpretable mathematical models whose parameters need to be estimated from observations. In modern applications, however, we are often challenged with the lack of such models, and even when these are available they are too computationally demanding to be suitable for standard parameter optimization/inference methods. While probabilistic models based on Deep Gaussian Processes (DGPs) offer attractive tools to tackle these challenges in a principled way and to allow for a sound quantification of uncertainty, carrying out inference for these models poses huge computational challenges that arguably hinder their wide adoption. In this talk, I will present our contribution to the development of practical and scalable inference for DGPs, which can exploit distributed and GPU computing. In particular, I will introduce a formulation of DGPs based on random features that we infer using stochastic variational inference. Through a series of experiments, I will illustrate how our proposal enables scalable deep probabilistic nonparametric modeling and significantly advances the state-of-the-art on inference methods for DGPs.
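
A minimal sketch of the random-feature idea underlying the talk (my own toy construction, with weights simply drawn at random rather than learned by stochastic variational inference as in the actual work): each GP layer with an RBF kernel is replaced by a finite random Fourier feature map followed by a linear map, and layers are stacked by composition:

```python
# Forward pass of a two-layer "deep GP"-style model approximated with random
# Fourier features: each layer is x -> W_out @ phi(x), where phi is built from
# random frequencies so that <phi(x), phi(x')> approximates an RBF kernel.
import numpy as np

def rff(X, Omega):
    """Random Fourier features approximating a squared-exponential kernel."""
    Z = X @ Omega
    return np.concatenate([np.cos(Z), np.sin(Z)], axis=1) / np.sqrt(Omega.shape[1])

rng = np.random.default_rng(0)
n, d_in, d_hidden, n_feat = 200, 3, 2, 100
X = rng.standard_normal((n, d_in))

Omega1 = rng.standard_normal((d_in, n_feat))                      # layer-1 frequencies
H = rff(X, Omega1) @ rng.standard_normal((2 * n_feat, d_hidden))  # layer-1 output (hidden layer)

Omega2 = rng.standard_normal((d_hidden, n_feat))                  # layer-2 frequencies
f = rff(H, Omega2) @ rng.standard_normal((2 * n_feat, 1))         # layer-2 output (scalar function)
print(f.shape)   # (200, 1): one prior-like function value per input point
```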

Iterative regularization for general inverse problems

Speaker: Guillaume Garrigos
Speaker Affiliation: CNRS, École Normale Supérieure
Host: Lorenzo Rosasco and Gian Maria Marconi
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2018-01-29
Time: 02:00 pm
Location: DIBRIS- Conference Hall, III floor, via Dodecaneso 35, Genova, IT.

Abstract
In the context of linear inverse problems, we propose and study a general iterative regularization method allowing one to consider large classes of regularizers and data-fit terms. We were particularly motivated by dealing with non-smooth data-fit terms, such as a Kullback-Leibler divergence or an L1 distance. We treat these problems with an algorithm, based on a primal-dual diagonal descent method, designed to solve hierarchical optimization problems. The key point of our approach is that, in the presence of noise, the number of iterations of the algorithm acts as a regularization parameter. In practice this means that the algorithm must be stopped after a certain number of iterations. This is what is called regularization by early stopping, an approach which has gained popularity in statistical learning. Our main results establish convergence and stability of the algorithm, and are illustrated by experiments on image denoising, comparing our approach with the more classical Tikhonov regularization method.
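
To illustrate early stopping as regularization in the simplest possible setting (a plain Landweber/gradient iteration on a least-squares data fit; a toy stand-in for, not an implementation of, the primal-dual method of the talk): the reconstruction error typically decreases and then increases with the iteration count, so the stopping time plays the role of the regularization parameter:

```python
# Early stopping on a toy 1-d deblurring problem: gradient descent (Landweber
# iteration) on ||A x - y||^2, tracking the error to the true signal.
import numpy as np

n = 100
idx = np.arange(n)
A = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / 3.0) ** 2)     # ill-conditioned blur operator
A /= A.sum(axis=1, keepdims=True)

x_true = np.zeros(n)
x_true[30:50] = 1.0
x_true[70:75] = 2.0                                               # piecewise-constant signal
rng = np.random.default_rng(0)
y = A @ x_true + 0.05 * rng.standard_normal(n)                    # blurred and noisy data

x = np.zeros(n)
step = 1.0 / np.linalg.norm(A, 2) ** 2
errors = []
for t in range(20000):
    x -= step * A.T @ (A @ x - y)                                 # Landweber / gradient step
    errors.append(np.linalg.norm(x - x_true))

best = int(np.argmin(errors))
print("best stopping time:", best, " error:", round(errors[best], 3))
print("error after 20000 iterations:", round(errors[-1], 3))      # typically worse: semiconvergence
```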

Stochastic Modelling of Urban Structure

Speaker: Mark Girolami
Speaker Affiliation: The Alan Turing Institute and Imperial College London
Host: Lorenzo Rosasco and Gian Maria Marconi
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2018-01-22
Time: 03:00 pm
Location: DIBRIS- Conference Hall, III floor, via Dodecaneso 35, Genova, IT.

Abstract
Urban systems are complex in nature and comprise a large number of individuals that act according to utility, a measure of net benefit pertaining to preferences. The actions of individuals give rise to an emergent behaviour, creating the so-called urban structure that we observe. In this talk, I develop a stochastic model of urban structure to formally account for uncertainty arising from the complex behaviour. We further use this stochastic model to infer the components of a utility function from observed urban structure. This is a more powerful modelling framework in comparison to the ubiquitous discrete choice models, which are of limited use for complex systems in which the overall preferences of individuals are difficult to ascertain. We model urban structure as a realization of a Boltzmann distribution that is the invariant distribution of a related stochastic differential equation (SDE) describing the dynamics of the urban system. Our specification of the Boltzmann distribution assigns higher probability to stable configurations, in the sense that consumer surplus (demand) is balanced with running costs (supply), as characterized by a potential function. We specify a Bayesian hierarchical model to infer the components of a utility function from observed structure. Our model is doubly intractable and poses significant computational challenges that we overcome using recent advances in Markov chain Monte Carlo (MCMC) methods. We demonstrate our methodology with case studies on the London retail system and airports in England.
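
For reference, the kind of construction described above can be written schematically as follows (my own condensed notation, not the exact model of the talk): the urban structure x is modelled as a draw from a Boltzmann distribution whose potential V_θ balances consumer surplus against running costs, and which is the invariant distribution of an associated Langevin-type SDE,

```latex
p(x \mid \theta) \;\propto\; \exp\bigl(-\gamma\, V_{\theta}(x)\bigr),
\qquad
dX_t \;=\; -\nabla V_{\theta}(X_t)\,dt \;+\; \sqrt{2\gamma^{-1}}\;dB_t ,
```

so that inferring the components of the utility function amounts to Bayesian inference on θ; since the normalizing constant of p(x | θ) is itself intractable, the resulting posterior is doubly intractable, which is where the advanced MCMC machinery comes in.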

Optimization Challenges in Deep Learning

Speaker: Benjamin Recht
Speaker Affiliation: University of California, Berkeley
Host: Lorenzo Rosasco and Ernesto De Vito
Host Affiliation: DIBRIS, Universita' di Genova; Laboratory for Computational and Statistical Learning, MIT-IIT; DIMA, Universita' di Genova

Date: 2017-04-21
Time: 3:00 pm
Location: Conference Room 363, DIBRIS Valletta Puggia. Via Dodecaneso 35, Genova, IT.

Abstract
When training large-scale deep neural networks for pattern recognition, hundreds of hours on clusters of GPUs are required to achieve state-of-the-art performance. Improved optimization algorithms could potentially enable faster industrial prototyping and make training contemporary models more accessible. In this talk, I will attempt to distill the key difficulties in optimizing large, deep neural networks for pattern recognition. In particular, I will emphasize that many of the popularized notions of what make these problems 'hard' are not true impediments at all. I will show that it is not only easy to globally optimize neural networks, but that such global optimization remains easy when fitting completely random data. I will argue instead that the source of difficulty in deep learning is a lack of understanding of generalization. I will provide empirical evidence of high-dimensional function classes that are able to achieve state-of-the-art performance on several benchmarks without any obvious forms of regularization or capacity control. I will close by discussing possible mechanisms to explain generalization in such large models, appealing to insights from linear predictors.

Random design least squares polynomial approximation of high dimensional functions

Speaker: Fabio Nobile
Speaker Affiliation: École Polytechnique Fédérale de Lausanne
Host: Lorenzo Rosasco and Ernesto De Vito
Host Affiliation: DIBRIS, Universita' di Genova; Laboratory for Computational and Statistical Learning, MIT-IIT; DIMA, Universita' di Genova

Date: 2017-04-20
Time: 3:00 pm
Location: Conference Room 363, DIBRIS Valletta Puggia. Via Dodecaneso 35, Genova, IT.

Abstract
We consider a general problem F (u, y) = 0 where u is the unknown solution, possibly Hilbert-space valued, and y a set of uncertain parameters. We specifically address the situation in which the parameter-to-solution map u(y) is smooth, however y could be very high (or even infinite) dimensional. In particular, we are interested in cases in which F is a partial differential operator, u a Hilbert-space valued function and y a distributed, space and/or time varying, random field. We aim at reconstructing the parameter-to-solution map u(y) from noise-free or noisy observations in random points by discrete least squares on polynomial spaces associated to downward closed index sets. The noise-free case is relevant whenever the technique is used to construct metamodels, based on polynomial expansions, for the output of computer experiments. In the case of PDEs with random parameters, the metamodel is then used to approximate statistics of the output quantities. We discuss the stability of (weighted) least squares on random points and present error bounds both in expectation and probability for a priori chosen index sets. We will also discuss theoretical bounds on the minimal error achievable if an optimal choice of the index set for a given sample is performed among all possible downward closed index sets of given cardinality. Finally, we discuss the possibility of exploiting different discretization levels of the underlying PDE in a multi-level fashion to reduce the overall computational cost.
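
A minimal sketch of the core approximation step (my own toy example in two parameter dimensions, not the speaker's code): evaluate the parameter-to-output map at random points and fit a polynomial expansion supported on a downward closed index set by discrete least squares:

```python
# Discrete least squares on a downward closed (total-degree) polynomial index
# set: approximate a smooth map u(y1, y2) from evaluations at random points.
import numpy as np
from itertools import product

def total_degree_set(max_degree):
    """A simple example of a downward closed index set in two dimensions."""
    return [(a, b) for a, b in product(range(max_degree + 1), repeat=2)
            if a + b <= max_degree]

def design_matrix(Y, index_set):
    """Monomial basis y1^a * y2^b, one column per multi-index (a, b)."""
    return np.column_stack([Y[:, 0] ** a * Y[:, 1] ** b for a, b in index_set])

u = lambda Y: np.exp(0.5 * Y[:, 0]) * np.cos(Y[:, 1])   # smooth parameter-to-output map

rng = np.random.default_rng(0)
Y_train = rng.uniform(-1, 1, size=(400, 2))             # random design points
index_set = total_degree_set(6)
coeffs, *_ = np.linalg.lstsq(design_matrix(Y_train, index_set), u(Y_train), rcond=None)

Y_test = rng.uniform(-1, 1, size=(1000, 2))
err = np.max(np.abs(design_matrix(Y_test, index_set) @ coeffs - u(Y_test)))
print("number of terms:", len(index_set), " max test error:", float(err))
```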

The Balancing Principle in kernel learning for fast rates of convergence

Speaker: Nicole Muecke
Speaker Affiliation: Universitaet Potsdam
Host: Lorenzo Rosasco
Host Affiliation: DIBRIS, Universita' di Genova; Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2017-04-04
Time: 3:00 pm
Location: Conference Room 363, DIBRIS Valletta Puggia. Via Dodecaneso 35, Genova, IT.

Abstract
The balancing principle is a general method to achieve adaptivity to unknown degrees of smoothness and/or ill-posedness in direct or inverse statistical learning problems. More specifically, we restrict attention to non-parametric supervised kernel learning, for different classes of source conditions and classes of decay of eigenvalues of the covariance operator. For such classes of models, minimax optimality of rates of convergence has been established (e.g. in earlier work of Caponnetto, De Vito and Rosasco, and more recently also of Blanchard and Muecke). The more recent works show minimax optimality for fast rates of convergence (which take into account the eigenvalues of the covariance operator). The two types of models come with different notions of adaptivity: the earlier models require adaptivity only with respect to unknown smoothness, while for fast rates one aims at adaptivity to the unknown spectral properties of the covariance operator as well. In order to get adaptivity in this more refined setting, it is crucial to obtain good estimates of the true effective dimension in terms of the empirical effective dimension. We shall present a new version of the balancing principle which serves to give an adaptive estimator (with respect to unknown smoothness and unknown spectral properties) for such classes of models possessing fast rates as minimax optimal rates of convergence.
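
To fix ideas, a generic Lepskii-type balancing rule has the following flavor (a schematic sketch in my own notation, not the specific adaptive estimator of the talk): given estimators $f_{\lambda_1},\dots,f_{\lambda_m}$ for a decreasing grid of regularization parameters and a computable proxy $B(\lambda)$ for the sample (variance-type) part of the error, which grows as $\lambda$ decreases, one selects

```latex
\hat{\jmath} \;=\; \max\Bigl\{\, j \;:\; \bigl\|f_{\lambda_j} - f_{\lambda_i}\bigr\|
\;\le\; C\,\bigl(B(\lambda_i)+B(\lambda_j)\bigr)\ \ \text{for all } i < j \,\Bigr\},
```

the idea being that the chosen $f_{\lambda_{\hat\jmath}}$ balances the unknown approximation error against the estimated sample error without knowledge of the smoothness; in the fast-rate setting of the talk, the additional difficulty is that $B(\lambda)$ itself involves the effective dimension, which must be estimated from data.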

Optimal rates of estimation for the multi-reference alignment problem

Speaker: Jonathan Weed
Speaker Affiliation: MIT, Department of Mathematics
Host: Lorenzo Rosasco
Host Affiliation: DIBRIS, Universita' di Genova; Laboratory for Computational and Statistical Learning, MIT-IIT

Date: 2017-03-23
Time: 3:00 pm
Location: Conference Room 363, DIBRIS Valletta Puggia. Via Dodecaneso 35, Genova, IT.

Abstract
How should one estimate a signal, given only access to noisy versions of the signal corrupted by unknown circular shifts? This simple problem has surprisingly broad applications, in fields from structural biology to aircraft radar imaging. We describe how this model can be viewed as a multivariate Gaussian mixture model whose centers belong to an orbit of a group of orthogonal transformations. This enables us to derive matching lower and upper bounds for the optimal rate of statistical estimation for the underlying signal. These bounds show a striking dependence on the signal-to-noise ratio of the problem.
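
A tiny simulation of the observation model (my own toy illustration, using a naive align-to-a-reference-and-average estimator rather than the optimal procedures analyzed in the talk): each observation is a random circular shift of the signal plus Gaussian noise:

```python
# Multi-reference alignment toy model: observe random circular shifts of a
# signal plus noise; estimate it by aligning every observation to the first
# one via circular cross-correlation and averaging (naive baseline).
import numpy as np

rng = np.random.default_rng(0)
L, n, sigma = 32, 2000, 0.5
signal = np.sin(2 * np.pi * np.arange(L) / L) + (np.arange(L) == 5)   # true signal

obs = np.stack([np.roll(signal, rng.integers(L)) + sigma * rng.standard_normal(L)
                for _ in range(n)])

ref = obs[0]
aligned = []
for x in obs:
    corr = np.real(np.fft.ifft(np.fft.fft(ref) * np.conj(np.fft.fft(x))))
    aligned.append(np.roll(x, int(np.argmax(corr))))       # best circular shift towards ref
estimate = np.mean(aligned, axis=0)

# The signal is only identifiable up to a global shift, so compare over all shifts
errs = [np.linalg.norm(np.roll(estimate, s) - signal) for s in range(L)]
print("relative error:", round(min(errs) / np.linalg.norm(signal), 3))
```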

Date   Speaker   Title   Location
Mar 20, 2019 Giuseppe Savaré Optimal transport and gradient flows Genova
Feb 19, 2019 Zied Mnasri Overview on speech processing: fundamentals, techniques and applications Genova
Feb 18, 2019 Andrea Tacchetti The multi-agent approach to artificial general intelligence Genova
Feb 5, 2019 Riccardo Zecchina On the Existence and on the Role of Wide Flat Minima in Deep Learning Genova
Dec 11, 2018 Andreas Maurer Weak interactions Genova
Oct 19, 2018 Francesco Rinaldi Advances on first order algorithms for constrained optimization problems in Machine Learning Genova
Jun 27, 2018 Anna Korba Ranking Median Regression: Learning to Order through Local Consensus Genova
Jun 1, 2018 Rastogi Abhishake Multi-penalty regularization Genova
May 24, 2018 Mikhail Belkin Fit without fear: an interpolation perspective on modern deep and shallow learning Genova
Apr 8, 2018 Remi Gribonval Learning from random moments Genova
Mar 26, 2018 Francisco Facchinei Decentralized Dictionary Learning over Time-Varying Digraphs Genova
Mar 19, 2018 Gergely Neu Iterate averaging as regularization for stochastic gradient descent Genova
Mar 2, 2018 Maximilian Nickel Learning Hierarchical Representations of Relational Data Genova
Feb 2, 2018 Maurizio Filippone Random Feature Expansions for Deep Gaussian Processes Genova
Jan 29, 2018 Guillaume Garrigos Iterative Regularization for General Inverse Problems Genova
Jan 22, 2018 Mark Girolami Stochastic Modelling of Urban Structure Genova
Apr 21, 2017 Benjamin Recht Optimization Challenges in Deep Learning Genova
Apr 20, 2017 Fabio Nobile Random design least squares polynomial approximation of high dimensional functions Genova
Apr 4, 2017 Nicole Mucke The Balancing Principle in kernel learning for fast rates of convergence Genova
Mar 23, 2017 Jonathan Weed Optimal rates of estimation for the multi-reference alignment problem Genova
