Statlearn 2019
from April 4, 2019 to April 5, 2019
Invited speakers
Nicolas Bonneel, Liris, University of Lyon 1, France
Liliana Forzani, Universidad Nacional del Litoral, Argentina
Aapo Hyvarinen, University College London, UK
Julie Josse, Ecole Polytechnique, France
Eyke Hüllermeier, Paderborn University, Germany
Diane Larlus, Naver Labs Europe, France
Gabriel Peyré, ENS Paris, France
Nelly Pustelnik, ENS Lyon, France
Judith Rousseau, University of Oxford, UK
Gaël Varoquaux, INRIA Paris, France
Max Welling, University of Amsterdam and Qualcomm, Netherlands
Abstracts and slides
Nicolas Bonneel, Liris, University of Lyon 1, France
Sliced Partial Optimal Transport
Liliana Forzani, Universidad Nacional del Litoral, Argentina
Partial Least Square: Statistics for the Chemometrics
Partial least squares (PLS) is one of the first methods for prediction in high-dimensional linear regressions in which the sample size need not be large relative to the number of predictors. Since its introduction, PLS regression has been developed mainly within the chemometrics community, where empirical prediction is the main issue, but PLS is now a core method for big data. However, studies of PLS have appeared in the mainstream statistics literature only from time to time, and there have been no positive results on the theoretical properties of PLS as the chemometrics community uses it. We study the theoretical properties of prediction using PLS in the same context in which the chemometrics community uses it. This is joint work with R. Dennis Cook.
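As a rough illustration of the regime described above (more predictors than observations), the following minimal sketch fits a one-component PLS regression. It is the textbook single-component algorithm only, not the estimator analysed in the talk; the toy data and the names `pls1_fit` and `pls1_predict` are hypothetical.

```python
import math

def pls1_fit(X, y):
    """One-component PLS regression (first component of textbook PLS1)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Centre predictors and response.
    Xc = [[X[i][j] - x_mean[j] for j in range(p)] for i in range(n)]
    yc = [v - y_mean for v in y]
    # Weight vector: normalised covariance between each predictor and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each observation onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Least-squares regression of the centred response on the single score.
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    beta = [q * v for v in w]
    return x_mean, y_mean, beta

def pls1_predict(model, x):
    x_mean, y_mean, beta = model
    return y_mean + sum(b * (xj - mj) for b, xj, mj in zip(beta, x, x_mean))

# Hypothetical toy data: n = 3 observations, p = 5 predictors (p > n).
X = [[1.0, 2.0, 0.5, 0.0, 1.5],
     [2.0, 4.1, 1.0, 0.1, 3.0],
     [3.0, 5.9, 1.5, -0.1, 4.5]]
y = [1.0, 2.0, 3.0]
model = pls1_fit(X, y)
fitted = [pls1_predict(model, row) for row in X]
```

Ordinary least squares has no unique solution here because p > n, whereas the PLS fit only requires a regression on a single score.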
Eyke Hüllermeier, Paderborn University, Germany
Analyzing and Learning from Ranking Data: New Problems and Challenges
The analysis of ranking data has a long tradition in statistics, and corresponding methods have been used in various fields of application, such as psychology and the social sciences. More recently, applications in information retrieval and machine learning have caused a renewed interest in the analysis of rankings and topics such as "learning to rank" and preference learning. This talk provides a snapshot of ranking in the field of machine learning, with a specific focus on new problems and challenges from a statistical point of view. In addition to problems of unsupervised learning on ranking data and different types of ranking tasks in the realm of supervised learning, this also includes recent work on preference learning and ranking in an online setting.
Slides
Aapo Hyvarinen, University College London, UK
Nonlinear Independent Component Analysis: a Principled Framework for Unsupervised Deep Learning
Unsupervised learning, in particular learning general nonlinear representations, is one of the deepest problems in machine learning. Estimating latent quantities in a generative model provides a principled framework, and has been successfully used in the linear case, e.g. with independent component analysis (ICA) and sparse coding. However, extending ICA to the nonlinear case has proven to be extremely difficult: a straightforward extension is unidentifiable, i.e. it is not possible to recover those latent components that actually generated the data. Here, we show that this problem can be solved by using additional information either in the form of temporal structure or an additional, auxiliary variable. We start by formulating two generative models in which the data is an arbitrary but invertible nonlinear transformation of time series (components) which are statistically independent of each other. Drawing from the theory of linear ICA, we formulate two distinct classes of temporal structure of the components which enable identification, i.e. recovery of the original independent components. We show that in both cases, the actual learning can be performed by ordinary neural network training where only the input is defined in an unconventional manner, making software implementations straightforward. We further generalize the framework to the case where instead of temporal structure, an additional auxiliary variable is observed (e.g. audio in addition to video). Our methods are closely related to "self-supervised" methods heuristically proposed in computer vision, and also provide a theoretical foundation for such methods.
The talk is based on the following papers:
http://www.cs.helsinki.fi/u/ahyvarin/papers/NIPS16.pdf
http://www.cs.helsinki.fi/u/ahyvarin/papers/AISTATS17.pdf
https://arxiv.org/pdf/1805.08651
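The remark in the abstract that learning reduces to ordinary neural network training on unconventionally defined data can be illustrated by how a training set might be assembled from a time series. The two constructions below (labelling each point by its time segment, and contrasting real consecutive pairs with time-shuffled pairs) are one illustrative reading of the abstract, not code from the papers listed above; all function names and the toy series are hypothetical.

```python
import random

def segment_labelled_dataset(series, n_segments):
    """Pair each observation with the index of the time segment it falls in.

    A classifier trained to predict the segment index from the observation
    is then an ordinary supervised learning task.
    """
    seg_len = len(series) // n_segments
    return [(series[t], t // seg_len) for t in range(seg_len * n_segments)]

def shuffled_pair_dataset(series, rng):
    """Label real consecutive pairs 1 and pairs with a random time index 0."""
    data = []
    for t in range(1, len(series)):
        data.append(((series[t], series[t - 1]), 1))  # true temporal order
        t_star = rng.randrange(len(series))           # random time point
        data.append(((series[t], series[t_star]), 0))
    return data

# Hypothetical 2-dimensional time series with 12 time points.
rng = random.Random(0)
series = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(12)]
segment_data = segment_labelled_dataset(series, n_segments=3)
pair_data = shuffled_pair_dataset(series, rng)
```

Either dataset could then be passed to a standard classifier; the hidden representation it learns is what would play the role of the recovered components under the assumptions of the talk.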
Julie Josse, Ecole Polytechnique, France
On the Consistency of Supervised Learning with Missing Values
In this work, we show the consistency of two approaches to estimating the regression function.
The most striking result, which has important consequences in practice, shows that mean imputation is consistent for supervised learning when missing values are not informative. As far as we know, this is the first result justifying this very convenient practice of handling missing values. We then focus on decision trees, as they also offer a natural way to perform empirical risk minimization with missing values, especially when using the missing incorporated in attribute (MIA) method.
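For readers unfamiliar with the practice, mean imputation can be sketched as follows. This is a minimal illustration on hypothetical data, assuming missing entries are encoded as NaN and every column has at least one observed value; the function name `impute_column_means` is not taken from the talk.

```python
import math

def impute_column_means(rows):
    """Replace NaN entries by the mean of the observed values in their column.

    Assumes every column contains at least one observed (non-NaN) value.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if not math.isnan(r[j])]
        means.append(sum(observed) / len(observed))
    imputed = [[means[j] if math.isnan(r[j]) else r[j] for j in range(n_cols)]
               for r in rows]
    return imputed, means

# Hypothetical training matrix with two missing entries.
nan = float("nan")
X_train = [[1.0, 10.0], [nan, 12.0], [3.0, nan], [5.0, 14.0]]
X_imputed, train_means = impute_column_means(X_train)
```

The imputed matrix can then be passed to any supervised learner; in a train/test setting, the column means computed on the training rows would be reused to fill the test rows.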
Slides
Diane Larlus, Naver Labs Europe, France
Learning Image Representations for Efficient Visual Search
Querying with an example image is a simple and intuitive interface to retrieve relevant information from a collection of images. In general, this is done by computing simple similarity functions between the representation of the visual query and the representations of the images in the collection, provided that the representation is suitable for this task, i.e. assuming that relevant items have similar representations and non-relevant items do not. This presentation will show how to train an embedding function that maps visual content into an appropriate representation space where the previous assumption holds, producing a solution to visual search that is both effective and computationally efficient.
In a second part, the presentation will move beyond instance-level search and consider the task of semantic image search in complex scenes, where the goal is to retrieve images that share the same semantics as the query image. Despite being more subjective and more complex, one can show that the task of semantically ranking visual scenes is consistently implemented across a pool of human annotators, and that suitable embedding spaces can also be learnt for this task of semantic retrieval.
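The retrieval step described in this abstract (comparing the query representation with the representation of each image in the collection) can be sketched with cosine similarity. The embeddings, file names and the `search` function below are hypothetical; in practice the vectors would come from the trained embedding function, which is not shown here.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def search(query_vec, collection, top_k=2):
    """Return the names of the top_k images most similar to the query."""
    ranked = sorted(
        collection,
        key=lambda name: cosine_similarity(query_vec, collection[name]),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical 3-dimensional embeddings of a small image collection.
collection = {
    "eiffel_day.jpg": [0.9, 0.1, 0.0],
    "eiffel_night.jpg": [0.8, 0.2, 0.1],
    "beach.jpg": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]
results = search(query, collection)
```

Because ranking only needs one similarity computation per stored vector, the cost of a query is dominated by the size of the collection rather than by the complexity of the embedding model.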
Gabriel Peyré, ENS Paris, France
Optimal Transport for Machine Learning
Nelly Pustelnik, ENS Lyon, France
Discrete Mumford-Shah Model: from Image Restoration to Graph Analysis
Slides
Soufiane Hayou, A. Doucet and Judith Rousseau, University of Oxford, UK
On the Impact of the Activation Function on Deep Neural Networks Training
Slides
Gaël Varoquaux, INRIA Paris, France
Statistics on Dirty Categories: neither Categories, nor Free Text
Here, we consider statistical analysis directly on non-standardized data. We introduce the notion of "dirty categories", which are neither well-separated categories nor natural language. We show that accounting for their string representation helps the statistical analysis, for instance improving the prediction in supervised learning. Finally, we discuss approaches to represent such entries in ways that can be interpreted as categories, without losing information on the morphological variants. Such data encoding is based on string similarities and character-level modeling. We show that these always improve on the common practice of one-hot encoding.
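To make the contrast with one-hot encoding concrete, here is a minimal sketch of an encoding based on string similarity. It uses Jaccard similarity between character 3-gram sets as one possible similarity; the prototype categories, the misspelled entry and the function names are hypothetical, and the exact similarity used in the talk may differ.

```python
def char_ngrams(text, n=3):
    """Set of character n-grams of a lower-cased, space-padded string."""
    padded = " " + text.lower() + " "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def ngram_similarity(a, b, n=3):
    """Jaccard similarity between the character n-gram sets of two strings."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    union = ga | gb
    if not union:
        return 0.0
    return len(ga & gb) / len(union)

def similarity_encode(value, prototypes, n=3):
    """Encode a string by its similarity to each reference category."""
    return [ngram_similarity(value, p, n) for p in prototypes]

# Hypothetical reference categories and a misspelled entry.
prototypes = ["police officer", "firefighter", "teacher"]
entry = "Police Oficer"
encoded = similarity_encode(entry, prototypes)
one_hot = [1.0 if entry == p else 0.0 for p in prototypes]
```

The one-hot vector of the misspelled entry is all zeros with respect to the known categories, whereas the similarity encoding still places it close to "police officer".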
Slides
Max Welling, University of Amsterdam and Qualcomm, Netherlands
Gauge Fields in Deep Learning
Gauge field theory is the foundation of modern physics, including general relativity and the standard model of physics. It describes how a theory of physics should transform under symmetry transformations. For instance, in electrodynamics, electric forces may transform into magnetic forces if we transform a static observer to one that moves at constant speed. Similarly, in general relativity acceleration and gravity are equated to each other under symmetry transformations. Gauge fields also play a crucial role in modern quantum field theory and the standard model of physics, where they describe the forces between particles that transform into each other under (abstract) symmetry transformations.
In this work we describe how the mathematics of gauge groups becomes inevitable when you are interested in deep learning on manifolds. Defining a convolution on a manifold involves transporting geometric objects such as feature vectors and kernels across the manifold, which due to curvature become path dependent. As such it becomes very difficult to represent these objects in a global reference frame and one is forced to consider local frames. These reference frames are arbitrary and changing between them is called a (local) gauge transformation. Since we do not want our computations to depend on the specific choice of frames we are in turn forced to consider equivariance of our convolutions under gauge transformations. These considerations result in the first fully general theory of deep learning on manifolds, with gauge equivariant convolutions as the necessary key ingredient.
We develop a highly efficient gauge equivariant deep neural network (U-Net) for segmentation on a sphere by approximating the sphere by an icosahedron. This model is tested on global climate data as well as omnidirectional indoor scene data.
Slides
Practical information
Location
Institut Fourier - Amphi Chabauty
100 Rue des Mathématiques
Campus Universitaire
The Gala Dinner took place in the Fort de la Bastille, accessible by cable car with a voucher.
Contact and organization
Scientific committee
- Pierre-Olivier Amblard (CNRS, Université Grenoble Alpes)
- Julyan Arbel (Inria, Université Grenoble Alpes)
- Michaël Blum (CNRS, Université Grenoble Alpes)
- Charles Bouveyron (Université Côte d’Azur & Inria)
- Florent Chatelain (Université Grenoble Alpes)
- Stéphane Girard (Inria, Université Grenoble Alpes)
- Adeline Leclercq-Samson (Université Grenoble Alpes)