About the seminar

This seminar aims to strengthen the links between the different laboratories in Saclay working in applied mathematics, statistics, and machine learning. It is held on the first Tuesday of every month, with two presentations followed by light refreshments. The location rotates to accommodate the different labs.

Organization

Due to access restrictions, you need to register for the seminar. A registration link is provided in the description and is also sent with the seminar announcement. Registering also helps us plan the catering. If you think you will come, please register, even if you are unsure!

To avoid missing the next seminar, subscribe to the announcement mailing list palaisien@inria.fr.
You can also add the seminar's calendar to your own calendar (see below).

Next seminars

06 Jan 2026, 12h, at Inria Saclay, Amphi Sophie Germain (REGISTER)
Clément Bonnet - Comparing and Flowing Labeled Datasets with Optimal Transport
Many applications in machine learning involve data represented as probability distributions. The emergence of such data requires radically novel techniques to design tractable distances and gradient flows on probability distributions over such (infinite-dimensional) objects. For instance, being able to flow labeled datasets is a core task for applications ranging from domain adaptation to transfer learning or dataset distillation. In this setting, we propose to represent each class by the associated conditional distribution of features, and to model the dataset as a mixture distribution supported on these classes (which are themselves probability distributions), meaning that labeled datasets can be seen as probability distributions over probability distributions. We endow this space with a metric structure from optimal transport, namely the Wasserstein over Wasserstein (WoW) distance, derive a differential structure on this space, and define WoW gradient flows. The latter enable the design of dynamics on this space that decrease a given objective functional. We apply our framework to transfer learning and dataset distillation tasks, leveraging our gradient flow construction as well as novel tractable functionals that take the form of Maximum Mean Discrepancies with Sliced-Wasserstein-based kernels between probability distributions. We also introduce a new sliced distance specifically designed for the space of probability distributions over probability distributions, built from projections using the Busemann function.

Based on this paper and this preprint
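
For readers who want a concrete handle on the construction, here is a minimal numerical sketch (our own illustration, not the speaker's code; all function names are hypothetical) of a WoW-style distance between two labeled datasets: each class is an empirical feature cloud, the inner ground cost between classes is a sliced 2-Wasserstein distance, and the outer optimal transport problem, with uniform weights over equally many classes, reduces to a linear assignment.

```python
# Hedged sketch of a Wasserstein-over-Wasserstein style distance.
import numpy as np
from scipy.optimize import linear_sum_assignment

def sliced_w2(X, Y, n_proj=50, seed=0):
    """Sliced 2-Wasserstein distance between two equal-size point clouds in R^d."""
    rng = np.random.default_rng(seed)
    thetas = rng.normal(size=(n_proj, X.shape[1]))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    total = 0.0
    for theta in thetas:
        # 1D OT between equal-size samples = match sorted projections.
        px, py = np.sort(X @ theta), np.sort(Y @ theta)
        total += np.mean((px - py) ** 2)
    return np.sqrt(total / n_proj)

def wow_distance(classes_a, classes_b):
    """Outer OT between two lists of class-conditional point clouds."""
    C = np.array([[sliced_w2(Xa, Xb) ** 2 for Xb in classes_b]
                  for Xa in classes_a])
    # For uniform weights over equally many classes, exact OT is an assignment.
    rows, cols = linear_sum_assignment(C)
    return np.sqrt(C[rows, cols].mean())

# Toy usage: two 3-class datasets with shifted class-conditional features.
rng = np.random.default_rng(1)
data_a = [rng.normal(loc=k, size=(100, 2)) for k in range(3)]
data_b = [rng.normal(loc=k + 0.5, size=(100, 2)) for k in range(3)]
print(f"WoW-style distance: {wow_distance(data_a, data_b):.3f}")
```

The talk's framework goes well beyond this two-level transport structure (the differential structure, gradient flows, MMD functionals with Sliced-Wasserstein-based kernels, and the Busemann-based sliced distance), but nesting an outer OT problem on top of inner OT costs is the common core.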
Avetik Karagulyan - Federated Langevin Algorithms with Primal, Dual and Bidirectional Compression
Federated sampling algorithms have recently gained popularity in the machine learning and statistics communities. In this talk we will present the problem of sampling in the federated learning framework and discuss variants of Error Feedback Langevin (ELF) algorithms designed for this setting. In particular, we will discuss how combining EF21 and EF21-P with federated Langevin Monte Carlo improves on prior methods.
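
As a rough illustration of the mechanism (a sketch under simplifying assumptions of our own, not the ELF implementation), the following combines an EF21-style error-feedback compressor on client-to-server gradient messages with a server-side Langevin step; the clients hold simple Gaussian potentials, so the target distribution is known in closed form.

```python
# Hedged sketch: federated Langevin Monte Carlo with EF21-style error feedback.
import numpy as np

def top_k(v, k):
    """Top-k sparsifier: keep the k largest-magnitude coordinates."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

rng = np.random.default_rng(0)
n_clients, d, k, gamma, n_steps = 10, 20, 4, 0.1, 2000
mus = rng.normal(size=(n_clients, d))     # client-specific means
grads = lambda x: x[None, :] - mus        # gradients of f_i(x) = ||x - mu_i||^2 / 2

x = np.zeros(d)
g = grads(x).copy()                       # EF21 memory, one vector per client
samples = []
for t in range(n_steps):
    # Each client compresses only the *change* in its gradient (EF21 update).
    g += np.stack([top_k(row, k) for row in grads(x) - g])
    g_bar = g.mean(axis=0)                # server aggregates client estimates
    # Langevin step with the compressed gradient estimate.
    x = x - gamma * g_bar + np.sqrt(2 * gamma) * rng.normal(size=d)
    samples.append(x.copy())

# The target here is a Gaussian centered at the mean of the mu_i, so the
# empirical mean of the (post-burn-in) samples should land close to it.
err = np.linalg.norm(np.mean(samples[500:], axis=0) - mus.mean(axis=0))
print(f"||sample mean - target mean|| = {err:.3f}")
```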
03 Feb 2026, 12h, at Inria Saclay, Amphi Sophie Germain (REGISTER)
Luiz Chamon - The 5 W's and H of constrained learning
Machine learning (ML) and artificial intelligence (AI) now automate entire systems rather than individual tasks. As such, ML/AI models are no longer responsible for a single top-line metric (e.g., prediction accuracy), but must meet a growing set of potentially conflicting system requirements, such as robustness, fairness, safety, and alignment with prior knowledge. These challenges are exacerbated in uncertain, data-driven settings and further complicated by the scale and heterogeneity of modern ML/AI applications, which range from static, discriminative models (e.g., neural network classifiers) to dynamic, generative models (e.g., Langevin diffusions used for sampling). This keynote defines WHAT constitutes a requirement and explains WHY incorporating requirements into learning is critical. It then shows HOW to do so using constrained learning and illustrates WHEN and WHERE this approach is effective by presenting use cases in ML for science, safe reinforcement learning, and sampling. Ultimately, this talk aims to show you (WHO) that constrained learning is key to building trustworthy ML/AI systems, enabling a shift from a paradigm of artificial intelligence that is supposed to emerge implicitly from data to one of engineered intelligence that explicitly does what we want.
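
To make the "HOW" concrete, here is a hedged toy example (our own construction, not the speaker's code; the subgroup and budget are invented for illustration) of the primal-dual recipe underlying constrained learning: minimize an average loss subject to a budget on a subgroup's loss, alternating a gradient step on the Lagrangian in the model weights with a projected ascent step in the multiplier.

```python
# Hedged sketch: constrained learning via primal-dual (Lagrangian) updates.
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 5
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0] + 0.5 * rng.normal(size=n))
group = X[:, 1] > 0.8                      # hypothetical protected subgroup

def logistic_loss(w, X, y):
    return np.mean(np.log1p(np.exp(-y * (X @ w))))

def grad(w, X, y):
    s = -y / (1 + np.exp(y * (X @ w)))
    return X.T @ s / len(y)

# Budget c is chosen tight enough that the constraint likely binds on this toy data.
c, lr_w, lr_lam = 0.3, 0.1, 0.05
w, lam = np.zeros(d), 0.0
for t in range(3000):
    # Primal step: descend the Lagrangian loss_all + lam * (loss_group - c).
    w -= lr_w * (grad(w, X, y) + lam * grad(w, X[group], y[group]))
    # Dual step: ascend in lam, projected onto lam >= 0.
    lam = max(0.0, lam + lr_lam * (logistic_loss(w, X[group], y[group]) - c))

print(f"overall loss {logistic_loss(w, X, y):.3f}, "
      f"subgroup loss {logistic_loss(w, X[group], y[group]):.3f} (budget {c})")
```

The dual variable lam acts as an automatically tuned penalty weight: it grows while the constraint is violated and decays toward zero when there is slack, which is what distinguishes constrained learning from fixing a penalty coefficient by hand.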
Charlotte Dion-Blanc - Supervised classification for stochastic processes (in high dimension)
In this talk, I will present some results on supervised classification for continuous-time processes. The goal is to propose consistent classifiers for different types of temporal data.
The methodology relies on mimicking the Bayes rule in the multiclass classification setting, in order to derive a classifier that is consistent as the number of labeled observations increases. I will first consider the case of repeated observations of paths of stochastic differential equations driven by Brownian motion, with different labels.
In this setting, I use a plug-in technique to estimate the conditional probability that the label equals k given the observation, and I will provide some cases in which the optimal rate of convergence is achieved. I will then move on to the case of interacting particle systems of McKean–Vlasov type. This framework generalizes the previous example and raises additional challenges.
The third case I will discuss concerns multivariate Hawkes processes with K classes, distinguished by the parameters of their intensity functions. Here, the observations are event-time data. The multivariate Hawkes process is a natural way to model interacting agents whose events are recorded in continuous time. To derive a consistent classification rule in this context, I will present an empirical risk minimization strategy with a refitting step based on a Lasso criterion.

The main novelty of the presented results is that they do not rely on the asymptotic properties of the underlying processes, and that the considered models are high-dimensional.
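
As a toy illustration of the plug-in idea in its simplest form (with *known* drifts, which the talk precisely does not assume; everything below is our own sketch), the Bayes rule for labeled paths of dX = b_k(X) dt + dW reduces, via Girsanov's theorem, to comparing discretized log-likelihood scores across classes.

```python
# Hedged sketch: Bayes classification of diffusion paths with known drifts.
import numpy as np

rng = np.random.default_rng(0)
drifts = [lambda x: -x, lambda x: 1.0 - x]   # two Ornstein-Uhlenbeck classes
T, n_grid = 5.0, 500
dt = T / n_grid

def simulate(b):
    """Euler-Maruyama path of dX = b(X) dt + dW started at 0."""
    x, path = 0.0, [0.0]
    for _ in range(n_grid):
        x += b(x) * dt + np.sqrt(dt) * rng.normal()
        path.append(x)
    return np.array(path)

def classify(path):
    """Plug-in Bayes rule: argmax_k of the discretized Girsanov score
    sum_i b_k(X_i) dX_i - 0.5 * sum_i b_k(X_i)^2 dt (equal class priors)."""
    dX = np.diff(path)
    scores = [np.sum(b(path[:-1]) * dX) - 0.5 * np.sum(b(path[:-1]) ** 2) * dt
              for b in drifts]
    return int(np.argmax(scores))

labels = rng.integers(0, 2, size=200)
preds = [classify(simulate(drifts[k])) for k in labels]
print("accuracy:", np.mean(np.array(preds) == labels))
```

In the talk's setting the drifts are unknown and must be estimated from repeated labeled paths, which is where the plug-in estimation, the McKean-Vlasov extension, and the Hawkes ERM-with-Lasso strategy come in.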

Scientific Committee

The program and organization of this seminar are driven by a scientific committee composed of members of the different laboratories in Saclay. The current members of the committee are:

Funding

This seminar is made possible by the financial support of ENSAE and DataIA.