This seminar aims to strengthen the links between the different laboratories in Saclay in the fields of Applied Maths, Statistics and Machine Learning. The seminar is held every first Tuesday of the month, with 2 presentations followed by light refreshments. The location of the seminar will change to accommodate the different labs.

Organization

Due to access restrictions, you need to register for the seminar. A link is provided in the description and should also be sent with the seminar announcement. Registering also helps us plan the food quantities. If you think you will come, please register (even if you are unsure)!

To avoid missing the next seminar, please subscribe to the announcement mailing list palaisien@inria.fr.
You can also add the seminar calendar to your own calendar (see below).

Assuming we have i.i.d. observations from two unknown probability density functions (pdfs), p and p′, likelihood-ratio estimation (LRE) is an elegant approach to compare the two pdfs by relying only on the available data, without knowing the pdfs explicitly. In this paper we introduce a graph-based extension of this problem: suppose each node v of a fixed graph has access to observations coming from two unknown node-specific pdfs, p_v and p′_v; the goal is then to compare the respective p_v and p′_v of each node by also integrating information provided by the graph structure. This setting is interesting when the graph conveys some sort of 'similarity' between the node-wise estimation tasks, which suggests that the nodes can collaborate to solve their individual tasks more efficiently, while on the other hand trying to limit the data sharing among them. Our main contribution is a distributed non-parametric framework for graph-based LRE, called GRULSIF, that incorporates in a novel way elements from f-divergence functionals, kernel methods, and multitask learning. Among the several applications of LRE, we choose two-sample hypothesis testing to develop a proof of concept for our graph-based learning framework. In our experiments, our approach compares favorably against state-of-the-art non-parametric statistical tests that are applied at each node independently and thus disregard the graph structure.
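To make the basic idea of LRE concrete, here is a minimal single-node sketch in Python. It is not the GRULSIF method from the abstract: it ignores the graph entirely and uses a plain regularized least-squares fit of the ratio p/p′ in a Gaussian kernel basis (in the spirit of uLSIF). The function names, the kernel width `sigma` and the regularization weight `lam` are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between rows of a (n, d) and rows of b (m, d)."""
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def fit_ratio(x_num, x_den, sigma=1.0, lam=0.1):
    """Estimate r(x) ~ p(x) / p'(x) from samples x_num ~ p and x_den ~ p'.

    Assumption: r is modeled as a kernel expansion centered on the
    samples from p, and fitted by ridge-regularized least squares.
    """
    centers = x_num
    k_den = gaussian_kernel(x_den, centers, sigma)   # (n_den, n_centers)
    k_num = gaussian_kernel(x_num, centers, sigma)   # (n_num, n_centers)
    H = k_den.T @ k_den / x_den.shape[0]             # second moment under p'
    h = k_num.mean(axis=0)                           # first moment under p
    theta = np.linalg.solve(H + lam * np.eye(centers.shape[0]), h)
    # Return a callable that evaluates the estimated ratio at new points.
    return lambda x: gaussian_kernel(x, centers, sigma) @ theta

rng = np.random.default_rng(0)
x_p = rng.normal(loc=1.0, scale=1.0, size=(200, 1))    # samples from p
x_q = rng.normal(loc=-1.0, scale=1.0, size=(200, 1))   # samples from p'
ratio = fit_ratio(x_p, x_q)
values = ratio(np.array([[2.0], [-2.0]]))
```

Only samples enter the fit; neither density is ever evaluated. In this toy setup the estimated ratio should be larger where p dominates (around x = 2) than where p′ dominates (around x = -2).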

We investigate the problem of algorithmic fairness in the case where sensitive and non-sensitive features are available and one aims to generate new, ‘oblivious’, features that closely approximate the non-sensitive features, and are only minimally dependent on the sensitive ones. We study this question in the context of kernel methods. We analyze a relaxed version of the Maximum Mean Discrepancy criterion which does not guarantee full independence but makes the optimization problem tractable. We derive a closed-form solution for this relaxed optimization problem and complement the result with a study of the dependencies between the newly generated features and the sensitive ones. Our key ingredient for generating such oblivious features is a Hilbert-space-valued conditional expectation, which needs to be estimated from data. We propose a plug-in approach and demonstrate how the estimation errors can be controlled. While our techniques help reduce the bias, we would like to point out that no post-processing of any dataset could possibly serve as an alternative to well-designed experiments.
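For readers unfamiliar with the Maximum Mean Discrepancy, the following Python sketch computes the standard (biased) empirical MMD² between two samples with a Gaussian kernel. It illustrates only the textbook two-sample quantity, not the relaxed criterion or the oblivious-feature construction analyzed in the talk; the function names and the bandwidth `sigma` are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between rows of a (n, d) and rows of b (m, d)."""
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def mmd2_biased(x, y, sigma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    k_xx = gaussian_kernel(x, x, sigma).mean()
    k_yy = gaussian_kernel(y, y, sigma).mean()
    k_xy = gaussian_kernel(x, y, sigma).mean()
    return k_xx + k_yy - 2.0 * k_xy

rng = np.random.default_rng(0)
x = rng.normal(size=(300, 2))               # reference sample
y_same = rng.normal(size=(300, 2))          # same distribution as x
y_shift = rng.normal(size=(300, 2)) + 2.0   # mean-shifted distribution

mmd_same = mmd2_biased(x, y_same)
mmd_shift = mmd2_biased(x, y_shift)
```

The estimate is close to zero when both samples come from the same distribution and grows as the distributions move apart, which is what makes MMD-type criteria usable as measures of discrepancy or dependence.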

The program and the organization of this seminar are driven by a scientific committee composed of members of the different laboratories in Saclay. The members of the committee are currently: