12th POC Seminar

Organizer:

Title: 
"Big Data and Combinatorial Optimization"
Date: 
Friday, December 5, 2014

At the LIP6 laboratory of Université Pierre et Marie Curie - Paris 6
4 place Jussieu Paris 6 CEDEX 05
 

More information at http://www.lamsade.dauphine.fr/~poc/

9h30-10h15 Welcome of participants
10h15-11h30 Social Data Exploration

Sihem Amer-Yahia
(Laboratoire d’informatique de Grenoble)

Abstract

Social data exploration is in its infancy. We propose one formalization: the scalable discovery of user groups from collaborative rating datasets. Such datasets take the form of <u, i, s> tuples, where u is a user, i an item, and s an integer rating that user u has assigned to item i. Each user and item has a set of attributes that are used to build labeled groups such as "young computer scientists in France" and "American female designers". We formalize the problem of finding user groups that satisfy or optimize intuitive dimensions such as size, diversity, rating uniformity and coverage. We examine one instance of this problem and describe other instances that have not been solved yet.
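As a rough illustration of the setting described in this abstract, the sketch below groups (u, i, s) rating tuples by user attributes and reports a few simple measures per group. It is not the speaker's method: the sample data, the function name `describe_groups`, and the particular definitions of size, coverage and rating spread are all assumptions chosen for the example.

```python
from collections import defaultdict
from statistics import pstdev

# Hypothetical sample data: (user, item, integer rating) tuples.
ratings = [
    ("ann", "film_a", 5),
    ("bob", "film_a", 4),
    ("cy", "film_a", 1),
    ("ann", "film_b", 2),
]

# Hypothetical user attributes used to build group labels.
users = {
    "ann": {"age": "young", "country": "France"},
    "bob": {"age": "young", "country": "France"},
    "cy": {"age": "senior", "country": "USA"},
}


def describe_groups(ratings, users, attributes):
    """Group rating tuples by the given user attributes and summarize each group."""
    groups = defaultdict(list)
    for user, _item, score in ratings:
        label = tuple(users[user][name] for name in attributes)
        groups[label].append((user, score))

    total = len(ratings)
    summary = {}
    for label, rows in groups.items():
        scores = [score for _, score in rows]
        summary[label] = {
            # Number of distinct users in the group.
            "size": len({user for user, _ in rows}),
            # Fraction of all rating tuples that the group accounts for.
            "coverage": len(rows) / total,
            # Population standard deviation: lower means more uniform ratings.
            "spread": pstdev(scores),
        }
    return summary


summary = describe_groups(ratings, users, ["age", "country"])
for label, stats in summary.items():
    print(label, stats)
```

Enumerating every attribute combination this way grows quickly with the number of attributes, which is one reason the abstract stresses scalable discovery.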

11h30-11h45 QUESTIONS
11h45-14h00 LUNCH
14h00-15h00 Graphs and Big Data: Research Issues

Hamamache Kheddouci
(LIRIS Laboratory, Université Lyon 1)

Abstract

Recent advances in data acquisition and production have led to the generation of very large volumes of complex data (McKinsey's 2011 report). Conventional systems cannot cope with the storage, exploration and analysis of Big Data. Graphs are powerful models that capture the relationships between data items, represent them more clearly and make them easier to explore. However, the graphs obtained from these large datasets are usually too large to handle with existing tools, and exploring them requires new and effective ones, which raises new research issues. In this talk, we will discuss some open challenges in big graph data and present some research directions that could contribute to big data analytics.

15h00-15h15 QUESTIONS
15h15-15h30 BREAK
15h30-16h30 Seriation and Ranking: A Spectral Approach

Alexandre d’Aspremont
(École Normale Supérieure, Paris)

Abstract

Joint work with Fajwel Fogel, Rodolphe Jenatton, Francis Bach and Milan Vojnovic

Seriation seeks to reconstruct a linear order between variables using unsorted similarity information. It has direct applications in archeology and shotgun gene sequencing, for example. We prove the equivalence between seriation and the combinatorial 2-sum problem (a quadratic minimization problem over permutations) over a class of similarity matrices. The seriation problem can be solved exactly by a spectral algorithm in the noiseless case, and we produce a convex relaxation of the 2-sum problem to improve the robustness of solutions in a noisy setting. This relaxation also allows us to impose additional structural constraints on the solution, in order to solve semi-supervised seriation problems. We also show how these results can be directly transposed to a ranking problem where a global order must be recovered from pairwise comparisons. We present numerical experiments on archeological data, gene sequences, football and coding competitions.
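The abstract does not spell out the spectral algorithm. A minimal sketch of one classical spectral approach is shown below, under the assumption that it is representative of what is meant: sort the items by the entries of the Fiedler vector (the eigenvector for the second-smallest eigenvalue) of the graph Laplacian built from the similarity matrix. The function names `spectral_seriation` and `two_sum`, and the small test matrix, are illustrative choices and do not come from the talk.

```python
import numpy as np


def spectral_seriation(similarity):
    """Order items by sorting the Fiedler vector of the similarity Laplacian."""
    A = np.asarray(similarity, dtype=float)
    # Graph Laplacian L = diag(A 1) - A; the diagonal of A cancels out.
    laplacian = np.diag(A.sum(axis=1)) - A
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigenvectors = np.linalg.eigh(laplacian)
    fiedler = eigenvectors[:, 1]
    return np.argsort(fiedler)


def two_sum(similarity, order):
    """2-sum objective: sum over i, j of A[i, j] * (pos[i] - pos[j]) ** 2."""
    A = np.asarray(similarity, dtype=float)
    order = np.asarray(order)
    pos = np.empty(len(order))
    pos[order] = np.arange(len(order))  # pos[i] = rank of item i in the order
    diff = pos[:, None] - pos[None, :]
    return float((A * diff ** 2).sum())


# Noiseless toy example: similarity decreases with distance in the true order.
n = 4
idx = np.arange(n)
true_similarity = n - np.abs(idx[:, None] - idx[None, :])

# Hide the order by shuffling rows and columns with a fixed permutation.
perm = np.array([2, 0, 3, 1])
shuffled = true_similarity[np.ix_(perm, perm)]

order = spectral_seriation(shuffled)
# Maps back to the original labels: the true order or its reverse.
print(perm[order])
```

The order is only identifiable up to reversal, since the eigenvector's sign is arbitrary. With noisy similarities the sorted Fiedler vector is only a heuristic, which is the setting where the abstract proposes a convex relaxation instead.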

16h30-16h45 QUESTIONS
16h45 Close of the day