Data Sciences program
Introduction by Nikos Paragios, head of the program
The proliferation of data management systems, along with the tremendous progress made over the past decade in computing power, has contributed to the creation of a new discipline at the intersection of computer science and applied mathematics (namely statistics, machine learning, and optimization): data science.
The main objective is to develop mathematical models, and computational solutions for them, that can reason about and interpret massive amounts of data in which the information sought is typically quite sparse. Leading innovative companies of the new digital era rely on mining, understanding, and interpreting such data for content creation, product development, and new services. For example, Google, Facebook, and Amazon rely on such technologies to build efficient real-time recommendation systems.
The complexity of the task is mostly due to three important challenges:
 The dimensionality of the data/observations, which is often huge. In other words, measurements are massive in terms of dimension, and quite often they are of different natures and quite heterogeneous.
 The sparse nature of the critical events for which one should be able to develop solutions. These events are often rare and appear in a nonuniform and rather infrequent manner.
 Last but not least, the volume of measurements in terms of observations, where a colossal amount of data is collected in a continuous fashion.
This proposal aims to introduce novel scientific methods to reason about, mine, and interpret “big data” while addressing the aforementioned challenges. To this end, machine learning, large-scale optimization, and distributed computing are the core disciplines of training in the data sciences of the new digital innovation era.
Data Sciences Program:
The data science program starts in the second year of the engineering curriculum. It is backed by the MAP option and offers the following courses:

Foundations of Deep Learning:
The advent of big data and powerful computers has made deep learning algorithms the current method of choice for a host of machine learning problems. Over the last few years, deep learning systems have outperformed the previous state-of-the-art systems by a large margin in tasks as diverse as speech recognition, image classification, and object detection.
Deep architectures are composed of multiple levels of nonlinear operations, such as in neural nets with many hidden layers. Searching the parameter space of deep architectures is a difficult task, but learning algorithms such as those for Deep Belief Networks have recently been proposed to tackle this problem with notable success. This course will discuss the motivations and principles regarding learning algorithms for deep architectures, starting from the unsupervised learning of single-layer models such as Restricted Boltzmann Machines, and moving on to learning deeper models such as Deep Belief Networks.

Foundations of Machine Learning (M1):
Machine learning is essentially the intersection between statistics and computation, though its principles have been rediscovered in many different traditions, including artificial intelligence, Bayesian statistics, and frequentist statistics. This course gives an overview of the most important trends in machine learning, with a particular focus on statistical risk and its minimization with respect to a prediction function. A substantial lab section involves group projects on data science competitions and gives students the ability to apply the course theory to real-world problems.
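As a rough illustration of risk minimization with respect to a prediction function, the sketch below fits a one-dimensional linear predictor by minimizing the empirical squared risk in closed form. The function names, the squared loss, and the toy data are assumptions chosen for illustration; they are not taken from the course material.

```python
from typing import Callable, Sequence, Tuple


def empirical_risk(predict: Callable[[float], float],
                   xs: Sequence[float], ys: Sequence[float]) -> float:
    """Average squared loss of `predict` over the sample (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_linear(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) minimizing the empirical squared risk.

    Uses the closed-form least-squares solution for one input variable.
    Assumes the xs are not all identical.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy sample generated by y = 2x + 1 (hypothetical data).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_linear(xs, ys)
risk = empirical_risk(lambda x: slope * x + intercept, xs, ys)
```

On this noise-free sample the fitted predictor recovers the generating line, so its empirical risk is zero; with noisy data the minimum would be strictly positive.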

Foundations of Signal Processing & Sparse Coding (M1):
This class will introduce the mathematical concepts and techniques needed to achieve a solid understanding of the fundamental principles of linear signal processing, as well as recent research on nonlinear signal processing, with a focus on sparse coding. Starting with the fundamentals of linear signal processing, we will see how the main notions of Fourier transforms can be understood in terms of a change of basis, and use this intuition to present both continuous and discrete time signal processing. Moving on from the harmonic basis, we will then cover the basics of overcomplete bases, time-frequency analysis, and wavelets. This will lead us to techniques developed around sparse coding with overcomplete dictionaries, involving optimization with sparsity-inducing norms and dictionary learning.
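The change-of-basis view of the Fourier transform can be sketched in a few lines: the discrete Fourier coefficients of a signal are its inner products with an orthonormal basis of complex exponentials, and the signal is recovered as the corresponding linear combination. The helper names and the test signal below are assumptions made for this illustration only.

```python
import cmath
from typing import List, Sequence


def dft_basis(n: int) -> List[List[complex]]:
    """Return n orthonormal complex-exponential basis vectors of length n."""
    scale = 1 / n ** 0.5
    return [[scale * cmath.exp(2j * cmath.pi * k * t / n) for t in range(n)]
            for k in range(n)]


def inner(u: Sequence[complex], v: Sequence[complex]) -> complex:
    """Hermitian inner product <u, v> = sum_t u[t] * conj(v[t])."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))


def analyze(signal: Sequence[complex], basis: List[List[complex]]) -> List[complex]:
    """Coordinates of `signal` in the given orthonormal basis."""
    return [inner(signal, b) for b in basis]


def synthesize(coeffs: Sequence[complex], basis: List[List[complex]]) -> List[complex]:
    """Rebuild the signal as the sum of coefficient-weighted basis vectors."""
    n = len(basis)
    return [sum(c * b[t] for c, b in zip(coeffs, basis)) for t in range(n)]


signal = [1.0, 2.0, 0.0, -1.0]  # hypothetical test signal
basis = dft_basis(len(signal))
coeffs = analyze(signal, basis)
reconstructed = synthesize(coeffs, basis)
```

Because the basis is orthonormal, the reconstruction matches the original signal up to rounding, and the energy of the coefficients equals the energy of the signal (Parseval's identity).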

Foundations of Discrete Optimization (M1):
Discrete optimization is concerned with a subset of optimization problems where some or all of the variables are confined to take a value from a discrete set. In this course, we will study the fundamental concepts of discrete optimization such as greedy algorithms, dynamic programming, and min-max relationships. Each concept will be illustrated using well-known problems such as shortest paths, minimum spanning tree, min-cut, max-flow, and bipartite matching. We will also identify which problems are easy and which problems are hard, and briefly discuss how to obtain an approximate solution to hard problems.
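To make the link between dynamic programming and shortest paths concrete, here is a minimal sketch of the Bellman-Ford recurrence on a small directed graph. The choice of Bellman-Ford, the edge-list representation, and the example graph are assumptions made for illustration; the sketch also assumes the graph has no negative-weight cycles and does not check for them.

```python
import math
from typing import List, Sequence, Tuple

Edge = Tuple[int, int, float]  # (tail, head, weight)


def bellman_ford(num_nodes: int, edges: Sequence[Edge], source: int) -> List[float]:
    """Shortest-path distances from `source` to every node.

    After round i, dist[v] is at most the length of the shortest path
    from the source to v that uses at most i edges.
    """
    dist = [math.inf] * num_nodes
    dist[source] = 0.0
    for _ in range(num_nodes - 1):
        updated = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w  # relax edge (u, v)
                updated = True
        if not updated:  # distances are stable, stop early
            break
    return dist


# Hypothetical 4-node graph.
edges = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 1.0), (2, 3, 5.0)]
distances = bellman_ford(4, edges, source=0)
```

Here the direct edge from node 0 to node 1 (weight 4) is beaten by the detour through node 2 (weight 1 + 2), which is the kind of subproblem reuse that dynamic programming exploits.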

Foundations of Neural Information Processing:
Neural information processing is the study of computational systems for data understanding. It covers a range of techniques including statistical learning theory, information theory, graphical models, and nonlinear and discrete optimization, as well as their application to important prediction problems facing science and industry. Summarizing some of the major results of the machine learning research community of the past few decades, as well as their interrelationships, this course covers fundamental techniques that can be applied to a wide variety of real-world problems.

Foundations of Geometric Methods in Data Analysis:
Data analysis is the process of cleaning, transforming, modeling, or comparing data in order to infer useful information and gain insights into complex phenomena. From a geometric perspective, when an instance (a physical phenomenon, an individual, etc.) is given as a fixed-size collection of real-valued observations, it is naturally identified with a geometric point having these observations as coordinates. This course reviews fundamental constructions related to the manipulation of such point clouds, mixing ideas from computational geometry and topology, statistics, and machine learning. The emphasis is on methods that not only come with theoretical guarantees, but also work well in practice. In particular, software references and example datasets will be provided to illustrate the constructions.

Foundations of Polyhedral Combinatorial Optimization:
Polyhedral techniques have emerged as one of the most powerful tools to analyse and solve combinatorial optimization problems. Broadly speaking, combinatorial optimization problems can be formulated as integer linear programs. In this course, we will study the fundamental concepts of polyhedral techniques such as totally unimodular matrices, matroids, and submodular functions. Each concept will be illustrated using well-known problems such as bipartite matching, min-cut, max-flow, and minimum spanning tree. The course is divided into two parts. In the first part, we will study easy problems (those that admit efficient optimal algorithms). We will use polyhedral techniques to explain why these problems are easy. In the second part, we will study hard problems (specifically, NP-hard problems). We will use polyhedral techniques to obtain provably accurate approximate solutions for various hard problems.

Foundations of Large Scale & Distributed Optimization:
Large-scale optimization problems need to be solved in a wide range of application fields (inverse problems, machine learning, computer vision, data analysis, networking, ...). The objective of this course is to introduce the theoretical background that makes it possible to develop efficient algorithms for these problems by taking advantage of modern multicore or distributed computing architectures. The course will mainly focus on nonlinear optimization tools for dealing with convex problems. Proximal tools, splitting techniques, and Majorization-Minimization strategies, which are now very popular for processing massive datasets, will be presented. These methods will be illustrated on various application examples.
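As a small illustration of a proximal tool, the sketch below runs a proximal gradient iteration on the convex problem of minimizing 0.5 * ||x - b||^2 + lam * ||x||_1, where the proximal operator of the l1 term is the soft-thresholding function. The problem, the step size, the iteration count, and all names are assumptions chosen to keep the example tiny; they are not taken from the course.

```python
from typing import List, Sequence


def soft_threshold(v: float, t: float) -> float:
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def proximal_gradient(b: Sequence[float], lam: float,
                      step: float = 0.5, iterations: int = 100) -> List[float]:
    """Minimize 0.5 * ||x - b||^2 + lam * ||x||_1 by proximal gradient steps.

    Each iteration takes a gradient step on the smooth quadratic term,
    then applies the proximal operator of the scaled l1 term.
    """
    x = [0.0] * len(b)
    for _ in range(iterations):
        grad = [xi - bi for xi, bi in zip(x, b)]  # gradient of the quadratic term
        x = [soft_threshold(xi - step * gi, step * lam)
             for xi, gi in zip(x, grad)]
    return x


b = [3.0, -0.5, 1.5, -2.0]  # hypothetical observation vector
solution = proximal_gradient(b, lam=1.0)
```

For this particular objective the minimizer is known in closed form (soft-thresholding b by lam), which makes it easy to check that the iteration converges to a sparse solution: the entry of b with magnitude below lam is set exactly to zero.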
Faculty:
 Nikos Paragios - Full professor at the department of applied mathematics of Ecole Centrale de Paris and affiliated research scientist at Inria.
 Matthew Blaschko - Affiliated associate professor at the department of applied mathematics of Ecole Centrale de Paris.
 Frédéric Cazals - Professor of applied mathematics.
 Lionel Gabet - Professor at the department of applied mathematics of Ecole Centrale de Paris.
 Iasonas Kokkinos - Associate professor at the department of applied mathematics of Ecole Centrale de Paris and affiliated research scientist at Inria.
 Pawan Kumar - Associate professor at the department of applied mathematics of Ecole Centrale de Paris and affiliated research scientist at Inria.
 Steve Oudot - Permanent research scientist at Inria and affiliated adjunct professor at the department of computer science of Ecole Polytechnique and at the department of applied mathematics of Ecole Centrale de Paris.
 Jean-Christophe Pesquet - Full professor at the department of computer science of the University of Paris-East and affiliated adjunct professor at the department of applied mathematics of Ecole Centrale de Paris.
 Émilie Chouzenoux - Assistant professor at the University of Paris-East, Champs-sur-Marne, France (LIGM, UMR CNRS 8049).
Job opportunities:
The application areas of data science are very broad:
 the digital sector,
 health and biotechnology,
 finance,
 marketing,
 robotics,
 insurance...