Written by Maxime Goin and Le Thy Nguyen, Embassy of Switzerland in France
Data scientists are a scarce and valuable commodity. A Gartner study three years ago was already sounding the alarm: if public and private education systems did not adapt quickly, there would not be enough talent in the industry and only one-third of the IT jobs would be filled. In Europe, French schools were on the front line in providing a new educational framework for “data scientists”. In September 2013, Telecom ParisTech was the first institution in France to launch a “Big Data” Post-Master’s Degree program. Stéphan Clémençon, teacher-researcher at Telecom ParisTech and holder of the “Machine Learning for Big Data” Chair, tells us more about the development of this program and the new challenges ahead.
Why was there a need (in 2013) to create a Master’s program in that field?
At Telecom ParisTech, we have been working for a long time with companies in a wide variety of sectors. Indeed, nearly 100% of the research activity in our school is funded by the private sector. This gives us valuable insight into the needs that our partners have in terms of knowledge and skills. As a matter of fact, companies in the industrial and service sectors have challenged us to produce research results that could help them tackle important issues in the Big Data era. These issues include predictive maintenance, product recommendation, text and voice analysis, social network analysis, etc. These very diverse issues all have something in common: they rely on the gathering, statistical analysis and interpretation of massive amounts of data, originating from industrial sensors (fitted to engines, for example), internal sources (customer relationship management…) or the web (social networks, web traffic data, ranking data).
These issues can be cast as statistical machine learning problems that we have been working on for many years, and already back in 2013 industrialists realized how important machine-learning methodology and algorithms are for concrete applications. As the framework of these applications (computational constraints, nature of the data, need for real-time analysis, etc.) often raises challenging new questions for academics, we decided to start a research and education Chair, entitled “Machine Learning for Big Data”, together with four industrial partners: PSA Peugeot Citroën, Safran, BNP Paribas and Criteo. Three training programs were created simultaneously to meet our partners’ needs as quickly as possible: a Data Scientist Certificate (a 22-day program over 10 months), a seven-week MOOC and a Post-Master’s Degree, consisting of about 700 hours of training over 9 months, plus a 6-month internship and a professional thesis.
What are the specificities of this program? How does it stand out from all the “Big data education programs” in Europe?
Telecom ParisTech’s “Big Data” Post-Master’s is the first program of this kind to have been created in France. It relies on the skills of about a dozen research professors who have been working on Big Data issues for years. It is highly multidisciplinary: of course, the focus is on applied maths, statistics and computer science, with courses in machine learning, large-scale data mining, Hadoop, SQL and NoSQL, distributed networks, cyber security, data visualization… But there is also a whole Economics and Social Sciences component that we think is very important, including law (personal data regulations), economics, econometrics, the big data ecosystem…
Another important specificity is how closely we work with companies. Every week, one company, ranging from international brands like Total to start-ups like Dataiku, comes to give the students a seminar on how it is currently using big data technologies and what its current challenges are. These companies also provide us with real data and questions on which the students work for most of the year. These issues arise again in the 6-month internship. This proximity to a wide range of companies is very valuable for the students, as it gives them real-life data challenges and many job opportunities.
The Post-Master’s program was launched 2 years ago; what is the assessment so far?
Our program is now a reference in its field in France; it won a Faculty award from IBM and was number one in the SMBG ranking in 2015. Our graduates have a very high employment rate: most of them are offered a permanent job following or instead of their internship. We have just started to train the 32 students accepted for the program’s third year, and we receive more applications every year.
Is France a leader in this field and why? If not, what initiatives can be taken? And by whom: government, schools or private industry?
In France, as in most European countries, the understanding of the possible deep transformations/innovations brought by big data technologies came later than in the United States. But we are catching up. First of all, as I mentioned, though our education programs do not all bear the words “Big Data” or “Data science”, we have a very high academic level, in particular in applied mathematics and statistics: numerous math students find a position every year within American firms.
Also, the French government has understood the importance of Big Data, specifically including it in different research and industrial programs such as the “34 plans for the New Industrial France”, the “World Innovation Contest 2020”, the “France-Europe Research Strategy 2020” and the “Industry of the Future” Program. These are very good boosters for the data economy and they favor both academic research and private investment.
In France we also have fertile ground for start-ups, with a lot of incubators all over the country. Most of these start-ups rely on Big Data technologies. Taking Telecom ParisTech’s incubator alone, about 12 start-ups were involved in this field in 2015. In addition to that, major French companies have all developed a data strategy. In our school, we have 3 Research Chairs in that field, involving 14 leading companies.
What are the near future challenges? How can Telecom ParisTech tackle them?
Of course there are many. From my point of view, there is one economic and one societal challenge. The former is set by companies, which need to hire many data scientists. However, there are not yet enough programs that train seasoned engineers for this particular position, engineers who can communicate equally well with IT departments and with the core business. This is an economic challenge because, as we enter this new data age, knowing how to handle and take advantage of data becomes a strategic benefit for companies.
The other challenge I want to point out is societal. Engineers, statisticians and computer scientists now know well what “Big Data” stands for. Yet there are people who hear about Big Data almost every day and for whom it often remains obscure, not to say worrying. We have already started a new stage by creating three training sessions aimed at CEOs, to raise their awareness of Big Data opportunities and issues. For the general public, we organize free conferences on various topics such as personal data, the Internet of things… and write articles for mainstream media. We issue leaflets and magazines and get involved in various kinds of events. It is our role, as a public sector school, to make sure technological progress is understood by all and can play its part in the development of our economy and in the improvement of our public services such as health, transportation and education.
Stéphan Clémençon has been a Teacher-Researcher at Telecom ParisTech since October 2007. He is a member of the TSI Department (Image and Signal Processing) and works in the LTCI Lab (Communication and Information Theory) UMR Institut Telecom/CNRS N° 5141. His main research contributions are in the fields of Markov processes, nonparametric statistics and statistical machine learning. He holds the “Machine Learning for Big Data” Chair of Telecom ParisTech. He is in charge of the “Big Data” Post-Master’s Degree program and of the “Data Scientist” Certificate. Before that, he worked as a Teacher-Researcher at Paris X University (2000-2005) and as a researcher (Chief Statistics) in the INRA Met@risk Research Unit (2005-07). He was also a member of the LPMA lab (Stochastic Modeling and Probability) of Universities Paris 6 and Paris 7 UMR CNRS N° 7799. He received his university education from University Paris 7 Denis Diderot (PhD in Applied Maths, visiting the Department of Statistics of Stanford University, 1997-1998).