Statistical methods and machine learning: core skills to master
Christophe Croux is a Professor of Data Science at EDHEC Business School. His current research involves the detection of anomalies in large and complex data sets and the development of forecasting methods for big data. He teaches Statistical Methods and Machine Learning as part of the MSc in Data Analytics & Artificial Intelligence.
What can you tell us about your field of expertise?
My field of expertise is twofold. I’m developing methods that are outlier-resistant and studying methods for the prediction of high-dimensional time series.
Is there a must-read for students on your domain of expertise?
No, the field evolves too rapidly. You do not read books. I have never read a book on data science in my entire professional career!
But there are a few prerequisites to performing well in my classes: you must master mathematical notation and probability, and have an analytical mindset.
Why is it important for Data Analytics & Artificial Intelligence students to understand outlier detection in data sets? Can you give us a concrete example of its impact?
The results of a regression analysis may become useless if there are several outliers in the data. This is because traditional methods give too much importance to these outliers, when it is better to downweight them. This is what we do when we propose methods that are resistant to outliers. Such methods may also detect the outliers.
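To make the idea of downweighting concrete, here is a minimal Python sketch. It compares ordinary least squares with a Huber-type estimator fitted by iteratively reweighted least squares. The Huber estimator is only one well-known outlier-resistant method, chosen here for illustration, and is not necessarily the one Professor Croux develops. The simulated data, the function names and the cutoff of 3 are all assumptions.

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares: every observation counts equally."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def huber_fit(X, y, c=1.345, n_iter=50):
    """Huber-type fit via iteratively reweighted least squares (illustrative)."""
    coef = ols_fit(X, y)  # start from the non-robust fit
    for _ in range(n_iter):
        resid = y - X @ coef
        # Robust residual scale: median absolute deviation, rescaled for normal errors.
        scale = max(np.median(np.abs(resid - np.median(resid))) / 0.6745, 1e-12)
        abs_r = np.maximum(np.abs(resid), 1e-12)
        # Small residuals keep weight 1; large residuals are downweighted.
        weights = np.minimum(1.0, c * scale / abs_r)
        sw = np.sqrt(weights)
        coef = ols_fit(X * sw[:, None], y * sw)
    return coef, weights, scale

# Simulated data (assumption): true line y = 2 + 3x, with five shifted points.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(0, 0.5, size=50)
outlier_idx = np.arange(45, 50)
y[outlier_idx] += 40.0
X = np.column_stack([np.ones_like(x), x])

ols_coef = ols_fit(X, y)
huber_coef, weights, scale = huber_fit(X, y)

# Points whose standardized residual from the robust fit exceeds the cutoff
# are flagged as potential outliers.
flagged = np.flatnonzero(np.abs(y - X @ huber_coef) / scale > 3.0)

print("OLS slope:  ", ols_coef[1])
print("Huber slope:", huber_coef[1])
print("Flagged observations:", flagged)
```

In this sketch the least squares slope is pulled away from 3 by the five shifted points, while the robust fit stays close to it and the shifted points show up among the flagged observations.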
What are the key concepts that students will learn in your classes?
Variability in the data is often underestimated. In my Statistical Methods class, students learn how to quantify uncertainty. Instead of a single prediction, for instance, you get a full prediction interval.
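As an illustration of a prediction interval, the sketch below uses the textbook formula for simple linear regression. It is a standard construction and not necessarily the one taught in the course. The simulated data and the function name are assumptions, and the normal quantile is used as a large-sample stand-in for the Student t quantile.

```python
import numpy as np
from statistics import NormalDist

def prediction_interval(x, y, x0, level=0.95):
    """Point prediction and approximate prediction interval at x0."""
    n = len(x)
    x_bar = x.mean()
    sxx = np.sum((x - x_bar) ** 2)
    slope = np.sum((x - x_bar) * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x_bar
    resid = y - (intercept + slope * x)
    # Residual standard deviation with n - 2 degrees of freedom.
    s = np.sqrt(np.sum(resid ** 2) / (n - 2))
    # Assumption: normal quantile instead of the t quantile (fine for large n).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    y0_hat = intercept + slope * x0
    # The "1 +" term accounts for the noise in the new observation itself.
    half_width = z * s * np.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    return y0_hat, y0_hat - half_width, y0_hat + half_width

# Simulated data (assumption): y = 1 + 2x plus standard normal noise.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=200)
y = 1.0 + 2.0 * x + rng.normal(0, 1.0, size=200)

point, lower, upper = prediction_interval(x, y, x0=5.0)
print(f"Prediction at x0 = 5: {point:.2f}")
print(f"95% prediction interval: [{lower:.2f}, {upper:.2f}]")
```

The interval is much wider than the uncertainty of the fitted line alone, because a new observation carries its own noise on top of the estimation error.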
On completion of my course in Machine Learning, they will have learned how to make predictions when the number of predictor variables is extremely large and how to measure prediction performance. The latter is not so easy: you are predicting values that you do not know yet, so you cannot observe the prediction error directly.
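One common way to handle both problems is sketched below: ridge regression for the case of more predictors than observations, and k-fold cross-validation to estimate the error on data the model has not seen. Both are standard techniques picked for illustration. The interview does not say which methods the course covers, and the simulated data, function names and penalty grid are assumptions.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge coefficients (no intercept; data are assumed centered).

    Uses the identity beta = X^T (X X^T + lam I)^{-1} y, which only needs
    an n-by-n solve and so stays cheap when p is much larger than n.
    """
    n = X.shape[0]
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    return X.T @ alpha

def cv_mse(X, y, lam, k=5, seed=0):
    """Estimate out-of-sample mean squared error by k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        beta = ridge_fit(X[train_idx], y[train_idx], lam)
        # Error is measured only on observations the fit has not seen.
        errors.append(np.mean((y[test_idx] - X[test_idx] @ beta) ** 2))
    return float(np.mean(errors))

# Simulated data (assumption): 60 observations, 500 predictors, 5 of them relevant.
rng = np.random.default_rng(0)
n, p = 60, 500
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:5] = 1.0
y = X @ beta_true + rng.normal(size=n)

penalty = 1.0
train_mse = float(np.mean((y - X @ ridge_fit(X, y, penalty)) ** 2))
cv_error = cv_mse(X, y, penalty)
print("In-sample MSE:      ", train_mse)
print("Cross-validated MSE:", cv_error)

# Pick the penalty with the smallest cross-validated error from a small grid.
penalties = [0.1, 1.0, 10.0, 100.0, 1000.0]
cv_errors = {pen: cv_mse(X, y, pen) for pen in penalties}
best_penalty = min(cv_errors, key=cv_errors.get)
print("Best penalty on this grid:", best_penalty)
```

With far more predictors than observations, the in-sample error is close to zero and says almost nothing about future performance, which is why the held-out estimate is the one to report.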
What are the key skills that students will gain in your classes?
Our students will learn how to apply statistical thinking and how to work with Python, a programming language that requires precision and care. We chose Python because most industries use it.
What do you expect them to have mastered on completion of your classes?
I expect them to know how to conduct a data analysis in a sound way and report the results to non-experts. I often insist on using short sentences when making a PowerPoint presentation. Communication skills are important.
Within a year, we bring students from zero to a reasonably good level of applying data analytical methods. Their master’s projects are good examples of what they can deliver, and they are pretty good. This MSc programme is very effective!