Course presentation


The goal of this course is to teach machine learning with scikit-learn to beginners, even without a strong technical background.

Predictive modeling brings value to a wide variety of data, in business intelligence, health, industrial processes and scientific discoveries. It is a pillar of modern data science. In this field, scikit-learn is a central tool: it is easily accessible, yet powerful, and dovetails naturally with the wider ecosystem of data-science tools based on the Python programming language.

This course is an in-depth introduction to predictive modeling with scikit-learn. Step-by-step and didactic lessons introduce the fundamental methodological and software tools of machine learning. As such, the course is a stepping stone to more advanced challenges in artificial intelligence, text mining, or data science.

The course is more than a cookbook: it will teach you to be critical about each step of the design of a predictive modeling pipeline, from choices in data preprocessing, to choosing models, gaining insights on their failure modes and interpreting their predictions.

Follow the MOOC

This document is a work in progress. You can register for the "Machine learning in Python with scikit-learn MOOC", which is based on this document. Note: registration is already open, but the MOOC itself will start on May 18, 2021.
The MOOC is free, and the platform does not use student data for any purpose other than improving the educational material.


The course aims to be accessible without a strong technical background. The requirements for this course are:

  • basic knowledge of Python programming: defining variables, writing functions, importing modules

  • some prior experience with the NumPy, pandas and Matplotlib libraries is recommended but not required.
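As a rough illustration of the level of Python assumed by the first requirement, here is a minimal sketch that defines variables, writes a function and imports a module. The names `ages`, `threshold` and `count_above` are hypothetical and chosen only for this example; they are not part of the course material.

```python
# Importing a module (here, one from the Python standard library)
import statistics

# Defining variables (hypothetical example data)
ages = [25, 32, 47, 51]
threshold = 40


# Writing a function
def count_above(values, limit):
    """Return how many values are strictly greater than limit."""
    return sum(1 for value in values if value > limit)


# Using the imported module and the function defined above
mean_age = statistics.mean(ages)
n_above = count_above(ages, threshold)
print(f"mean age: {mean_age}, above {threshold}: {n_above}")
```

If you can read this snippet and predict what it prints, you most likely have enough Python background to start the course.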

For a quick introduction to these requirements, you can go through these course materials or use the following resources:

MOOC material

The MOOC material, including the notebooks, exercises and solutions to the exercises (but not the solutions for the quiz ;), is developed publicly under the CC-BY license via the following GitHub repository:

This is also published as a static website at:

You can use the rocket icon at the top of each notebook page to interactively execute the code cells via the Binder service.

Note, however, that you must use the version hosted on the fun-mooc platform to complete the quiz.