Introduction#

Course presentation#

Welcome!

The goal of this course is to teach machine learning with scikit-learn to beginners, even without a strong technical background.

Predictive modeling brings value to a wide variety of data, in business intelligence, health, industrial processes and scientific discovery. It is a pillar of modern data science. In this field, scikit-learn is a central tool: it is easily accessible, yet powerful, and dovetails naturally with the wider ecosystem of data-science tools based on the Python programming language.

This course is an in-depth introduction to predictive modeling with scikit-learn. Its step-by-step, didactic lessons introduce the fundamental methodological and software tools of machine learning, making the course a stepping stone to more advanced challenges in artificial intelligence, text mining, or data science.

The course is more than a cookbook: it will teach you to be critical about each step in the design of a predictive modeling pipeline, from choices in data preprocessing to choosing models, gaining insight into their failure modes, and interpreting their predictions.

Follow the MOOC

The latest version of the MOOC, "Machine learning in Python with scikit-learn", is available for self-paced learning and is continuously updated to run with the latest version of scikit-learn. Enroll for the full MOOC experience (quiz solutions, executable notebooks, discussion forum, etc.)!
The MOOC is free, and the platform does not use student data for any purpose other than improving the educational material.

Prerequisites#

The course aims to be accessible without a strong technical background. The requirements for this course are:

basic knowledge of Python programming: defining variables, writing functions, importing modules;
some prior experience with the NumPy, pandas and Matplotlib libraries, which is recommended but not required.
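As a rough self-check of the first prerequisite, the short sketch below touches each of the three listed skills: importing a module, defining variables, and writing a function. It uses only the Python standard library; the names `radius`, `label` and `circle_area` are illustrative and do not come from the course material.

```python
import math  # importing a module from the standard library

# Defining variables (illustrative names, not from the course)
radius = 2.0
label = "circle"


# Writing a function
def circle_area(r):
    """Return the area of a circle of radius r."""
    return math.pi * r ** 2


# Calling the function and formatting the result
area = circle_area(radius)
print(f"The {label} of radius {radius} has an area of {area:.2f}")
```

If every line of this snippet reads naturally to you, the basic Python requirement is likely covered.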

For a quick introduction to these prerequisites, you can go through these course materials or use the following resources:

MOOC material#

The MOOC material is developed publicly under the CC-BY license.

You can cite us through the project’s Zenodo archive using the following DOI: 10.5281/zenodo.7220306.

The following repository includes the notebooks, exercises and solutions to the exercises (but not the quizzes’ solutions ;):

INRIA/scikit-learn-mooc

The MOOC material is also published as a static website at:

https://inria.github.io/scikit-learn-mooc/

You can use the rocket icon at the top of each notebook page to execute the code cells interactively via the Binder service.

The videos are available as a YouTube playlist on the Inria Learning Lab channel:

https://www.youtube.com/playlist?list=PL2okA_2qDJ-m44KooOI7x8tu85wr4ez4f

Note, however, that you must use the version hosted on the fun-mooc platform to complete the quizzes.