Main take-away

Wrap-up

In this module, we presented the clustering framework for unsupervised learning, focusing on k-means and on how to evaluate its results with both unsupervised and supervised metrics.
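As a reminder of what this looks like in practice, here is a minimal sketch. It assumes a synthetic dataset built with `make_blobs` (the centers and `cluster_std` are illustrative choices, not the module's data), uses the silhouette score as the unsupervised metric and the adjusted Rand index (ARI) as the supervised one.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score

# Illustrative assumption: three well-separated, convex blobs.
centers = [[-5, -5], [0, 0], [5, 5]]
X, y_true = make_blobs(
    n_samples=300, centers=centers, cluster_std=0.6, random_state=0
)

# Fit k-means and get one cluster label per sample.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Unsupervised metric: only needs the data and the predicted labels.
silhouette = silhouette_score(X, labels)

# Supervised metric: compares predicted labels to known ground truth,
# and is invariant to how the cluster labels are numbered.
ari = adjusted_rand_score(y_true, labels)

print(f"Silhouette score: {silhouette:.3f}")
print(f"Adjusted Rand index: {ari:.3f}")
```

The silhouette score can be computed on any dataset, whereas the ARI requires ground-truth labels, which are rarely available in a real clustering task.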

We explored the concept of cluster stability, addressed the limitations of k-means when clusters are not convex, and introduced HDBSCAN as an alternative.

Finally, we showed how clustering can be integrated into supervised pipelines to perform unsupervised feature engineering.
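One possible way to sketch this idea: `KMeans` exposes a `transform` method that returns the distance of each sample to every centroid, so it can be used as a feature-engineering step in a pipeline. The dataset, the number of clusters, and the choice of `LogisticRegression` below are illustrative assumptions, not the setup used in the module.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative assumption: a binary classification task that is not
# linearly separable in the original feature space.
X, y = make_moons(n_samples=500, noise=0.1, random_state=0)

# Inside the pipeline, KMeans acts as a transformer: each sample is replaced
# by its distances to the 10 learned centroids, and the linear classifier is
# trained on those distances instead of the raw coordinates.
model = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=10, n_init=10, random_state=0),
    LogisticRegression(max_iter=1000),
)

# The clustering step is refit on each training fold, which avoids leaking
# information from the validation fold.
scores = cross_val_score(model, X, y, cv=5)
print(f"Mean accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Because the clustering step sits inside the pipeline, its number of clusters can be tuned like any other hyperparameter, for instance with `GridSearchCV` and the parameter name `kmeans__n_clusters`.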

To go further

The following scikit-learn examples relate to the concepts covered in this module: