📃 Solution of Exercise M6.01

The aim of this notebook is to investigate whether we can tune the hyperparameters of a bagging regressor and to evaluate the gain obtained.

We will load the California housing dataset and split it into a training and a testing set.

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

data, target = fetch_california_housing(as_frame=True, return_X_y=True)
target *= 100  # rescale the target in k$
data_train, data_test, target_train, target_test = train_test_split(
    data, target, random_state=0, test_size=0.5)

Note

If you want a deeper overview of this dataset, you can refer to the Appendix - Datasets description section at the end of this MOOC.

Create a BaggingRegressor and provide a DecisionTreeRegressor to its parameter base_estimator. Train the regressor and evaluate its statistical performance on the testing set using the mean absolute error.

from sklearn.metrics import mean_absolute_error
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor

tree = DecisionTreeRegressor()
bagging = BaggingRegressor(base_estimator=tree, n_jobs=2)
bagging.fit(data_train, target_train)
target_predicted = bagging.predict(data_test)
print(f"Basic mean absolute error of the bagging regressor:\n"
      f"{mean_absolute_error(target_test, target_predicted):.2f} k$")
Basic mean absolute error of the bagging regressor:
36.38 k$
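For reference, it can be instructive to compare the bagged ensemble against a single decision tree fitted on the same data. The sketch below uses a synthetic dataset from make_regression as a stand-in for the California housing data, so the exact error values differ from those in this notebook; the point is only the relative comparison.

```python
# Sketch: bagged trees vs. a single decision tree on a synthetic
# regression task (make_regression stands in for the housing data).
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=2_000, n_features=8, noise=10.0,
                       random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# a single fully-grown tree tends to overfit the noise
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
# bagging averages many such trees, reducing the variance
bagging = BaggingRegressor(DecisionTreeRegressor(), n_estimators=20,
                           random_state=0).fit(X_train, y_train)

mae_tree = mean_absolute_error(y_test, tree.predict(X_test))
mae_bagging = mean_absolute_error(y_test, bagging.predict(X_test))
print(f"single tree MAE: {mae_tree:.1f}")
print(f"bagging MAE:     {mae_bagging:.1f}")
```

On such noisy data the averaged ensemble typically reaches a lower mean absolute error than the lone tree, which is the effect the rest of this notebook quantifies on the housing data.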

Now, create a RandomizedSearchCV instance using the previous model and tune the important parameters of the bagging regressor. Find the best parameters and check whether you can improve on the default regressor, still using the mean absolute error as the metric.

Tip

You can list the bagging regressor’s parameters using the get_params method.

for param in bagging.get_params().keys():
    print(param)
base_estimator__ccp_alpha
base_estimator__criterion
base_estimator__max_depth
base_estimator__max_features
base_estimator__max_leaf_nodes
base_estimator__min_impurity_decrease
base_estimator__min_impurity_split
base_estimator__min_samples_leaf
base_estimator__min_samples_split
base_estimator__min_weight_fraction_leaf
base_estimator__random_state
base_estimator__splitter
base_estimator
bootstrap
bootstrap_features
max_features
max_samples
n_estimators
n_jobs
oob_score
random_state
verbose
warm_start
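The double-underscore names in this listing, such as base_estimator__max_depth, are how scikit-learn routes a parameter to the nested estimator. As a small illustration of the convention (not part of the exercise itself), we can set such a parameter directly with set_params; note that recent scikit-learn versions renamed the inner-estimator parameter from base_estimator to estimator, so the sketch picks whichever prefix the installed version exposes.

```python
# Sketch: the "<prefix>__<param>" naming exposed by get_params lets
# us set a parameter of the nested estimator via set_params.
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

bagging = BaggingRegressor(DecisionTreeRegressor())
# "estimator" replaced "base_estimator" in scikit-learn 1.2
prefix = ("estimator" if "estimator" in bagging.get_params()
          else "base_estimator")
bagging.set_params(**{f"{prefix}__max_depth": 5})
print(bagging.get_params()[f"{prefix}__max_depth"])  # 5
```

The same names are what we put as keys in the parameter distributions of the randomized search below.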
from scipy.stats import randint
from sklearn.model_selection import RandomizedSearchCV

param_grid = {
    "n_estimators": randint(10, 30),
    "max_samples": [0.5, 0.8, 1.0],
    "max_features": [0.5, 0.8, 1.0],
    "base_estimator__max_depth": randint(3, 10),
}
search = RandomizedSearchCV(
    bagging, param_grid, n_iter=20, scoring="neg_mean_absolute_error"
)
_ = search.fit(data_train, target_train)
import pandas as pd

columns = [f"param_{name}" for name in param_grid.keys()]
columns += ["mean_test_score", "std_test_score", "rank_test_score"]
cv_results = pd.DataFrame(search.cv_results_)
cv_results = cv_results[columns].sort_values(by="rank_test_score")
cv_results["mean_test_score"] = -cv_results["mean_test_score"]
cv_results
param_n_estimators param_max_samples param_max_features param_base_estimator__max_depth mean_test_score std_test_score rank_test_score
15 21 1.0 0.8 8 40.030884 1.273278 1
1 14 0.5 0.8 8 41.353784 1.116039 2
6 21 1.0 0.8 7 41.799669 1.521985 3
17 22 0.8 0.8 7 42.216457 1.072362 4
12 21 0.5 0.8 7 42.388688 1.177851 5
19 25 1.0 1.0 7 42.746657 1.276234 6
18 15 0.5 1.0 7 43.014306 1.303823 7
4 24 1.0 1.0 7 43.048924 0.969746 8
2 14 0.5 1.0 7 43.272181 1.227020 9
7 17 0.5 1.0 6 45.258098 1.229740 10
11 15 1.0 1.0 6 45.467947 1.198316 11
14 14 0.5 0.5 7 46.188176 1.689583 12
16 11 0.5 0.8 5 48.486368 1.007336 13
10 11 1.0 0.8 5 49.332554 1.022812 14
13 24 1.0 1.0 4 51.860316 1.123646 15
8 29 0.5 0.8 4 52.360165 1.393773 16
3 15 0.5 0.5 5 52.491499 2.411352 17
0 10 0.5 1.0 3 56.769980 1.116707 18
5 27 1.0 0.8 3 57.312467 1.262450 19
9 14 0.5 0.8 3 58.037170 0.844139 20
target_predicted = search.predict(data_test)
print(f"Mean absolute error after tuning of the bagging regressor:\n"
      f"{mean_absolute_error(target_test, target_predicted):.2f} k$")
Mean absolute error after tuning of the bagging regressor:
40.66 k$
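Beyond the cv_results_ table shown above, a fitted RandomizedSearchCV also exposes the winning combination directly via best_params_ and its cross-validated score via best_score_ (negated here, since we optimize neg_mean_absolute_error). The sketch below demonstrates this on a small synthetic dataset so it runs quickly; it is not the search fitted on the housing data above, and it handles the base_estimator/estimator rename across scikit-learn versions.

```python
# Sketch: accessing best_params_ / best_score_ after a randomized
# search over both ensemble-level and nested tree-level parameters.
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=5, noise=5.0,
                       random_state=0)

bagging = BaggingRegressor(DecisionTreeRegressor(), random_state=0)
# the inner-estimator prefix is "estimator" from scikit-learn 1.2 on
prefix = ("estimator" if "estimator" in bagging.get_params()
          else "base_estimator")
param_distributions = {
    "n_estimators": randint(10, 30),
    f"{prefix}__max_depth": randint(3, 10),
}
search = RandomizedSearchCV(
    bagging, param_distributions, n_iter=5,
    scoring="neg_mean_absolute_error", random_state=0,
)
search.fit(X, y)
print(search.best_params_)
print(f"best CV MAE: {-search.best_score_:.1f}")
```

Calling predict on the fitted search object, as done above, already uses the estimator refitted with these best parameters on the full training set.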

We see that the bagging regressor does not need much hyperparameter tuning: its default parameters already give a good predictor, so tuning is far less important than when fitting a single decision tree.