# XGBoost hyperparameter tuning in Python using grid search

Fortunately, XGBoost implements the scikit-learn API, so its hyperparameters can be tuned with standard scikit-learn tools such as `GridSearchCV`.

I assume that you have already preprocessed the dataset and split it into training and test sets, so I will focus only on the tuning part.

First, we have to import `XGBClassifier` from XGBoost and `GridSearchCV` from scikit-learn.

from xgboost import XGBClassifier
from sklearn.model_selection import GridSearchCV

After that, we have to specify the constant parameters of the classifier. We need the objective. In this case, I use the “binary:logistic” function because I train a classifier which handles only two classes. Additionally, I specify the number of threads to speed up the training, and the seed for a random number generator, to get the same results in every run.

estimator = XGBClassifier(
    objective='binary:logistic',
    nthread=4,
    seed=42
)

In the next step, I specify the tunable parameters and the values to try for each of them.

parameters = {
    'max_depth': range(2, 10),
    'n_estimators': range(60, 220, 40),
    'learning_rate': [0.1, 0.01, 0.05]
}
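
To see exactly which values this grid covers, the ranges can be expanded by hand. The snippet below is a small sketch that uses only the standard library. It repeats the grid so that it runs on its own, and the names `expanded` and `n_candidates` are mine, not part of any library.

```python
from itertools import product

# Same grid as above, repeated so the snippet is self-contained.
parameters = {
    'max_depth': range(2, 10),
    'n_estimators': range(60, 220, 40),
    'learning_rate': [0.1, 0.01, 0.05],
}

# range() excludes its upper bound, so 10 and 220 are never tried.
expanded = {name: list(values) for name, values in parameters.items()}
for name, values in expanded.items():
    print(name, values)
# max_depth [2, 3, 4, 5, 6, 7, 8, 9]
# n_estimators [60, 100, 140, 180]
# learning_rate [0.1, 0.01, 0.05]

# A grid search tries every combination of the listed values.
n_candidates = len(list(product(*expanded.values())))
print(n_candidates)  # 8 * 4 * 3 = 96
```

Every extra value multiplies the number of candidates, so it is worth checking this count before starting a long search.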

In the last setup step, I configure the GridSearchCV object. I choose the best hyperparameters using the ROC AUC metric to compare the results of 10-fold cross-validation.

grid_search = GridSearchCV(
    estimator=estimator,
    param_grid=parameters,
    scoring='roc_auc',
    n_jobs=10,
    cv=10,
    verbose=True
)

Now, we can do the training.

grid_search.fit(X, Y)

Here are the results:

Fitting 10 folds for each of 96 candidates, totalling 960 fits
[Parallel(n_jobs=10)]: Using backend LokyBackend with 10 concurrent workers.
[Parallel(n_jobs=10)]: Done 30 tasks | elapsed: 11.0s
[Parallel(n_jobs=10)]: Done 180 tasks | elapsed: 40.1s
[Parallel(n_jobs=10)]: Done 430 tasks | elapsed: 1.7min
[Parallel(n_jobs=10)]: Done 780 tasks | elapsed: 3.1min
[Parallel(n_jobs=10)]: Done 960 out of 960 | elapsed: 4.0min finished

The `best_estimator_` field contains the best model found by the grid search. With the default `refit=True`, it has already been retrained on all the data passed to `fit`.

grid_search.best_estimator_
