hgboost’s documentation!

The Hyperoptimized Gradient Boosting library (hgboost) is a Python package for hyperparameter optimization of XGBoost, LightBoost, and CatBoost models. hgboost carefully splits the dataset into a train-test set and an independent validation set. Within the train-test set, an inner loop optimizes the hyperparameters using Bayesian optimization (based on Hyperopt), while an outer loop tests how well the best-performing models generalize using an external k-fold cross-validation. This approach selects the most robust model with the highest performance.
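
A minimal sketch of this pipeline for a binary classification task is shown below. The parameter values are illustrative, and the scikit-learn breast-cancer dataset is used only as example input:

from sklearn.datasets import load_breast_cancer
from hgboost import hgboost

# Load a small binary-classification dataset as a DataFrame
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target.values

# Illustrative settings: 250 Hyperopt evaluations in the inner loop,
# 5-fold cross-validation in the outer loop, a 20% train-test split,
# and a 20% independent validation set.
hgb = hgboost(max_eval=250, cv=5, test_size=0.2, val_size=0.2, random_state=42)

# Run the nested hyperparameter optimization for an XGBoost classifier
results = hgb.xgboost(X, y, pos_label=1)

# Predict with the best-performing model
y_pred, y_proba = hgb.predict(X)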

hgboost is fun because:

    1. It includes three of the most popular gradient boosted decision tree algorithms: XGBoost, LightBoost, and CatBoost.

    2. It uses the most popular hyperparameter optimization library for Bayesian optimization: Hyperopt.

    3. It automatically splits the data set into a train-test set and an independent validation set to reliably determine the model performance.

    4. The pipeline has a nested scheme with an inner loop for hyperparameter optimization and an outer loop with k-fold cross-validation to determine the most robust and best-performing model.

    5. It can handle both classification and regression tasks.

    6. It is easy to go wild and create a multi-class model or an ensemble of boosted decision tree models (see the sketch after this list).

    7. It takes care of unbalanced datasets.

    8. It aims to make the results explainable by creating insightful plots of the hyperparameter search-space and the model performance.

    9. It is open-source.

    10. It is documented with many examples.
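
For instance, an ensemble over the three boosting libraries uses the same interface. A minimal sketch, assuming the hgb object and the X, y data from the example above:

# Combine XGBoost, LightBoost and CatBoost into one ensemble classifier
results = hgb.ensemble(X, y, pos_label=1)

# The plots work the same way for the ensemble results
hgb.plot()             # overview of all hyperparameter evaluations
hgb.plot_validation()  # performance on the independent validation set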

Figure: schematic overview of the hgboost pipeline.

Star is important too!

If you like this project, star the repo on the GitHub page! This is important because it is the only way for me to know how much you like it :)

Quick install

pip install hgboost
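
A quick sanity check after installing (assuming the package exposes a __version__ attribute):

python -c "import hgboost; print(hgboost.__version__)"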

GitHub

The source code is available at the hgboost GitHub page. Please report bugs, issues, and feature requests there.

Citing hgboost

The BibTeX entry can be found in the right-hand side menu of the GitHub page.

Content

Installation
