zzsi/gbt

gbt is a library for gradient boosted trees with minimal coding required. It is a thin wrapper around lightgbm.

What you need:

  • a pandas dataframe,
  • the target column to predict,
  • categorical feature columns (can be empty),
  • numerical feature columns (can be empty, but you should have at least one categorical or numerical feature),
  • params_preset: one of "binary", "multiclass", "mape", or "l2", which selects the prediction objective and the default hyperparameters.

You don't need to (though you are welcome to):

  • normalize the numerical feature values
  • construct an encoder to one-hot encode categorical features
  • manage saving of the artifacts for the above feature transformations
  • implement evaluation metrics

Install

pip install gbt

Quickstart

import pandas as pd

from gbt import TrainingPipeline  # assuming TrainingPipeline is exported at the top level of the gbt package

train_df = pd.DataFrame(
    {
        "a": [1, 2, 3, 4, 5, 6, 7],
        "b": ["a", "b", "c", None, "e", "f", "g"],
        "c": [1, 0, 1, 1, 0, 0, 1],
        "some_other_column": [0, 0, None, None, None, 3, 3],
    }
)

class DatasetBuilder:
    def training_dataset(self):
        return train_df
    
    def testing_dataset(self):
        return train_df  # TODO: use an actual test dataset

TrainingPipeline(
    params_preset="binary",  # one of mape, l2, binary, multiclass
    params_override={"num_leaves": 10},
    label_column="c",
    val_size=0.2,  # fraction of the data held out for validation
    categorical_feature_columns=["b"],
    numerical_feature_columns=["a"],
).fit(DatasetBuilder())
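
In this example, column "b" is passed as raw strings (including a missing value) and is encoded internally, while "some_other_column" is ignored because it is not listed as a feature column.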

Output of training

The output includes:

  • the model file (decision trees and boosting coefficients),
  • a feature transformer state file (in JSON), used to apply the same feature transformations to new data at prediction time.

Using the model to predict on new data

Under construction.
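
In the meantime, here is a minimal sketch of scoring new rows with lightgbm directly. It assumes the saved model file is a standard LightGBM text model and that the new rows have already been passed through the saved feature transformer; the file name below is hypothetical.

import lightgbm as lgb
import pandas as pd

# Hypothetical path: the trained booster saved as a LightGBM text model file.
booster = lgb.Booster(model_file="model.txt")

# New rows must receive the same feature transformation that was applied during
# training (see the JSON transformer state); here they are assumed to already be numeric.
new_df = pd.DataFrame({"a": [8, 9], "b": [1, 2]})

preds = booster.predict(new_df)
print(preds)  # with the "binary" preset these are predicted probabilities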
