
Releases: ClimbsRocks/auto_ml

Ensembling, evolutionary algorithm hyperparameter search, dependency updates, and performance improvements

12 Sep 03:01

Ensembling's back for its alpha release, evolutionary algorithms now handle our hyperparameter search, we've handled a bunch of dependency updates, and we've made a bunch of smaller performance tweaks.

Support for Yandex's newly released CatBoost!

19 Jul 16:17
v2.4.1

v2.4.1 adds support for the newly released CatBoost.

Prediction intervals

14 Jul 06:09

Using quantile regression, we can now return prediction intervals.
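To make the idea concrete, here's a minimal sketch of interval prediction via quantile regression, written in plain scikit-learn rather than through auto_ml's API; the 5th/95th percentile bounds are an illustrative choice.

```python
# Minimal sketch: fit one quantile-loss GBM per bound, then read the
# prediction interval off their predictions. Plain scikit-learn is used
# here purely for illustration; the quantile choices are arbitrary.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.3, size=500)

# One model per quantile: lower bound, median, upper bound
models = {
    q: GradientBoostingRegressor(loss='quantile', alpha=q).fit(X, y)
    for q in (0.05, 0.5, 0.95)
}

X_new = np.array([[2.0], [5.0]])
lower = models[0.05].predict(X_new)
median = models[0.5].predict(X_new)
upper = models[0.95].predict(X_new)
print(list(zip(lower, median, upper)))  # (lower, point estimate, upper)
```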

Another minor change: feature_responses now includes a column of absolute changes.

Trains GBM and LightGBM iteratively, numpy fixes

09 Jul 21:04

LightGBM and sklearn's GBM now use warm starting (iterative training) to find the best number of trees.
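For intuition, here's a rough sketch of that iterative approach using sklearn's warm_start; the step size and early-stopping patience are illustrative values, not what auto_ml uses internally.

```python
# Sketch of iterative training: keep adding trees to the same model and
# stop once validation error stops improving. Step size and patience are
# illustrative.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

model = GradientBoostingRegressor(n_estimators=10, warm_start=True)
best_score, best_n, rounds_without_improvement = float('inf'), 10, 0

for n_trees in range(10, 510, 10):
    model.n_estimators = n_trees
    model.fit(X_train, y_train)  # warm_start=True: only the new trees get fit
    score = mean_squared_error(y_val, model.predict(X_val))
    if score < best_score:
        best_score, best_n, rounds_without_improvement = score, n_trees, 0
    else:
        rounds_without_improvement += 1
        if rounds_without_improvement >= 5:
            break

print('best number of trees:', best_n)
```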

Some minor bugfixes, and improvements to categorical_ensembling

13 Jun 03:23

Avoids double-training deep learning models, changes how we sort and order features for analytics reporting, and adds a new _all_small_categories category to categorical ensembling.
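Here's a minimal sketch of the bucketing idea behind _all_small_categories; the row-count threshold is an assumption for illustration, not auto_ml's internal default.

```python
# Pool rare category values into one shared bucket so each sub-model in a
# categorical ensemble has enough rows to train on. min_rows is illustrative.
import pandas as pd

def bucket_small_categories(df, column, min_rows=10):
    counts = df[column].value_counts()
    small = counts[counts < min_rows].index
    df = df.copy()
    df.loc[df[column].isin(small), column] = '_all_small_categories'
    return df

df = pd.DataFrame({'store': ['a'] * 50 + ['b'] * 3 + ['c'] * 2})
print(bucket_small_categories(df, 'store')['store'].value_counts())
# 'b' and 'c' get pooled into '_all_small_categories'
```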

Feature responses and silent alpha of predict_uncertainty

06 Jun 02:14

Feature responses allow linear-model-like interpretations for non-linear models.
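A minimal sketch of the idea, using a generic sklearn model: nudge one feature at a time and summarize how the predictions move. The delta size and column names here are illustrative; auto_ml's actual report may differ in its details.

```python
# Perturb each feature by one standard deviation and measure the average
# (signed and absolute) change in predictions: a linear-coefficient-like
# readout for a non-linear model. Names and the delta size are illustrative.
import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=4, random_state=0)
df = pd.DataFrame(X, columns=['f0', 'f1', 'f2', 'f3'])
model = RandomForestRegressor(random_state=0).fit(df, y)

baseline = model.predict(df)
rows = []
for col in df.columns:
    perturbed = df.copy()
    perturbed[col] += df[col].std()  # bump this feature by +1 std dev
    delta = model.predict(perturbed) - baseline
    rows.append({'feature': col,
                 'avg_change': delta.mean(),
                 'avg_abs_change': np.abs(delta).mean()})

print(pd.DataFrame(rows))
```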

Adds Keras and TensorFlow to requirements

18 May 01:01

  • Avoids mutating the input DF
  • Standardizes examples and tests to use load_ml_model()
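For reference, the round trip the examples now follow looks roughly like this; the toy data, column names, and file name are illustrative.

```python
# Train, save, then load a pipeline with load_ml_model() instead of
# retraining. Data and file name are made up for illustration.
import numpy as np
import pandas as pd
from auto_ml import Predictor
from auto_ml.utils_models import load_ml_model

rng = np.random.RandomState(0)
df_train = pd.DataFrame({'sqft': rng.uniform(500, 3000, 200)})
df_train['price'] = df_train['sqft'] * 100 + rng.normal(0, 5000, 200)

ml_predictor = Predictor(type_of_estimator='regressor',
                         column_descriptions={'price': 'output'})
ml_predictor.train(df_train)
ml_predictor.save('trained_pipeline.dill')

# No retraining needed: load the saved pipeline and predict
trained_model = load_ml_model('trained_pipeline.dill')
predictions = trained_model.predict(df_train)
```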

Feature learning works with user_input_func

03 May 00:33

Adds Feature Learning and Categorical Ensembling

19 Apr 06:00

Feature learning and categorical ensembling are really cool features that each get us 2-5% accuracy gains!

For full info, check the docs.
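For intuition, here's a conceptual sketch of what categorical ensembling does: train one sub-model per category value and route each row to its category's model at predict time. This illustrates the idea only, not auto_ml's implementation; the docs cover the real training entry point.

```python
# One sub-model per category value; rows are routed to their category's
# model at predict time. LinearRegression stands in for any base model.
import pandas as pd
from sklearn.linear_model import LinearRegression

def train_per_category(df, categorical_column, target_column):
    models = {}
    for category, group in df.groupby(categorical_column):
        X = group.drop(columns=[categorical_column, target_column])
        models[category] = LinearRegression().fit(X, group[target_column])
    return models

def predict_per_category(models, df, categorical_column):
    preds = pd.Series(index=df.index, dtype=float)
    for category, group in df.groupby(categorical_column):
        X = group.drop(columns=[categorical_column])
        preds[group.index] = models[category].predict(X)
    return preds

df = pd.DataFrame({'store': ['a', 'a', 'b', 'b'],
                   'sqft': [1.0, 2.0, 1.0, 2.0],
                   'sales': [10.0, 20.0, 30.0, 60.0]})
models = train_per_category(df, 'store', 'sales')
print(predict_per_category(models, df.drop(columns=['sales']), 'store'))
```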

2.0 Release to celebrate progress and code cleanup

04 Apr 01:43

Enough incremental improvements have added up that we're now ready to mark a 2.0 release!

Part of that progress means deprecating a few unused features that were adding unnecessary complexity and preventing us from properly implementing new features like ensembling.

New changes for the 2.0 release:

  • Refactored and cleaned up code. Ensembling should now be much easier to add in, and in a way that's fast enough to be used in production (getting predictions from 10 models should take less than 10x as long as getting predictions from 1 model)
  • Deprecated compute_power
  • Deprecated several methods for grid searching over transformation_pipeline hyperparameters (different methods for feature selection, whether or not to do feature scaling, etc.). We made a deliberate decision to prioritize the final model's hyperparameter search instead.
  • Deprecated the current implementation of ensembling. It was not quick enough to make predictions in prod, and thus did not meet the primary use case of this project. Removing it lets us reimplement ensembling in a way that is prod-ready.
  • Deprecated X_test and y_test, except for working with calibrate_final_model.
  • Added better documentation on features that were in silent alpha release previously.
  • Improved test coverage!

Major changes since the 1.0 release:

  • Integrations for deep learning (using TensorFlow and Keras)
  • Integration of Microsoft's LightGBM, which may prove to be a better version of XGBoost
  • Quite a bit more user logging, warning, and input validation/input cleaning
  • Quite a few edge case bug fixes and minor performance improvements
  • Fully automated test suite with decent test coverage!
  • Better documentation
  • Support for pandas DataFrames, which are much more space-efficient than lists of dictionaries