Add feature_clustering_selection method #209

brunoleme · 2022-08-24T05:15:43Z

Status

READY

Todo list

Documentation
Tests added and passed

Background context

This is a correlation-based feature selection method. Unlike the existing correlation_feature_selection, which has no criterion for choosing among correlated features, feature_clustering_selection first clusters the features, using absolute correlation as the distance metric, and then selects from each cluster the feature with the lowest 1-R2 metric. The 1-R2 metric identifies the feature that best preserves the information in the other features of its own cluster (own-cluster R2), penalized by the information it shares with the nearest cluster (nearest-cluster R2).
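The idea can be sketched roughly as below. This is not the code in this PR: the function names (`cluster_features`, `one_minus_r2_ratio`, `select_features`) and the `threshold` parameter are hypothetical, the clustering is a simple single-linkage grouping on absolute correlation, and the metric is assumed to be the ratio (1 - own-cluster R2) / (1 - nearest-cluster R2), with each R2 approximated by the mean squared correlation against the features of the relevant cluster.

```python
import numpy as np


def cluster_features(corr, threshold=0.7):
    """Group features linked by |correlation| >= threshold (single linkage).

    Returns a list of clusters, each a sorted list of column indices.
    """
    n = corr.shape[0]
    linked = np.abs(corr) >= threshold
    seen = [False] * n
    clusters = []
    for start in range(n):
        if seen[start]:
            continue
        # Depth-first search over the "strongly correlated" graph.
        stack, members = [start], []
        seen[start] = True
        while stack:
            i = stack.pop()
            members.append(i)
            for j in np.flatnonzero(linked[i]).tolist():
                if not seen[j]:
                    seen[j] = True
                    stack.append(j)
        clusters.append(sorted(members))
    return clusters


def one_minus_r2_ratio(corr, clusters):
    """Return {feature index: (1 - R2_own) / (1 - R2_nearest)}."""
    r2 = corr ** 2
    ratios = {}
    for c, members in enumerate(clusters):
        for i in members:
            others = [j for j in members if j != i]
            # Assumption: a singleton explains itself perfectly (R2_own = 1).
            r2_own = r2[i, others].mean() if others else 1.0
            # Nearest cluster: the other cluster with the highest mean R2 to i.
            r2_near = max(
                (r2[i, m].mean() for k, m in enumerate(clusters) if k != c),
                default=0.0,
            )
            ratios[i] = (1.0 - r2_own) / max(1.0 - r2_near, 1e-12)
    return ratios


def select_features(X, names, threshold=0.7):
    """Keep, from each cluster, the feature with the lowest 1-R2 ratio."""
    corr = np.corrcoef(X, rowvar=False)
    clusters = cluster_features(corr, threshold)
    ratios = one_minus_r2_ratio(corr, clusters)
    return [names[min(members, key=ratios.get)] for members in clusters]


# Toy data: three noisy copies of one signal plus one independent feature.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
X = np.column_stack([
    a,
    a + 0.1 * rng.normal(size=500),
    a + 0.5 * rng.normal(size=500),
    b,
])
names = ["a", "a_low_noise", "a_high_noise", "b"]
selected = select_features(X, names)
print(selected)
```

In this toy setup the three `a`-derived columns fall into one cluster and `b` forms its own, so one representative per cluster is kept. A low ratio favors features that are strongly tied to their own cluster and weakly tied to any other.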

Description of the changes proposed in the pull request

This pull request adds the feature selection method feature_clustering_selection to fklearn/tuning/model_agnostic_fc.py.

Where should the reviewer start?

The reviewer should start with the feature_clustering_selection method in src/fklearn/tuning/model_agnostic_fc.py.
The test test_feature_clustering_selection in fklearn/tests/tuning/test_model_agnostic_fc.py illustrates how the method is used.

+PEP8 code styling

brunoleme requested a review from a team as a code owner August 24, 2022 05:15

brunoleme added 4 commits August 24, 2022 10:07

Add feature_clustering_selection method

423bc9a

+PEP8 code styling

Add feature_clustering_selection method

e1caa5d

+PEP8 code styling

Add feature_clustering_selection method

1587a4e

+PEP8 code styling

Add feature_clustering_selection method

0c1bbf4

+PEP8 code styling
