Description
Filed from a conversation on the public SDV Slack. This is a lower-priority feature request.
Problem Description
The single-table ML Efficacy metrics use different ML algorithms from sklearn. For example, MulticlassMLPClassifier uses sklearn's MLPClassifier implementation under the hood.
The sklearn implementation has many tunable parameters, for example max_iter or activation. However, all of these parameters are currently hardcoded in SDMetrics, with no API to change them.
Expected behavior
I would expect the ability to change the parameters of the underlying sklearn model. Exact API may vary but one possibility is just to provide a dictionary:
from sdmetrics.single_table import MulticlassMLPClassifier
MulticlassMLPClassifier.compute(
test_data=real_data,
train_data=synthetic_data,
target='categorical_column_name',
metadata=metadata,
model_parameters={'max_iter': 500, 'activation': 'identity'}
)
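To make the proposal more concrete, here is a minimal sketch of how user-supplied parameters could be layered over the hardcoded defaults without mutating them. The helper name resolve_model_kwargs and the default value shown are hypothetical and only for illustration; they are not part of SDMetrics.

```python
from typing import Any, Dict, Mapping, Optional


def resolve_model_kwargs(
    defaults: Mapping[str, Any],
    model_parameters: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Return a new dict: the defaults, overridden by any user-supplied values."""
    merged = dict(defaults)  # copy, so the class-level defaults stay untouched
    if model_parameters:
        merged.update(model_parameters)
    return merged


# Illustrative default only; the real hardcoded values may differ.
hardcoded_defaults = {'max_iter': 50}

model_kwargs = resolve_model_kwargs(
    hardcoded_defaults,
    {'max_iter': 500, 'activation': 'identity'},
)
print(model_kwargs)
```

With this approach, the merged dictionary would be passed to the sklearn model constructor for that one call, and later calls without model_parameters would still see the original defaults.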
Workaround
In the meantime, a workaround is to override the hardcoded parameters on the class itself. Note that this changes the parameters for every later call that uses the class in the same process.
from sdmetrics.single_table import MulticlassMLPClassifier
# override the hardcoded parameters on the class itself
MulticlassMLPClassifier.MODEL_KWARGS = { 'max_iter': 500, 'activation': 'identity' }
# use the class to compute a score
MulticlassMLPClassifier.compute(
test_data=real_data,
train_data=synthetic_data,
target='categorical_column_name',
metadata=metadata
)
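Because the workaround changes class-level state, it may be useful to scope the override and restore the original values afterwards. Below is a minimal sketch of that idea using a context manager. The helper temporary_model_kwargs and the StandInMetric class are hypothetical; StandInMetric only mimics a metric class with a MODEL_KWARGS attribute so the example runs without SDMetrics installed.

```python
from contextlib import contextmanager


@contextmanager
def temporary_model_kwargs(metric_cls, **overrides):
    """Temporarily merge overrides into metric_cls.MODEL_KWARGS (hypothetical helper)."""
    original = metric_cls.MODEL_KWARGS
    metric_cls.MODEL_KWARGS = {**original, **overrides}
    try:
        yield metric_cls
    finally:
        # Restore the previous parameters even if the body raises.
        metric_cls.MODEL_KWARGS = original


class StandInMetric:
    """Stand-in for a metric class; the default value is illustrative only."""
    MODEL_KWARGS = {'max_iter': 50}


with temporary_model_kwargs(StandInMetric, max_iter=500, activation='identity') as metric:
    # A call such as metric.compute(...) would go here.
    print(metric.MODEL_KWARGS)

print(StandInMetric.MODEL_KWARGS)
```

In practice, MulticlassMLPClassifier would take the place of StandInMetric, assuming it exposes MODEL_KWARGS as in the workaround above.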