Support for sklearn Pipelines #171
Hey there @MyNameIsFu! I believe you can make this work using sklearn's set_output API:

import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from prince.mca import MCA

test_data = pd.DataFrame(data=np.random.random((10, 5)))
test = Pipeline(steps=[
    ("impute", SimpleImputer()),  # returns an ndarray by default, which breaks the pipeline
    ("mca", MCA()),
])
# Ask the imputer to return a DataFrame so that MCA receives labelled columns
test[0].set_output(transform="pandas")
test.fit_transform(test_data)

I hope this works for you!
I added a note to the FAQ on the documentation website. I'll close this :)
MCA currently cannot be used in a sklearn Pipeline that contains any preceding steps.
In my case, I need an imputer to fill NaN values.
Working Example:
But including a SimpleImputer results in a NumPy array being forwarded to the MCA:
I've tried including a dummy transformer step between the imputer and MCA that forwards an arbitrary DataFrame with generic index and column labels, but it results in a KeyError because unknown index labels are looked up in the column list:
Any suggestions?