Open
Description
Abstract
I have confirmed that the TargetEncoder#fit() method does not accept a pandas DataFrame.
How to reproduce
The following snippet reproduces the problem.
>>> import pandas as pd
>>> data = {
... "group": ["A", "A", "A"],
... "target": [1, 2, 3],
... }
>>> df = pd.DataFrame(data)
>>>
>>> import xfeat
>>> encoder = xfeat.TargetEncoder(input_cols=["group"], target_col="target")
>>>
>>> encoder.fit(df)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/amedama/.virtualenvs/py39/lib/python3.9/site-packages/xfeat/cat_encoder/_target_encoder.py", line 136, in fit
    target_encoder.fit(input_df[col], input_df[self._target_col])
  File "/Users/amedama/.virtualenvs/py39/lib/python3.9/site-packages/xfeat/cat_encoder/_target_encoder.py", line 220, in fit
    raise RuntimeError
RuntimeError
This behavior is probably not expected, because the TargetEncoder#fit_transform() method accepts a pandas DataFrame.
>>> encoder.fit_transform(df)
  group  target  group_te
0     A       1       2.5
1     A       2       2.0
2     A       3       1.5
What happens
There is an incomplete conditional branch in the TargetEncoder#fit() method.
https://github.com/pfnet-research/xfeat/blob/v0.1.1/xfeat/cat_encoder/_target_encoder.py#L214,L220
The branch above only handles cuDF and NumPy inputs.
A pandas Series therefore falls through and raises RuntimeError.
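To illustrate the kind of branch that seems to be missing, here is a minimal sketch of a type dispatch that also covers a pandas Series. It is not xfeat's actual code: `to_numpy_1d` is a hypothetical helper name, the cuDF case is left out so the example runs without a GPU, and raising `TypeError` with a message is only one possible choice.

```python
import numpy as np
import pandas as pd


def to_numpy_1d(values):
    """Hypothetical helper (not part of xfeat): normalize input to a NumPy array."""
    if isinstance(values, pd.Series):
        # The case this issue reports as unhandled: a pandas Series.
        return values.to_numpy()
    if isinstance(values, np.ndarray):
        # NumPy input is passed through unchanged.
        return values
    # Reject anything else with a descriptive message instead of a bare RuntimeError.
    raise TypeError(f"Unsupported input type: {type(values).__name__}")


df = pd.DataFrame({"group": ["A", "A", "A"], "target": [1, 2, 3]})
group = to_numpy_1d(df["group"])
target = to_numpy_1d(df["target"])
```

With a conversion like this in place, the columns selected by `input_df[col]` and `input_df[self._target_col]` would reach the existing NumPy code path instead of the final `raise`.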
Workaround
I could not find one.
Environment
$ python -V
Python 3.9.13
$ pip list | grep xfeat
xfeat 0.1.1