TargetEncoder#fit() could not accept Pandas DataFrame #11

@momijiame

Abstract

I have confirmed that the TargetEncoder#fit() method does not accept a pandas DataFrame.

How to reproduce

The following snippet reproduces the problem.

>>> import pandas as pd
>>> data = {
...     "group": ["A", "A", "A"],
...     "target": [1, 2, 3],
... }
>>> df = pd.DataFrame(data)
>>> 
>>> import xfeat
>>> encoder = xfeat.TargetEncoder(input_cols=["group"], target_col="target")
>>> 
>>> encoder.fit(df)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/amedama/.virtualenvs/py39/lib/python3.9/site-packages/xfeat/cat_encoder/_target_encoder.py", line 136, in fit
    target_encoder.fit(input_df[col], input_df[self._target_col])
  File "/Users/amedama/.virtualenvs/py39/lib/python3.9/site-packages/xfeat/cat_encoder/_target_encoder.py", line 220, in fit
    raise RuntimeError
RuntimeError

This behavior is probably unintended,
because the TargetEncoder#fit_transform() method accepts the same pandas DataFrame.

>>> encoder.fit_transform(df)
  group  target  group_te
0     A       1       2.5
1     A       2       2.0
2     A       3       1.5

What is happening

There is an incomplete conditional branch in the TargetEncoder#fit() method.

https://github.com/pfnet-research/xfeat/blob/v0.1.1/xfeat/cat_encoder/_target_encoder.py#L214,L220

The code above handles only cuDF and NumPy inputs.
Therefore, a RuntimeError is raised when a pandas Series is passed in.
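
To illustrate what a more complete branch might look like, here is a minimal sketch. It is not the actual xfeat code: the helper name `as_1d_array` is hypothetical, the cuDF case is left out, and it assumes that converting a pandas Series to a NumPy array is acceptable for the encoder.

```python
import numpy as np
import pandas as pd


def as_1d_array(x):
    """Hypothetical helper: normalize a supported input to a 1-D NumPy array."""
    if isinstance(x, pd.Series):
        # This is the case that currently falls through to RuntimeError.
        return x.to_numpy()
    if isinstance(x, np.ndarray):
        # Flatten column vectors such as shape (n, 1) to shape (n,).
        return x.ravel()
    # cuDF handling is intentionally omitted from this sketch.
    raise TypeError(f"Unsupported input type: {type(x).__name__}")


df = pd.DataFrame({"group": ["A", "A", "A"], "target": [1, 2, 3]})
groups = as_1d_array(df["group"])
targets = as_1d_array(df["target"])
```

Raising a TypeError with the offending type name, rather than a bare RuntimeError, would also make this kind of failure easier to diagnose.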

Workaround

I could not find one.

Environment

$ python -V
Python 3.9.13
$ pip list | grep xfeat
xfeat                         0.1.1
