One-hot encoding as a new alteration #676

Open
Seddryck opened this issue Feb 19, 2022 · 0 comments

This alteration takes the values of a column and creates a new column for each distinct value found in it. Each new column is set to 0, except for the one matching the row's value in the initial column, which is set to 1. The initial column is then removed. Each new column is named after the initial column, followed by an underscore and the value.

| Name | Country |
| --- | --- |
| John Doe | US |
| Jean Dupont | France |
| Jacques Martin | France |
| Bill Smith | US |
| Mario Rossi | Italy |
| Ashok Kumar | India |

is transformed into

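| Name | Country_US | Country_France | Country_Italy | Country_India |
| --- | --- | --- | --- | --- |
| John Doe | 1 | 0 | 0 | 0 |
| Jean Dupont | 0 | 1 | 0 | 0 |
| Jacques Martin | 0 | 1 | 0 | 0 |
| Bill Smith | 1 | 0 | 0 | 0 |
| Mario Rossi | 0 | 0 | 1 | 0 |
| Ashok Kumar | 0 | 0 | 0 | 1 |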
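
To make the expected behaviour concrete, here is a minimal sketch of the transformation in plain Python. It is only an illustration, not a proposed implementation for this project: the function name `one_hot_encode` and the list-of-dicts row representation are assumptions made for the example. It also assumes the new columns are ordered by first appearance of each value (as in the table above) and are appended after the remaining columns.

```python
def one_hot_encode(rows, column):
    """Replace `column` with one 0/1 column per distinct value.

    Hypothetical helper for illustration only: `rows` is a list of dicts,
    and new columns are ordered by first appearance of each value.
    """
    # dict.fromkeys keeps insertion order, so duplicates are dropped
    # while the first-seen order of the values is preserved.
    distinct = list(dict.fromkeys(row[column] for row in rows))

    encoded = []
    for row in rows:
        # Keep every column except the one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one column per distinct value: 1 on a match, 0 otherwise.
        for value in distinct:
            new_row[f"{column}_{value}"] = 1 if row[column] == value else 0
        encoded.append(new_row)
    return encoded


rows = [
    {"Name": "John Doe", "Country": "US"},
    {"Name": "Jean Dupont", "Country": "France"},
    {"Name": "Jacques Martin", "Country": "France"},
    {"Name": "Bill Smith", "Country": "US"},
    {"Name": "Mario Rossi", "Country": "Italy"},
    {"Name": "Ashok Kumar", "Country": "India"},
]

result = one_hot_encode(rows, "Country")
for encoded_row in result:
    print(encoded_row)
```

Note that the set of new columns depends on the data: two result sets with different distinct values would end up with different columns, which may matter when comparing them.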
@Seddryck Seddryck added this to the v2.1 milestone Feb 19, 2022