[Feature proposal] Dataframe merge by ID #690
Comments
Are you interested in
More of a As an edit: This functionality is exactly what I'd like https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.join.html
We add
I've got a few different dataframes that I'd like to merge when calculating a regression. Right now I do so by converting them to a matrix of doubles, aligning the rows by id, and then rebuilding a dataframe. Spark and pandas have utility methods that let you merge dataframes with a `by` option to specify which column is used to match the data.

Describe the solution you'd like
Either extend the merge method with a simple `by` option to specify the key to merge on, add a `mergeWith` method, or add a `MergeOptions` parameter that contains information such as `by` (the key to join on) and `mergeType` (inner vs. outer join, left vs. right join).

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html
https://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/Dataset.html
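To make the requested semantics concrete, here is a minimal sketch in plain Python of what a keyed merge could do. It is not this project's API: `merge_by`, its `by` and `how` parameters, and the list-of-dicts table representation are all hypothetical stand-ins for the proposed `by` / `mergeType` options (pandas calls the equivalent parameters `on` and `how`). It assumes every row contains the key column, and on a name clash between non-key columns the right-hand value simply wins.

```python
from typing import Any, Dict, Hashable, List

Row = Dict[str, Any]


def merge_by(left: List[Row], right: List[Row], by: str, how: str = "inner") -> List[Row]:
    """Hypothetical helper: join two tables (lists of row dicts) on column `by`.

    `how` is one of "inner", "left", "right", "outer". Cells that have no
    counterpart on the other side are filled with None.
    """
    if how not in {"inner", "left", "right", "outer"}:
        raise ValueError(f"unsupported merge type: {how!r}")

    # Collect output columns: left columns first, then right-only columns.
    left_cols = list(dict.fromkeys(c for row in left for c in row))
    right_cols = list(dict.fromkeys(c for row in right for c in row))
    columns = left_cols + [c for c in right_cols if c not in left_cols]

    def combine(lrow: Row, rrow: Row) -> Row:
        out: Row = dict.fromkeys(columns)  # every column starts as None
        out.update(lrow)
        out.update(rrow)  # simplification: right side wins on name clashes
        return out

    # Index the right table by key so each left row is matched via dict lookup.
    right_index: Dict[Hashable, List[Row]] = {}
    for rrow in right:
        right_index.setdefault(rrow[by], []).append(rrow)

    merged: List[Row] = []
    matched = set()
    for lrow in left:
        matches = right_index.get(lrow[by], [])
        if matches:
            matched.add(lrow[by])
            merged.extend(combine(lrow, rrow) for rrow in matches)
        elif how in ("left", "outer"):
            merged.append(combine(lrow, {}))  # keep unmatched left row

    if how in ("right", "outer"):
        for rrow in right:
            if rrow[by] not in matched:
                merged.append(combine({}, rrow))  # keep unmatched right row
    return merged


# Made-up example data: regression inputs and targets that share an "id" column.
features = [{"id": 1, "x": 0.5}, {"id": 2, "x": 1.5}, {"id": 3, "x": 2.5}]
targets = [{"id": 2, "y": 10.0}, {"id": 3, "y": 20.0}, {"id": 4, "y": 30.0}]

inner = merge_by(features, targets, by="id")               # only ids in both tables
outer = merge_by(features, targets, by="id", how="outer")  # all ids, None where missing
print(inner)
print(outer)
```

An inner merge keeps only the ids present in both tables, which is what the manual "align rows by id" workaround does today. The outer variant keeps every id and leaves the missing cells as `None`.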