Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #183 - Support Parsing of MultiPolygon in WKT Format #195

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

roblovelock
Copy link

I was unable to find a way to keep the current functionality backward compatible with Multi point WKT.

Because of this I have added a new "wktArray" function. This will parse all forms of WKT and output the result(s) in an array. It will then be up to the end user to explode the array to get a row per point.

the two lines below will output the same point data, however the second will support multipoint and therefore may create multiple rows.

wkt($"text")("point")

explode(wktArray($"text")("point"))

@codecov-io
Copy link

codecov-io commented Feb 12, 2018

Codecov Report

Merging #195 into master will increase coverage by 0.47%.
The diff coverage is 99.06%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #195      +/-   ##
==========================================
+ Coverage   89.57%   90.05%   +0.47%     
==========================================
  Files          48       48              
  Lines        1564     1639      +75     
  Branches      103      103              
==========================================
+ Hits         1401     1476      +75     
  Misses        163      163
Impacted Files Coverage Δ
src/main/scala/magellan/dsl/package.scala 87.5% <100%> (+0.83%) ⬆️
...che/spark/sql/catalyst/expressions/functions.scala 98.14% <100%> (+0.8%) ⬆️
src/main/scala/magellan/WKTParser.scala 98.41% <98.24%> (+1.04%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 58ebcd9...0f7a1b2. Read the comment docs.

@harsha2010
Copy link
Owner

harsha2010 commented Mar 5, 2018

@roblovelock thanks for working on this
I have a few questions:

  1. why do we need wktArray? as far as I can see, there is a few types here: Point, Polygon, LineString, MultiPoint, MultiLineString, MultiPolygon etc
    the WKT expression already handles Point, Polygon and Polyline,
    What you need to do is be able to also return Array[Point], Array[Polygon] in addition with schema something like StructField("multipoint", Array[PointUDT],...) etc

  2. I notice a function that takes Points and converts to wktArray. Why is this needed? i.e. why interpret a single Point as a Array of size 1?

Basically, wkt($text) should output a struct of the form [point, polygon, polyline, multipoint, multipolygon ] and it will be up to the user to pick the right data structure based on the expected type

I can go ahead and prototype how this looks like and create a PR against yours. Let me know!

@roblovelock
Copy link
Author

I agree.

The reason i introduced "wktArray" was that I had only briefly looked over the code and didn't want to break any backward compatibility.

it looked to me that the output from wkt was a single shape. therefor wktArray would do the same as wkt but always return an array of shapes (even if it was a single point) and I would leave wkt to only return a single shape.

@harsha2010
Copy link
Owner

@roblovelock got it.. yeah you can change the output from wkt to return anything as long as it matches the schema specified in the WKT expression. I'll refactor this PR a bit to make that change today and merge it in
This looks good overall, thanks for working on it!

@Liorba
Copy link

Liorba commented May 17, 2018

@harsha2010 is there is a plan to merge this PR? this is much anticipated addition :-)

@riyadparvez
Copy link

@harsha2010 is there any update on this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants