Create a GeoJSON processing pipeline #32
http://www.prism.oregonstate.edu/projects/public/phm/phm_us_shp.zip
This isn't working. The shapefiles use polygons, reasonably, so instead of having one file per point, we have one file per polygon. That's no good. It may be easier to go back to #31 and figure out how to get that projection working.
I'm trying SpatiaLite (the SQLite geodata extension), loading the shapefile directly into it. The plan is then to issue ~9M queries, one per coordinate pair, creating a GeoJSON record for each. The catch is that looking up the metadata for a single point takes about 1.5 seconds, which works out to some 23 weeks of queries. That could be parallelized, but it hardly seems worth the trouble. SpatiaLite supports spatial indices, but they're just not working for me; every attempt errors out.
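For reference, the pattern that's supposed to work looks roughly like the sketch below. This is my reconstruction of the standard SpatiaLite approach, not code from the thread; the table and column names (`conus`, `geom`) and the attribute columns are assumptions carried over from the later PostGIS attempt.

```sql
-- Build an R*Tree index over the geometry column (names are guesses):
SELECT CreateSpatialIndex('conus', 'geom');

-- Point lookups are then meant to route through the SpatialIndex
-- virtual table, which prunes candidate polygons cheaply before the
-- exact ST_Contains test runs:
SELECT gridcode, zone
FROM conus
WHERE ST_Contains(geom, ST_Transform(MakePoint(-78.2, 38.5, 4326), 4269))
  AND ROWID IN (
      SELECT ROWID
      FROM SpatialIndex
      WHERE f_table_name = 'conus'
        AND search_frame = ST_Transform(MakePoint(-78.2, 38.5, 4326), 4269));
```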
There are no useful Google results when searching for information about this error. I've tried creating the table it complains about, but then it demands particular columns, and presumably this would continue until I'd hand-built an adequate table? Anyway, that doesn't seem like a functioning approach.
I guess I could always go full AWS and use Postgres with the PostGIS extension, on RDS, to do this. It'd surely be faster. But I don't want that AWS lock-in.
I'm trying out Postgres/PostGIS, using these data-loading instructions. That's gone fine so far (with the surprise requirement that I have to run an extra command first).
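For context, the usual PostGIS loading flow looks something like the following. This is a sketch of the standard approach, not the linked instructions; the file, table, and database names are placeholders.

```sql
-- Enable PostGIS in the target database; on a stock Postgres install
-- this extra step is easy to miss.
CREATE EXTENSION postgis;

-- The shapefile itself is normally converted to SQL from the shell with
-- the shp2pgsql utility that ships with PostGIS, e.g.:
--   shp2pgsql -s 4269 -I phm_us_shp.shp conus | psql hardiness
-- (-s sets the SRID, -I builds a GiST spatial index; all names here
-- are hypothetical.)
```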
Ah-ha, this works:

```sql
SELECT gid, id, gridcode, zone
FROM conus
WHERE ST_Contains(
    geom,
    ST_Transform(
        ST_GeomFromText('POINT(-78.2 38.5)', 4326),
        4269)) = true;
```

This runs at 19–25 ms per query. That comes to 122 hours for 20 million queries, or just over 5 days, which seems pretty manageable. Spot-checking a couple of queries, the geometry appears to be fine, presumably thanks to the 4326-to-4269 conversion. I feel good about this.
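Given that query, each lookup could emit its GeoJSON record directly from Postgres in the same round trip. Here's a rough sketch, my extrapolation rather than anything from the thread; `json_build_object` and `json_build_array` are stock Postgres, and the property names are guesses based on the columns above.

```sql
-- Produce one GeoJSON Feature per coordinate pair, reusing the same
-- containment test. The point coordinates are placeholders for one of
-- the ~9M pairs the pipeline would iterate over.
SELECT json_build_object(
    'type',       'Feature',
    'geometry',   json_build_object(
        'type',        'Point',
        'coordinates', json_build_array(-78.2, 38.5)),
    'properties', json_build_object(
        'gridcode', gridcode,
        'zone',     zone)) AS feature
FROM conus
WHERE ST_Contains(
    geom,
    ST_Transform(ST_GeomFromText('POINT(-78.2 38.5)', 4326), 4269));
```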
Any progress?
No, this was going really badly, for reasons that I didn't document and cannot remember, though in my defense I was on some very powerful painkillers for the entire week, and it's kind of amazing that I wrote any useful code at all. Coincidentally, I was working on this again last night, returning to where I was with #31 before I closed it. I think the proper thing to do is to solicit help adjusting the projection that doomed that approach, rather than moving to an entirely different one. I'll have to set up some test data, document the problem, and tweet about it, in hopes that some clever geodata folks can propose a solution.
Any progress?