Large dataset timeout #53
Comments
I am also experiencing this issue. Any suggestions for a workaround?
I have had this problem with large datasets as well. In my case I don't need the whole dataset, so my workaround was to pass some additional parameters to query a subset of the data. You can query with a lat/lon bounding box (at least on some servers) if you use:
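For illustration only, here is a minimal sketch of what a bounding-box query against an ArcGIS REST layer can look like. It is not necessarily the exact set of parameters used above. The parameter names (`geometry`, `geometryType`, `inSR`, `spatialRel`) are the standard ArcGIS REST query parameters; the helper name `build_bbox_query` and the coordinates in the usage line are made up for the example.

```python
from urllib.parse import urlencode


def build_bbox_query(layer_url, xmin, ymin, xmax, ymax, wkid=4326):
    """Return a layer query URL restricted to a lon/lat bounding box.

    Hypothetical helper: it only builds the URL and sends nothing.
    """
    params = {
        "where": "1=1",                              # no attribute filter
        "geometry": f"{xmin},{ymin},{xmax},{ymax}",  # envelope as xmin,ymin,xmax,ymax
        "geometryType": "esriGeometryEnvelope",
        "inSR": wkid,                                # 4326 = WGS84 lon/lat
        "spatialRel": "esriSpatialRelIntersects",
        "outFields": "*",
        "f": "json",
    }
    return f"{layer_url.rstrip('/')}/query?{urlencode(params)}"


# Example bounding box; the coordinates are illustrative, not from this thread.
url = build_bbox_query(
    "https://gismaps.kingcounty.gov/arcgis/rest/services/Property/KingCo_PropertyInfo/MapServer/2",
    -122.5, 47.4, -122.2, 47.7,
)
```

Whether a given server honours the spatial filter depends on the layer's capabilities, so it is worth checking the feature count of a small box first.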
If you need the whole dataset, I suppose you could break it up into multiple subsets, though I imagine you'd also have to clean up duplicate features.
Edit: If the request uses something like
Just make sure you don't overwrite the first dataset you downloaded.
Finally, to get the files that were interrupted to work you'll need to edit them a little, but it's fairly simple to do in Python. First I use:
to read the last 1000 characters and check that the last feature is complete. It always is, but I just want to make sure. Then:
I've only tested this a couple of times, but it seems to work, and I haven't had to deal with cleaning duplicate data.
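The repair described above could be sketched roughly as follows. This is an assumption-laden illustration, not the commenter's original code: it assumes the interrupted file is a GeoJSON FeatureCollection whose last feature was written completely and which is only missing the closing `]}`. The function names are hypothetical, and the whole file is read into memory, which may be too much for very large downloads.

```python
import json


def close_feature_collection(text):
    """Return text with a missing FeatureCollection terminator added.

    Assumes the only damage is a missing closing "]}" (and possibly a
    trailing comma after the last complete feature).
    """
    try:
        json.loads(text)
        return text  # already valid JSON, nothing to repair
    except json.JSONDecodeError:
        pass
    body = text.rstrip().rstrip(",").rstrip()
    repaired = body + "\n]}"
    json.loads(repaired)  # raises if the last feature was cut off mid-way
    return repaired


def repair_file(src, dst, tail_chars=1000):
    """Print the tail of src for a visual check, then write a repaired copy."""
    with open(src, encoding="utf-8") as f:
        text = f.read()
    print(text[-tail_chars:])  # eyeball that the last feature is complete
    with open(dst, "w", encoding="utf-8") as f:
        f.write(close_feature_collection(text))
```

Writing to a separate `dst` path keeps the interrupted download untouched in case the repair assumption turns out to be wrong.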
@jacksonvoelkel If the layer has an ID field, I've found that forcing esri2geojson to query by ID range avoids timeouts. You can now do this with
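To make the idea of querying by ID range concrete, here is a small sketch of how a range of object IDs can be split into `where` clauses, one per request. It is not esri2geojson's own implementation; the field name `OBJECTID`, the chunk size, and the function name are assumptions for the example.

```python
def oid_range_clauses(min_oid, max_oid, chunk_size=1000, oid_field="OBJECTID"):
    """Split [min_oid, max_oid] into inclusive ID ranges, one where clause each."""
    clauses = []
    for start in range(min_oid, max_oid + 1, chunk_size):
        end = min(start + chunk_size - 1, max_oid)
        clauses.append(f"{oid_field} >= {start} AND {oid_field} <= {end}")
    return clauses


clauses = oid_range_clauses(1, 2500, chunk_size=1000)
```

Because the ranges are inclusive and do not overlap, each feature falls into exactly one request, so there are no duplicates to clean up afterwards.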
Thanks for the tips!
When downloading a very large dataset, esri2geojson encounters this issue:
./esri2geojson https://gismaps.kingcounty.gov/arcgis/rest/services/Property/KingCo_PropertyInfo/MapServer/2 asdf.geojson
2018-05-22 12:52:15,082 - cli.esridump - INFO - Built 615 requests using OID where clause method
Traceback (most recent call last):
  File "./esri2geojson", line 11, in <module>
    sys.exit(main())
  File "/home/<username>/esridump/local/lib/python2.7/site-packages/esridump/cli.py", line 111, in main
    feature = next(feature_iter)
  File "/home/<username>/esridump/local/lib/python2.7/site-packages/esridump/dumper.py", line 425, in __iter__
    raise EsriDownloadError("Could not connect to URL", e)
esridump.errors.EsriDownloadError: ('Could not connect to URL', EsriDownloadError('https://gismaps.kingcounty.gov/arcgis/rest/services/Property/KingCo_PropertyInfo/MapServer/2/query: Could not retrieve this chunk of objects HTTP 500 <html><head><title>Apache Tomcat/7.0.57 - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head><body><h1>HTTP Status 500 - </h1><HR size="1" noshade="noshade"><p><b>type</b> Exception report</p><p><b>message</b> <u></u></p><p><b>description</b> <u>The server encountered an internal error that prevented it from fulfilling this request.</u></p><p><b>exception</b> <pre>java.lang.NullPointerException\n</pre></p><p><b>note</b> <u>The full stack trace of the root cause is available in the Apache Tomcat/7.0.57 logs.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/7.0.57</h3></body></html>',))
Multiple runs of the same command produce a file between 10 MB and 600 MB, depending on when the connection is lost. I think it would be very beneficial for esri2geojson not to exit on this error, but to continue down the queue of batches to download.
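As a rough sketch of the behaviour being requested (not a patch against esridump), a download loop could retry a failing batch a few times, record it if it keeps failing, and move on to the next one. Everything here is hypothetical: `download_all`, `fetch`, and `write` are placeholder names, and the broad `except Exception` stands in for whatever error the real fetch raises (in the traceback above, `EsriDownloadError`).

```python
import logging
import time

logger = logging.getLogger("batch_download")


def download_all(batches, fetch, write, retries=3, delay=1.0):
    """Fetch every batch, skipping those that still fail after retries.

    fetch(batch) returns an iterable of features; write(feature) stores one.
    Returns the list of batches that could not be downloaded.
    """
    failed = []
    for batch in batches:
        for attempt in range(1, retries + 1):
            try:
                # Materialise first so a mid-batch error writes nothing partial.
                features = list(fetch(batch))
                break
            except Exception as exc:  # placeholder for the real download error
                logger.warning("batch %r failed (attempt %d/%d): %s",
                               batch, attempt, retries, exc)
                time.sleep(delay * attempt)  # simple linear backoff
        else:
            failed.append(batch)  # give up on this batch, keep going
            continue
        for feature in features:
            write(feature)
    return failed
```

Returning the failed batches lets the caller re-run only those requests later instead of starting the whole download again.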