For the Wuhan, China data source (I seem to be getting all the legacy servers lately), pyesridump was only pulling 1001 records when there were 3429 in the full data set.
From the subdivision logic in esridump/dumper.py (line 244 at 5823178), it looks like the bounding box is recursively subdivided into four quadrants (quadtree-style) when the source doesn't support returnCountOnly, with a stopping condition when a quadrant returns fewer than max_records features.
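As a minimal illustration of that fallback (the query_features callable and the function body here are assumptions for the sketch, not the actual dumper code), with the stopping condition written the way the current code checks it:

```python
# Illustrative sketch of the quadtree-style fallback, not the actual
# pyesridump implementation: query an envelope, and if the server hands
# back a full page, split the envelope into four quadrants and recurse.
def scrape_envelope(query_features, envelope, max_records):
    features = query_features(envelope)
    if len(features) == max_records:
        # Assume the quadrant was truncated at the server's record limit,
        # so subdivide it and fetch each child quadrant instead.
        xmin, ymin, xmax, ymax = envelope
        xmid, ymid = (xmin + xmax) / 2.0, (ymin + ymax) / 2.0
        quadrants = [
            (xmin, ymin, xmid, ymid),  # lower-left
            (xmid, ymin, xmax, ymid),  # lower-right
            (xmin, ymid, xmid, ymax),  # upper-left
            (xmid, ymid, xmax, ymax),  # upper-right
        ]
        features = []
        for quad in quadrants:
            features.extend(scrape_envelope(query_features, quad, max_records))
    return features
```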
This should generally retrieve everything, except for the following test in _scrape_an_envelope:
```python
if len(features) == max_records:
```
It appears the Wuhan source returns 1001 records where max_records is 1000, which takes the same code path as if it had returned 999 results, i.e. it assumes the base case has been met and returns early. This could be fixed by changing the conditional to:
```python
if len(features) >= max_records:
```
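To make the failure mode concrete, a small worked example using the Wuhan numbers from above (the 1000-record limit and the 1001-feature response):

```python
max_records = 1000
features = list(range(1001))  # Wuhan returns 1001 features for the full extent

# Current check: 1001 == 1000 is False, so the code treats the envelope as
# complete, returns early, and silently drops the remaining features.
print(len(features) == max_records)  # False

# Proposed check: 1001 >= 1000 is True, so the envelope is subdivided and
# the child quadrants are queried for the rest of the data.
print(len(features) >= max_records)  # True
```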
With the new OID enumeration from #33 (Faster method for object ID enumeration for sources that do not support pagination), it might make sense to use the quadrant-based method as a fallback only if the source supports neither returnCountOnly nor returnIdsOnly; otherwise OID enumeration should require fewer queries. Does that make sense, or are there other edge cases to consider?
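A rough sketch of that proposed fallback order (the boolean arguments stand in for whatever capability checks the dumper already performs; the function name is hypothetical, not part of pyesridump):

```python
# Hypothetical strategy selection, not the actual pyesridump API: prefer
# pagination, then OID enumeration (#33), and only fall back to the
# quadrant-based envelope subdivision when neither is available.
def choose_scrape_strategy(supports_pagination, supports_count_only, supports_ids_only):
    if supports_pagination:
        return "paginated_query"    # resultOffset / resultRecordCount
    if supports_count_only or supports_ids_only:
        return "oid_enumeration"    # generally fewer queries than subdividing
    return "envelope_quadtree"      # last resort for very old servers
```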