Don't call all records deleted if record count is not set #4
Conversation
Some DBF files are written without a record count, because, say, the SHP file or SHX file already gives one. In that case, checking whether the shape index is >= the number of records always returns "deleted", which is not correct.
@@ -1744,7 +1744,7 @@ int SHPAPI_CALL DBFIsRecordDeleted( DBFHandle psDBF, int iShape )
 /* -------------------------------------------------------------------- */
 /*      Verify selection.                                               */
 /* -------------------------------------------------------------------- */
-    if( iShape < 0 || iShape >= psDBF->nRecords )
+    if( iShape < 0 || (iShape > 0 && iShape >= psDBF->nRecords) )
Shouldn't that be the following? (I changed the iShape > 0 to psDBF->nRecords > 0 since that better matches your rationale, and added a comment.)
-    if( iShape < 0 || (iShape > 0 && iShape >= psDBF->nRecords) )
+    /* Some DBF files are written without a record count (because, say */
+    /* the SHP file or SHX file already give a record count. In that   */
+    /* case, checking the shape index being > number of records always */
+    /* returns deleted, which is not correct.                          */
+    if( iShape < 0 || ( psDBF->nRecords > 0 && iShape >= psDBF->nRecords) )
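A minimal sketch of how the suggested guard behaves. The DBFInfo struct and function name here are illustrative stand-ins, not the actual shapelib types: the point is that when nRecords is 0 (record count missing from the header), no non-negative index is rejected, while a normal file still gets proper bounds checking.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant field of shapelib's DBFHandle. */
typedef struct { int nRecords; } DBFInfo;

/* Sketch of the suggested check: reject negative indices always, but
 * only apply the upper bound when a record count is actually present. */
static bool record_index_invalid(const DBFInfo *psDBF, int iShape)
{
    return iShape < 0 ||
           (psDBF->nRecords > 0 && iShape >= psDBF->nRecords);
}
```

With nRecords == 0 every non-negative index passes, which is the behavior the PR description asks for; with nRecords > 0 the original bounds check is unchanged.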
I didn't check the DBF spec, but I don't think a DBF header with nRecords == 0 would be valid?
And as far as I can see, DBFReadAttribute() would have the same issue, as would quite a few other places. How well tested is your proposed change? ;-)
Ha ha. Well, in theory it's been tested widely in PostGIS forever, but I only had to make the one change to pass regression... and your change is correct, the comment is right and my code is not. I think our test files are so simple we aren't even seeing the DBFReadAttribute case because we have no attributes in our DBF files.
Is there a point in trying to support such broken DBF files, unless you see them in wide use in the wild? I don't recall any issues reported in GDAL regarding such cases.
It's possible we are generating them ourselves... leave this PR open, I will check tomorrow... somehow we got these shape files in the very early days and added them to our regression suite... https://github.com/postgis/postgis/tree/master/regress/loader
Either we generated them ourselves with early versions of shapelib or maybe created them with ArcView 3.
Maybe regenerate them with a recent ogr2ogr?
So, doing some testing with our regression files... I loaded up the point file, which looks like this:
So, a table with only a point column. Then dumped it, both with ogr2ogr and with pgsql2shp. The DBF file from ogr2ogr looks like this:
It looks like ogr2ogr adds in a FID column during the dump, so ogrinfo returns this:
The DBF file from pgsql2shp on the other hand:
So, this is generated using shapelib, naturally, and it's got the record count all zeroed out. I'm guessing that this is a result of the dumper not trying to write any tuples because there aren't any fields. Or maybe writing tuples but with no fields added. Have to confirm still.
Implication being, we've been generating DBF files with pgsql2shp that have a zero record count in the case of geometry-only tables, and doing so for 18 years... so there are definitely some in the wild. Hm.
OK, so if we want to support such files in shapelib (and then in GDAL), I would suggest a different approach that will be less invasive than patching all sites where we look for num_records. If the num_records header is read to be 0 at file opening, then compute its value from the file size and record size.
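A sketch of that suggestion, under stated assumptions: the function name and parameters are illustrative (not actual shapelib API), and the header_length / record_length values are the ones a DBF header stores at bytes 8-9 and 10-11 respectively.

```c
#include <stdint.h>

/* Sketch of the suggested fallback: if the DBF header's record count
 * reads as zero at open time, derive it from the file size instead.
 * Names here are illustrative, not the real shapelib struct members. */
static int infer_record_count(int64_t file_size,
                              int header_length,   /* size of header + field descriptors */
                              int record_length)   /* bytes per data record */
{
    if (record_length <= 0 || file_size <= header_length)
        return 0;
    /* Every data record occupies record_length bytes after the header. */
    return (int)((file_size - header_length) / record_length);
}
```

This confines the fix to the open path, so DBFIsRecordDeleted(), DBFReadAttribute(), and the other call sites can keep their existing bounds checks untouched.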
The size of the DBF file seems invariant to the number of rows... Looking into the dumper, it only calls
I don't get this. Normally file_size (roughly) = fixed_header_size + number_of_fields * field_description_size + number_of_rows * record_size
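The rough formula above can be written out as a small helper. The constants are my assumption of typical dBASE III layout (32-byte fixed header, 32 bytes per field descriptor, plus a 1-byte terminator after the descriptors), not values taken from this thread.

```c
/* Rough DBF size estimate per the formula in the comment above.
 * Constants assume a dBASE III-style layout (assumption, not from
 * the thread): 32-byte fixed header, 32-byte field descriptors,
 * and a 1-byte 0x0D terminator after the descriptor list. */
static long dbf_expected_size(int n_fields, int n_rows, int record_size)
{
    const int fixed_header = 32;
    const int field_descr  = 32;
    const int terminator   = 1;
    return fixed_header + (long)n_fields * field_descr + terminator
           + (long)n_rows * record_size;
}
```

If the observed file size does not grow with n_rows, the dumper is presumably never writing the per-row records at all, which matches the zero record count seen above.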
In that case, OGR creates a dummy "FID" field so that there's at least one field
Closing this stalled PR