Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[5.5.3]: HDF5 reader does not account for collection header size when reading GlobalHeap #1165

Open
1 task done
sunnyguan opened this issue Apr 4, 2023 · 3 comments
Open
1 task done
Labels
bug Something isn't working

Comments

@sunnyguan
Copy link

sunnyguan commented Apr 4, 2023

Versions impacted by the bug

v5.x

What went wrong?

We noticed that when reading String labels from HDF5 file using default H5iospNew, one of the labels had an incorrect value of Bad HeapObject.dataSize=id=16, refCount=0, dataSize=503533375848452, dataPos=116958. After looking into the code, the issue seems to be in ucar.nc2.internal.iosp.hdf5.H5objects.GlobalHeap (some code has been removed for clarity):

String magic = getRandomAccessFile().readString(4);
// ...
version = getRandomAccessFile().readByte();
getRandomAccessFile().skipBytes(3);
sizeBytes = getRandomAccessFile().readInt();
// ...

getRandomAccessFile().skipBytes(4); // pad to 8
int count = 0;
int countBytes = 0;

while (true) {
    o.id = getRandomAccessFile().readShort();
    // ...
    countBytes += dsize + 16;
    // ...
    hos.put(o.id, o);
}

countBytes does not account for the 16 bytes of the header that we already read in. As a result, it reads an extra global heap object past the end of the heap. Unfortunately, if the extra object (probably just garbage?) has an id that happens to be an id of an existing valid object, the entry will be overwritten in the map. There is an extra check in another piece of code that turns this invalid object into "Bad HeapObject.dataSize..." because it's o.dataSize is too large (as it's just random bytes), which is what we see in the returned String label.

Our current way around this is to initialize int countBytes = 16 instead of 0, and it seems to have fixed the issue. It might take me some time to create an example file because this happens pretty rarely, but let me know if this is the correct approach.

Relevant stack trace

No response

Relevant log messages

No response

If you have an example file that you can share, please attach it to this issue.

If so, may we include it in our test datasets to help ensure the bug does not return once fixed?
Note: the test datasets are publicly accessible without restriction.

N/A

Code of Conduct

  • I agree to follow the UCAR/Unidata Code of Conduct
@sunnyguan sunnyguan added the bug Something isn't working label Apr 4, 2023
@JohnLCaron
Copy link
Collaborator

Hi Sonny, do you have an example file where this problem happens? Without it, its hard to judge why this happens sometimes and not others. thanks.

@JohnLCaron
Copy link
Collaborator

Also, what version of the code / branch are you looking at? It doesnt seem to match what I have from github?

@sunnyguan
Copy link
Author

Ah, seems like I was using Intellij's jar inspect instead of the actual source code, I've updated to match the code on GitHub. I'll try to create an example file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants