-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UntilPageHeaderReader may capture a position where it's not actually a beginning of a page #15
Comments
Do you have any real world data that contains Generally the point of a magic is that it's, well, magic and the sequence that allows you to identify the start of a stream. |
Is your proposal to silently ignore hash mismatches and continue the search? What if there is a legitimate hash mismatch then? I think it'd be better to error in that instance than to silently omit a packet and mess up higher layers :). |
One of the most simple example is when the comment header contains a text "OggS". I agree that finding the pattern in audio packets is rare, probably almost never happen tho. |
My proposal in my brain is to silently skip it. If the skipped packet was within a valid packet position but the page simply contains an error, then we can detect it for the next time we captured a valid packet, since we will see a jump in "page sequence no". |
My concern is about seeking. When parsing pages sequentially, maybe we don't have to care about it so much. But when capturing a page right after seeking, because of the misdetection we may lose multiple pages covered by the first "false page". This may cause a problem when binary searching, and that's why I am worrying about it. Do you have any idea to handle them both? Or am I thinking too much? |
Hmm that's a good point. During normal operation it's not an issue I guess, mainly during seeking.
Nono, I think you are onto a good solution here :). It would be great if for the search of a page header you could specify whether you want pages with hash mismatches to be discarded or not, so that seeking ignores invalid hashes and normal operation when pages strictly follow each other would yield errors at a hash mismatch. I think such a change would be an improvement. Not sure if checking the sequence number is needed. PRs welcome! |
Although extremely rare, the
OggS
magic pattern may occur in the places where a page does not start. CurrentUntilPageHeaderReader
does not check for the integrity it captured, so it may capture a wrong page span and then return falseHashMismatch
error.I have an idea of fixing this issue and currently working on it. It will require a large refactoring, and possibly cause breaking changes.
The text was updated successfully, but these errors were encountered: