UntilPageHeaderReader may capture a position where it's not actually a beginning of a page #15

TonalidadeHidrica · 2020-11-15T06:55:36Z

Although extremely rare, the OggS magic pattern may occur in the places where a page does not start. Current UntilPageHeaderReader does not check for the integrity it captured, so it may capture a wrong page span and then return false HashMismatch error.

I have an idea of fixing this issue and currently working on it. It will require a large refactoring, and possibly cause breaking changes.

The text was updated successfully, but these errors were encountered:

est31 · 2020-11-15T07:07:54Z

Do you have any real world data that contains OggS? Any real world use case?

Generally the point of a magic is that it's, well, magic and the sequence that allows you to identify the start of a stream.

est31 · 2020-11-15T07:13:33Z

Is your proposal to silently ignore hash mismatches and continue the search? What if there is a legitimate hash mismatch then? I think it'd be better to error in that instance than to silently omit a packet and mess up higher layers :).

TonalidadeHidrica · 2020-11-15T07:14:18Z

One of the most simple example is when the comment header contains a text "OggS". I agree that finding the pattern in audio packets is rare, probably almost never happen tho.

TonalidadeHidrica · 2020-11-15T07:18:16Z

My proposal in my brain is to silently skip it. If the skipped packet was within a valid packet position but the page simply contains an error, then we can detect it for the next time we captured a valid packet, since we will see a jump in "page sequence no".

TonalidadeHidrica · 2020-11-15T07:24:10Z

My concern is about seeking. When parsing pages sequentially, maybe we don't have to care about it so much. But when capturing a page right after seeking, because of the misdetection we may lose multiple pages covered by the first "false page". This may cause a problem when binary searching, and that's why I am worrying about it. Do you have any idea to handle them both? Or am I thinking too much?

est31 · 2020-11-15T07:29:51Z

One of the most simple example is when the comment header contains a text "OggS".

Hmm that's a good point. During normal operation it's not an issue I guess, mainly during seeking.

Do you have any idea to handle them both? Or am I thinking too much?

Nono, I think you are onto a good solution here :). It would be great if for the search of a page header you could specify whether you want pages with hash mismatches to be discarded or not, so that seeking ignores invalid hashes and normal operation when pages strictly follow each other would yield errors at a hash mismatch. I think such a change would be an improvement. Not sure if checking the sequence number is needed. PRs welcome!

TonalidadeHidrica mentioned this issue Feb 14, 2021

Improve seek #22

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

UntilPageHeaderReader may capture a position where it's not actually a beginning of a page #15

UntilPageHeaderReader may capture a position where it's not actually a beginning of a page #15

TonalidadeHidrica commented Nov 15, 2020

est31 commented Nov 15, 2020

est31 commented Nov 15, 2020

TonalidadeHidrica commented Nov 15, 2020

TonalidadeHidrica commented Nov 15, 2020

TonalidadeHidrica commented Nov 15, 2020

est31 commented Nov 15, 2020

UntilPageHeaderReader may capture a position where it's not actually a beginning of a page #15

UntilPageHeaderReader may capture a position where it's not actually a beginning of a page #15

Comments

TonalidadeHidrica commented Nov 15, 2020

est31 commented Nov 15, 2020

est31 commented Nov 15, 2020

TonalidadeHidrica commented Nov 15, 2020

TonalidadeHidrica commented Nov 15, 2020

TonalidadeHidrica commented Nov 15, 2020

est31 commented Nov 15, 2020