-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: avoid panic when reading corrupt index in TSM file #24033
base: gw-master-1.x-24033
Are you sure you want to change the base?
Conversation
Added two checks for indexOfsStart,and count against corruption, to avoid influxdb exit with panic. on start. |
tsdb/engine/tsm1/reader.go
Outdated
@@ -1233,6 +1233,10 @@ func (d *indirectIndex) UnmarshalBinary(b []byte) error { | |||
return fmt.Errorf("indirectIndex: not enough data for index entries count") | |||
} | |||
count := int32(binary.BigEndian.Uint16(b[i : i+indexCountSize])) | |||
|
|||
if count <= 0 { | |||
return fmt.Errorf("indirectIndex: the count should have >0 value") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a better message might include the value that was passed in; something like:
fmt.Errorf("invalid index %d; count must be greater than 0", count)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done , replaced with your suggested one .
Thank you @ayang64
if count <= 0 { | ||
return fmt.Errorf("invalid index %d; count must be greater than 0", count) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no way for count to be less than 0. All possible values of a uint16 will fit in an int32 and since uint16 is only positive, count will never be less than 0.
count
could be 0, though, and we definitely don't want to continue further in this case. It's possible count == 0
is valid, but having logged is good for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fundamental solution for this case which I have considered was to add CRC for whole Index block as well, as it is done for other parts. But the solution obviously will break backward compatibility.
Are there any mechanism allowing to have support of different incompatible versions of *tsm file, with special hangling?
Another thing I have taken account during implementation is the following check in writer code tsdb/engine/tsm1/writer.go: 708 line , which skips count == 0 cases.
func (t *tsmWriter) WriteIndex() error {
indexPos := t.n
if t.index.KeyCount() == 0 {
return ErrNoValues
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is unlikely we will introduce an incompatible TSM format at this point in the product's lifecycle.
@@ -1350,7 +1354,7 @@ func (m *mmapAccessor) init() (*indirectIndex, error) { | |||
|
|||
indexOfsPos := len(m.b) - 8 | |||
indexStart := binary.BigEndian.Uint64(m.b[indexOfsPos : indexOfsPos+8]) | |||
if indexStart >= uint64(indexOfsPos) { | |||
if indexStart >= uint64(indexOfsPos) || (uint64(indexOfsPos)-indexStart) > math.MaxInt32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This added check ensures that index can be accessed using signed 32-bit offsets. Since int32
index offsets are used internally (see indirectIndex.UnmarshalBinary
), I think this is a correct check. I would prefer a distinct error message for this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does look like the correct check. The error message (all the error messages in this method) should include the file name, which we have through m.f.Name()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the most effective way to wrap the file name onto error massage is the line 240 in tsdb/engine/tsm1/reader.go.
What go you think @davidby-influx ?
index, err := t.accessor.init()
if err != nil {
**_return nil, err_**
}
@gggevorgyan I would like to merge this PR and for you to get credit for it, but we have some checks on commit messages that are failing. InfluxData requires that all commit messages meet semantic commit guidelines (https://www.conventionalcommits.org/en/v1.0.0/). Starting the commit message with |
9be14a8
to
5ee614f
Compare
…n reading a corrupt file) Added check on indexOfsStart, to make index slice length fit in int32, and count should have >0 value otherwise it is corruption. - [ ] CHANGELOG.md updated - [*] Rebased/mergable - [*] Tests pass - [*] Sign [CLA](https://influxdata.com/community/cla/) (if not already signed)
Thank you @gwossum, |
… corrupt file)
Added check on indexOfsStart, to make index slice length fit in int32, and count should have >0 value otherwise it is corruption.
CHANGELOG.md updated
[*] Rebased/mergable
[*] Tests pass
[*] Sign CLA (if not already signed)
Closes #
Required checklist
/etc
folder andNewDemoConfig
methods) (influxdb and plutonium)Description
1-3 sentences describing the PR (or link to well written issue)
Context
Why was this added? What value does it add? What are risks/best practices?
Affected areas (delete section if not relevant):
List of user-visible changes. As a user, what would I need to see in docs?
Examples:
CLI commands, subcommands, and flags
API changes
Configuration (sample config blocks)
Severity (delete section if not relevant)
i.e., ("recommend to upgrade immediately", "upgrade at your leisure", etc.)
Note for reviewers:
Check the semantic commit type: