Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raft: add entryID and logSlice types #145

Merged
merged 9 commits into from
Feb 5, 2024

Conversation

pav-kv
Copy link
Contributor

@pav-kv pav-kv commented Jan 30, 2024

This PR introduces type-safe wrappers for log entry IDs and raft log slices. We will use them for a more readable and safe code. Some usage is introduced in this PR, see individual commits.

A lot of code in this repo manipulates (term, index) tuples as separate raw ints. Likewise, entry slices are handled without association with the leader term. Getting this usage wrong can be costly, since both entry IDs and log slices are centrepiece of raft safety.

Related to: #142, #144

types.go Show resolved Hide resolved
@pav-kv
Copy link
Contributor Author

pav-kv commented Jan 30, 2024

@ahrtr @nvanbenschoten This addresses one of the feedback points from #139 review.

PTAL. Commit-by-commit review might be the most convenient.

@pav-kv pav-kv requested a review from ahrtr January 30, 2024 14:19
types.go Outdated Show resolved Hide resolved
types.go Outdated Show resolved Hide resolved
types.go Outdated Show resolved Hide resolved
types.go Show resolved Hide resolved
types.go Outdated Show resolved Hide resolved
raft.go Outdated Show resolved Hide resolved
types.go Show resolved Hide resolved
log.go Outdated Show resolved Hide resolved
@pav-kv
Copy link
Contributor Author

pav-kv commented Jan 30, 2024

@nvanbenschoten Addressed your comments, thank you. PTAL.

types.go Outdated Show resolved Hide resolved
@pav-kv
Copy link
Contributor Author

pav-kv commented Jan 31, 2024

@nvanbenschoten @ahrtr Things start stacking up on top of this PR. Please let me know if there is major objection to this clean-up. If not, I would like to merge and iterate/rebase.

log.go Show resolved Hide resolved
Copy link
Contributor

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@pav-kv
Copy link
Contributor Author

pav-kv commented Feb 1, 2024

Thanks!
@ahrtr Could you take a final pass?

@pav-kv pav-kv requested a review from serathius February 2, 2024 12:51
@pav-kv pav-kv force-pushed the add-entry-append-types branch 6 times, most recently from 038ad09 to e23983f Compare February 4, 2024 02:51
@pav-kv pav-kv changed the title raft: add entryID and logAppend structs raft: add entryID and logSlice types Feb 4, 2024
@pav-kv pav-kv force-pushed the add-entry-append-types branch 2 times, most recently from 5e4ca8e to 42a3fef Compare February 4, 2024 02:59
@pav-kv
Copy link
Contributor Author

pav-kv commented Feb 4, 2024

@nvanbenschoten I've changed this PR a little since the last review. Mainly, I reworked the logSlice struct, and improved commenting for it. Otherwise the change is equivalent.

@ahrtr @serathius Could you please take a final look at this PR, and merge if it looks good? Commit-by-commit review can be most convenient. I have follow-up work that propagates safe struct usage in other places (which already helped uncover bugs in tests), but I am not including them in this PR to keep it small and simple. Thank you.

types.go Show resolved Hide resolved
types.go Outdated Show resolved Hide resolved
Comment on lines +69 to +74
type logSlice struct {
// term is the leader term containing the given entries in its log.
term uint64
// prev is the ID of the entry immediately preceding the entries.
prev entryID
// entries contains the consecutive entries representing this slice.
entries []pb.Entry
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking it might be better to replace the following fields in Message with this struct.

raft/raftpb/raft.pb.go

Lines 402 to 415 in 2f10997

Term uint64 `protobuf:"varint,4,opt,name=term" json:"term"`
// logTerm is generally used for appending Raft logs to followers. For example,
// (type=MsgApp,index=100,logTerm=5) means the leader appends entries starting
// at index=101, and the term of the entry at index 100 is 5.
// (type=MsgAppResp,reject=true,index=100,logTerm=5) means follower rejects some
// entries from its leader as it already has an entry with term 5 at index 100.
// (type=MsgStorageAppendResp,index=100,logTerm=5) means the local node wrote
// entries up to index=100 in stable storage, and the term of the entry at index
// 100 was 5. This doesn't always mean that the corresponding MsgStorageAppend
// message was the one that carried these entries, just that those entries were
// stable at the time of processing the corresponding MsgStorageAppend.
LogTerm uint64 `protobuf:"varint,5,opt,name=logTerm" json:"logTerm"`
Index uint64 `protobuf:"varint,6,opt,name=index" json:"index"`
Entries []Entry `protobuf:"bytes,7,rep,name=entries" json:"entries"`

This can be addressed in separate PRs. In order to be backward compatible, we might want to do it across multiple releases. e.g. add logSlice into protobuf firstly, and then replace the fields in Message in a following release.

Copy link
Contributor Author

@pav-kv pav-kv Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should probably leave this for some future consideration. The scope of this work currently is internal cleanup that doesn't affect public API / wire protocol.

There are a few additional considerations:

  • logSlice is intended as a code-level data structure. It's usually a bad practice to use protobuf-generated structs as data structures because we have limited control of its content, and also this is part of the public API.
  • The LogTerm/Index fields are used by a bunch of other message types (not only MsgApp). We can't easily deprecate them, rather it would have to be a new LogSlice field or something. A broader refactoring could be: split the Message into message-type-specific sub-structs.

There is some argument for moving in a slightly opposite direction from protobufs:

  • raft doesn't need to know every pb.Entry, it only internally cares about IDs of the entries, and config changes. Exposing application-specific entry contents to this package poses security and privacy risks. There could be a world in which raft is purely a code-level algorithm/protocol, and messages are packaged at application level.
  • raft necessitates dependency on protobuf, which is not the most efficient format: it has to be marshaled/unmarshaled. Some applications may choose other wire formats, for instance flatbuffers.

@ahrtr
Copy link
Member

ahrtr commented Feb 4, 2024

A generic comment, I prefer not to add too much small commits, as most of them are mechanical changes, e.g. use xxx in xxx. So suggest to squash the commits, or get the type.go and types_test.go in a commit, and get all others in another commit.

Comment on lines +64 to +66
// Users of this struct can assume the invariants hold true. Exception is the
// "gateway" code that initially constructs logSlice, such as when its content
// is sourced from a message that was received via transport, or from Storage,
// or in a test code that manually hard-codes this struct. In these cases, the
// invariants should be validated using the valid() method.
Copy link
Member

@ahrtr ahrtr Feb 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain more about the exceptions? I think we should add a generic verification to verify the invariant properties in each test if possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with that. This PR adds a valid() check to TestLogMaybeAppend, which is at the moment the only user of the new struct. More tests will be converted in a follow-up PR. The valid() check in other tests helped to find a few places where these invariants are violated (so the tests are testing incorrect behaviour).

Copy link
Contributor Author

@pav-kv pav-kv Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "exception" is basically any place which receives a log slice from an untrusted source (network, storage, etc). Example: see the handleAppendEntries call:

raft/raft.go

Lines 1748 to 1750 in 2c8980b

// TODO(pav-kv): construct logSlice up the stack next to receiving the
// message, and validate it before taking any action (e.g. bumping term).
a := logSlice{term: m.Term, prev: entryID{term: m.LogTerm, index: m.Index}, entries: m.Entries}

Once the invariants are validated, this struct can be passed around internally, and all the code can assume invariants are true. For instance, we know that the i-th entry in entries slice necessarily has entries[i].index == prev.index + 1 + i, so we don't have to additionally check for this.

I think the comment already explains this well, but please suggest a better phrasing if it doesn't.

@pav-kv
Copy link
Contributor Author

pav-kv commented Feb 5, 2024

@ahrtr

A generic comment, I prefer not to add too much small commits, as most of them are mechanical changes

I prefer to isolate mechanical changes to separate commits, so that all changes have a small scope and are obviously correct. Such a PR "tells a story" rather than dumps a large change onto a reviewer. Small changes make it possible to review one thing at a time, and not have to do mental work to decompose it. I do this both for the reviewer and for myself, to self-review and convince myself too.

I usually communicate in the PR that commit-by-commit may be convenient / increase confidence in this being a mechanical change. This is not mandatory to review things this way though - if you prefer the other way around feel free to review the whole change rather than individual commits.

Separate commits also help easily narrowing the scope of the PR when reviewers ask so. Then it becomes a simple commit removal, rather than rewriting the whole change manually.

A lot of code in this repo manipulates (term, index) tuples as ints.
Getting it wrong can be costly, since entry IDs are centrepiece of raft
safety. This commit introduces a type that will be used to replace pairs
of ints, to make the code more readable and safe.

Signed-off-by: Pavel Kalinnikov <[email protected]>
The lastTerm() method is used in pair with lastIndex() almost in every
occasion. Returning the whole pair from a method makes it cleaner and
cheaper. The lastIndex() and lastTerm() methods in the worst case fall
back to storage, so doing it once is better than twice.

Signed-off-by: Pavel Kalinnikov <[email protected]>
This is a type-safe wrapper for all kinds of log slices. We will use it
for more readable and safe code. Usages will include: wrapping log
append requests; unstable struct; possibly surface in a safer API.

Signed-off-by: Pavel Kalinnikov <[email protected]>
@pav-kv
Copy link
Contributor Author

pav-kv commented Feb 5, 2024

@ahrtr Addressed actionable comments. Thanks for the review!

@pav-kv pav-kv requested a review from ahrtr February 5, 2024 11:56
raft.go Outdated
if prev.index < r.raftLog.committed {
// TODO(pav-kv): construct logSlice up the stack next to receiving the
// message, and validate it before taking any action (e.g. bumping term).
a := logSlice{term: m.Term, prev: entryID{term: m.LogTerm, index: m.Index}, entries: m.Entries}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably we should add a function something like message2LogSlice to encapsulate the details.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a logSliceFromMsgApp func, because it should specifically be a MsgApp message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because it should specifically be a MsgApp message.

In that case, we should add a verification each time when the function is called in case it's misused.. The verification should only be executed in test. Let me add a generic verification framework/utility in a separate PR.

Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a minor comment

Great work. Thanks.

@pav-kv pav-kv requested review from ahrtr and removed request for serathius February 5, 2024 17:55
@ahrtr ahrtr merged commit 23c936a into etcd-io:main Feb 5, 2024
10 checks passed
@pav-kv pav-kv deleted the add-entry-append-types branch February 5, 2024 20:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants