-
Notifications
You must be signed in to change notification settings - Fork 823
map: Add Drain for traversing maps while removing entries #1349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d65df8e to
9a2089a
Compare
dylandreimerink
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for putting in the work. Though I would like to request some changes.
11c813b to
2dfaef8
Compare
Many thanks again Dylan for the feedback! |
2dfaef8 to
23c82af
Compare
ti-mo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patches! I'm not completely convinced we want to make MapIterator an interface (yet).
lmb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question regarding semantics:
- What happens when you do a plain Lookup on a queue or stack? Does that function like a peek? Or does it return an error?
- Are there other map types which support LookupAndDelete? Are there ones which support passing a Key argument?
|
Thanks to all the reviewers for the feedback.
|
|
Have you had the time to look into my questions in #1349 (review) ? I think they will help us decide what the implementation looks like. |
@lmb I apologize, I missed the message.
|
Clarification, for stacks/queues:
Only hash maps(and variants: LRU, PER_CPU, LRU_PER_CPU) have support for Also interesting is that
Yes, the hashmap supports passing a key value. Actually, even before v5.14, a valid pointer must be provided to the key field. The kernel always checks it and copies it into kernel memory, just to discard it. After v5.14 it of course starts to actually get passed to the map implementation. |
|
Okay! That was very helpful, thanks to both of you. The peek operations only return the top of the stack / first item in the queue? And currently doing I'm currently leaning towards adding a new method |
Yes, peeking only returns the top of the stack. Any key provided will be ignored as far as I understand. But as @S41m0n mentioned the |
c3c539c to
3acb798
Compare
|
Thanks for the info-gathering and brainstorming. I just pushed a possible implementation of what has been discussed. |
3acb798 to
c620552
Compare
39b3749 to
3ac6523
Compare
dylandreimerink
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the end result, but between the commits a lot happens that is canceled out later. So please squash all commits into one (or cleanup the git history some other way)
3ac6523 to
3d9f4e0
Compare
0916768 to
318ab1b
Compare
|
I removed from the code the support for the fallback sequential |
2ae048d to
d6678a8
Compare
d6678a8 to
47fdd29
Compare
This commit introduces the `Map.Drain` API to traverse the map while also
removing its entries. It leverages the same `MapIterator` structure, with
the introduction of a new unexported method to handle the map draining.
The tests make sure that the behavior is as expected, and that this API
returns an error while invoked on the wrong map, such as arrays, for which
`Map.Iterate` should be used instead.
The `LookupAndDelete` system call support has been introduced in:
1. 5.14 for BPF_MAP_TYPE_HASH, BPF_MAP_TYPE_PERCPU_HASH,
BPF_MAP_TYPE_LRU_HASH and BPF_MAP_TYPE_LRU_PERCPU_HASH.
2. 4.20 for BPF_MAP_TYPE_QUEUE, BPF_MAP_TYPE_STACK
Do not expect the `Map.Drain` API to work on prior versions, according
to the target map type. From the user perspective, the usage should
be similar to `Map.Iterate`, as shown as follows:
```go
m, err := NewMap(&MapSpec{
Type: Hash,
KeySize: 4,
ValueSize: 8,
MaxEntries: 10,
})
// populate here the map and defer close
it := m.Drain()
for it.Next(keyPtr, &value) {
// here the entry doesn't exist anymore in the underlying map.
...
}
```
Signed-off-by: Simone Magnani <[email protected]>
47fdd29 to
46f5343
Compare
|
dylandreimerink
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy with this iteration 😉.
| if ptr, ok := keyOut.(unsafe.Pointer); ok { | ||
| copy(unsafe.Slice((*byte)(ptr), len(buf)), buf) | ||
| } else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if this is a bit off topic. @ti-mo Do you recall why we have this exception here? This is just moving the existing code added in 4609dc7. But we seem to take so much care to ensure safety elsewhere, including in sysenc.Unmarshal. But not here.
If for some reason the user passes in an unsafe pointer in keyOut that is smaller than buf this causes us to write into unknown parts of the heap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not exactly sure, but the commit you linked does contain some hints around slices of types with trailing padding:
Allowing such a type creates an edge case: make([]padding, 1)
uses zero-allocation marshaling while make([]padding, 2)
doesn't, due to interior padding. It's simpler to skip such
types for now.
Passing in unsafe.Pointer sidesteps this limitation. Also see f0d238d. unsafe.Pointer is supported for most map operations, see marshalMapSyscallInput.
ti-mo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still some changes to be made here, and I think some more testing needs to be done to understand the behaviour in the face of concurrent deletes.
It's also important to understand that LookupAndDelete may keep failing after a successful NextKey on very busy maps, and we may need to protect the application from getting into runaway CPU usage. I vaguely recall some existing code in the library that gives up after x attempts, but I can't remember where exactly.
Please reach out offline if you'd like some help with this.
We'll ship this in a 0.17.1 after the xmas break.
|
|
||
| // Next decodes the next key and value. | ||
| // | ||
| // Iterating a hash map from which keys are being deleted is not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this comment removed? This applies in general, and it's documentation on exported API, so we shouldn't just remove it.
| // safe. You may see the same key multiple times. Iteration may | ||
| // also abort with an error, see IsIterationAborted. | ||
| // | ||
| // Iterating a queue/stack map returns an error (NextKey invalid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and below:
| // Iterating a queue/stack map returns an error (NextKey invalid | |
| // Iterating a queue/stack map returns an error, use [Map.Drain] instead. |
Nit: try to reduce your use of the term 'API', it's redundant and often implied. Less is more!
| // but their respective outputs will differ. | ||
| // | ||
| // Draining a map that does not support entry removal such as | ||
| // an array return an error (LookupAndDelete not supported): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exact error returned is an implementation detail and may change over time unless it's a sentinel. If it's in godoc, it becomes a contract.
If it does return an error wrapping ErrNotSupported, this is valid docmentation:
Draining a map that doesn't support deletions will return [ErrNotSupported].
|
|
||
| // isKeyValueMap returns true if map supports key-value pairs (ex. hash) | ||
| // and false in case of value-only maps (ex. queue). | ||
| func isKeyValueMap(m *Map) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would make more sense inverted, e.g. func (m *Map) isKeyless() bool since you're negating its only use below. Then the docstring can also become a bit more focused: isKeyless returns true if the map is value-only, like a [Queue].
| return mi.nextIterate(keyOut, valueOut) | ||
| } | ||
|
|
||
| func (mi *MapIterator) nextDrain(keyOut, valueOut interface{}) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| func (mi *MapIterator) nextDrain(keyOut, valueOut interface{}) bool { | |
| func (mi *MapIterator) pop(keyOut, valueOut interface{}) bool { |
This function pops a single entry, it doesn't drain the whole map.
| return false | ||
| } | ||
|
|
||
| func (mi *MapIterator) nextIterate(keyOut, valueOut interface{}) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| func (mi *MapIterator) nextIterate(keyOut, valueOut interface{}) bool { | |
| func (mi *MapIterator) next(keyOut, valueOut interface{}) bool { |
|
|
||
| iter := m.Drain() | ||
| allocs := testing.AllocsPerRun(int(m.MaxEntries()-1), func() { | ||
| if !iter.Next(keyPtr, &value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if !iter.Next(keyPtr, &value) { | |
| qt.Assert(t, qt.IsTrue(iter.Next(keyPtr, &value))) |
Or maybe qt.Assert allocates? Feel free to ignore.
| } else { | ||
| err = m.Put(uint32(i), uint32(i)) | ||
| } | ||
| if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if err != nil { | |
| qt.Assert(t, qt.IsNil(err)) |
| // Handle value-only maps (ex. queue). | ||
| if !isKeyValueMap(mi.target) { | ||
| if keyOut != nil { | ||
| mi.err = fmt.Errorf("non-nil keyOut provided for map without a key, must be nil instead") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| mi.err = fmt.Errorf("non-nil keyOut provided for map without a key, must be nil instead") | |
| mi.err = fmt.Errorf("non-nil keyOut provided for map without a key") |
| mi.cursor = make([]byte, mi.target.keySize) | ||
| } | ||
|
|
||
| // Always retrieve first key in the map. This should ensure that the whole map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the case. Consider the following scenario:
- drainer A: NextKey(0) yields a
- drainer B: NextKey(0) yields a
- drainer B: LookupAndDelete(a) succeeds
- drainer A: LookupAndDelete(a) fails
The L&D failing can happen at any point due to a concurrent delete, also when e.g. a bpf program deletes a key, which is fairly common. In the current implementation, I think iteration halts at that point, while there may still be keys left in the map.
|
👋 @smagnani96 I've converted this to a draft for now, please address the feedback and ping when you want another review. |
Closing this, will reopen in case I'll find some slots to finish it 🙏🏼 |
Dear community, with this PR I would like to introduce the
Map.DrainAPI to traverse a map while also removing its entries.The current
MapIterator.Nexttraverse a map without removing the entries. For some use cases, it might be useful to have this API so that maps such as Hash can be freed during the lookup.In addition, the
MapIterator.Nextwould normally fail with maps of type Queue or Stack, for two main reasons:The proposed solution reuses the
MapIteratorstructure and as much code as possible: theMap.DrainAPI return an iterator with a special flag set, specifying that the retrieved entries must be deleted as well.To support different kernel versions in which the support forThese compatibility support has been dropped in this PR. Queues are expected to work since kernel v4.2, and the other Hash-like maps from v5.14.LookupAndDeletesystem call is not equal among the different maps (e.g., queues and hash support for this has been introduced in different commit),this implementation introduces a fallback method that would leverage the sequential
Lookup->Deleteoperations in caseLookupAndDeletefor that specific map is not supported.Please any feedback is really welcome, don't hesitate ping me 😃