map: Add Drain for traversing maps while removing entries #1349

smagnani96 · 2024-02-19T11:39:43Z

Dear community, with this PR I would like to introduce the Map.Drain API to traverse a map while also removing its entries.
The current MapIterator.Next traverse a map without removing the entries. For some use cases, it might be useful to have this API so that maps such as Hash can be freed during the lookup.
In addition, the MapIterator.Next would normally fail with maps of type Queue or Stack, for two main reasons:

these maps do not have kernel primitives to traverse their content w/o removing, only support for LookupAndDelete (Lookup for Queue/Stack acts as a peek method and return the head);
these maps are keyless, hence need separate support.

The proposed solution reuses the MapIterator structure and as much code as possible: the Map.Drain API return an iterator with a special flag set, specifying that the retrieved entries must be deleted as well.
To support different kernel versions in which the support for LookupAndDelete system call is not equal among the different maps (e.g., queues and hash support for this has been introduced in different commit),
this implementation introduces a fallback method that would leverage the sequential Lookup->Delete operations in case LookupAndDelete for that specific map is not supported. These compatibility support has been dropped in this PR. Queues are expected to work since kernel v4.2, and the other Hash-like maps from v5.14.

Please any feedback is really welcome, don't hesitate ping me 😃

dylandreimerink

Thank you very much for putting in the work. Though I would like to request some changes.

examples/queue/main.go

map.go

map_test.go

smagnani96 · 2024-02-29T13:59:47Z

Thank you very much for putting in the work. Though I would like to request some changes.

Many thanks again Dylan for the feedback!

ti-mo

Thanks for the patches! I'm not completely convinced we want to make MapIterator an interface (yet).

map.go

types.go

lmb

I have a question regarding semantics:

What happens when you do a plain Lookup on a queue or stack? Does that function like a peek? Or does it return an error?
Are there other map types which support LookupAndDelete? Are there ones which support passing a Key argument?

map.go

smagnani96 · 2024-05-06T13:26:38Z

Thanks to all the reviewers for the feedback.
I guess the only thing left to decide is whether to proceed with one of the two proposed solutions:

Creating a simple method pop that would be called instead of next, as for the solution proposed in the previous commit.
Creating an interface to support multiple MapIterators, as for the current commit under review.

lmb · 2024-05-14T15:35:35Z

Have you had the time to look into my questions in #1349 (review) ? I think they will help us decide what the implementation looks like.

smagnani96 · 2024-05-15T09:51:56Z

Have you had the time to look into my questions in #1349 (review) ? I think they will help us decide what the implementation looks like.

@lmb I apologize, I missed the message.

According to https://docs.kernel.org/bpf/map_queue_stack.html and some quick tests of the code, Queues/Stack maps support the following operations.
- lookup (peek): returns the value at the head without removing it. If no value exists, ErrKeyNotExist is returned.
- update (push)
- delete (pop)
  In addition, batch operations are not supported. Also, NextKey is not supported: it either returns can't marshal key: <type> doesn't marshal to 0 bytes or next key: invalid argument when called with nil (these maps have a key of size 0).
On the other hand, other maps like Array and Hash support LookupAndDelete, and the method requires a non-null key as a parameter.

dylandreimerink · 2024-05-16T10:17:07Z

Queues/Stack maps support the following operations.

Clarification, for stacks/queues:
BPF_MAP_LOOKUP_ELEM -> peek
BPF_MAP_LOOKUP_AND_DELETE_ELEM -> pop
BPF_MAP_UPDATE_ELEM -> push

On the other hand, other maps like Array and Hash support LookupAndDelete, and the method requires a non-null key as a parameter.

Only hash maps(and variants: LRU, PER_CPU, LRU_PER_CPU) have support for BPF_MAP_LOOKUP_AND_DELETE_ELEM since v5.14. Array maps don't seem to support it since you can't delete from an array map.

Also interesting is that BPF_MAP_LOOKUP_AND_DELETE_BATCH which is the batch version is only supported by hash maps and not by stacks or queues and was added before the non-batch version was supported in v5.6.

Are there ones which support passing a Key argument?

Yes, the hashmap supports passing a key value. Actually, even before v5.14, a valid pointer must be provided to the key field. The kernel always checks it and copies it into kernel memory, just to discard it. After v5.14 it of course starts to actually get passed to the map implementation.

lmb · 2024-05-17T10:27:05Z

Okay! That was very helpful, thanks to both of you. The peek operations only return the top of the stack / first item in the queue? And currently doing .Iterate() on a queue or stack doesn't work at all since we pass a key?

I'm currently leaning towards adding a new method Map.Drain() *MapIterator. Initially this could only support queues / stacks but later on we may be able to extend it to hash maps. The benefit of a new method is that it makes it clearer that we're actually deleting items here. We can put the implementation into the same MapIterator struct for now as Timo suggested. I think that we should take MapIterator.Next(nil, &value) when draining (and not ignore the key argument as currently done.)

dylandreimerink · 2024-05-17T11:09:57Z

The peek operations only return the top of the stack / first item in the queue? And currently doing .Iterate() on a queue or stack doesn't work at all since we pass a key?

Yes, peeking only returns the top of the stack. Any key provided will be ignored as far as I understand. But as @S41m0n mentioned the BPF_MAP_GET_NEXT_KEY op would return an error if using the normal iterator.

smagnani96 · 2024-06-25T14:15:01Z

Thanks for the info-gathering and brainstorming. I just pushed a possible implementation of what has been discussed.
Let me know your opinions :)

map.go

map_test.go

dylandreimerink

I like the end result, but between the commits a lot happens that is canceled out later. So please squash all commits into one (or cleanup the git history some other way)

map.go

smagnani96 · 2024-09-03T18:26:36Z

This PR has been rebased to #1559 so that the current PR preserve only the needed material for the Map.Drain API. The only commit that would remain and which we should refer to is 3d9f4e0.

smagnani96 · 2024-09-05T17:38:53Z

I removed from the code the support for the fallback sequential Lookup -> Delete.
The Drain API against BPF_MAP_TYPE_HASH, BPF_MAP_TYPE_PERCPU_HASH, BPF_MAP_TYPE_LRU_HASH and BPF_MAP_TYPE_LRU_PERCPU_HASH is not expected to work with kernel versions < 5.14.

This commit introduces the `Map.Drain` API to traverse the map while also removing its entries. It leverages the same `MapIterator` structure, with the introduction of a new unexported method to handle the map draining. The tests make sure that the behavior is as expected, and that this API returns an error while invoked on the wrong map, such as arrays, for which `Map.Iterate` should be used instead. The `LookupAndDelete` system call support has been introduced in: 1. 5.14 for BPF_MAP_TYPE_HASH, BPF_MAP_TYPE_PERCPU_HASH, BPF_MAP_TYPE_LRU_HASH and BPF_MAP_TYPE_LRU_PERCPU_HASH. 2. 4.20 for BPF_MAP_TYPE_QUEUE, BPF_MAP_TYPE_STACK Do not expect the `Map.Drain` API to work on prior versions, according to the target map type. From the user perspective, the usage should be similar to `Map.Iterate`, as shown as follows: ```go m, err := NewMap(&MapSpec{ Type: Hash, KeySize: 4, ValueSize: 8, MaxEntries: 10, }) // populate here the map and defer close it := m.Drain() for it.Next(keyPtr, &value) { // here the entry doesn't exist anymore in the underlying map. ... } ``` Signed-off-by: Simone Magnani <[email protected]>

smagnani96 · 2024-12-06T12:09:56Z

Moved functions so that GH code diffs are more readable.
Added comments to newly helper functions.
Removed TestDrainEmptyMap: added this check into TestMapDrain before filling the map.

dylandreimerink

I am happy with this iteration 😉.

dylandreimerink · 2024-12-09T14:19:17Z

map.go

+	if ptr, ok := keyOut.(unsafe.Pointer); ok {
+		copy(unsafe.Slice((*byte)(ptr), len(buf)), buf)
+	} else {


Sorry if this is a bit off topic. @ti-mo Do you recall why we have this exception here? This is just moving the existing code added in 4609dc7. But we seem to take so much care to ensure safety elsewhere, including in sysenc.Unmarshal. But not here.

If for some reason the user passes in an unsafe pointer in keyOut that is smaller than buf this causes us to write into unknown parts of the heap.

I'm not exactly sure, but the commit you linked does contain some hints around slices of types with trailing padding:

Allowing such a type creates an edge case: make([]padding, 1)
uses zero-allocation marshaling while make([]padding, 2)
doesn't, due to interior padding. It's simpler to skip such
types for now.

Passing in unsafe.Pointer sidesteps this limitation. Also see f0d238d. unsafe.Pointer is supported for most map operations, see marshalMapSyscallInput.

ti-mo

Still some changes to be made here, and I think some more testing needs to be done to understand the behaviour in the face of concurrent deletes.

It's also important to understand that LookupAndDelete may keep failing after a successful NextKey on very busy maps, and we may need to protect the application from getting into runaway CPU usage. I vaguely recall some existing code in the library that gives up after x attempts, but I can't remember where exactly.

Please reach out offline if you'd like some help with this.

We'll ship this in a 0.17.1 after the xmas break.

ti-mo · 2024-12-17T16:26:29Z

map.go

 // Next decodes the next key and value.
 //
-// Iterating a hash map from which keys are being deleted is not


Why was this comment removed? This applies in general, and it's documentation on exported API, so we shouldn't just remove it.

ti-mo · 2024-12-17T16:32:17Z

map.go

+// safe. You may see the same key multiple times. Iteration may
+// also abort with an error, see IsIterationAborted.
+//
+// Iterating a queue/stack map returns an error (NextKey invalid


Here and below:

Suggested change

// Iterating a queue/stack map returns an error (NextKey invalid

// Iterating a queue/stack map returns an error, use [Map.Drain] instead.

Nit: try to reduce your use of the term 'API', it's redundant and often implied. Less is more!

ti-mo · 2024-12-17T16:41:00Z

map.go

+// but their respective outputs will differ.
+//
+// Draining a map that does not support entry removal such as
+// an array return an error (LookupAndDelete not supported):


The exact error returned is an implementation detail and may change over time unless it's a sentinel. If it's in godoc, it becomes a contract.

If it does return an error wrapping ErrNotSupported, this is valid docmentation:

Draining a map that doesn't support deletions will return [ErrNotSupported].

ti-mo · 2024-12-17T16:45:13Z

map.go

@@ -1602,6 +1623,12 @@ func marshalMap(m *Map, length int) ([]byte, error) {
 	return buf, nil
 }

+// isKeyValueMap returns true if map supports key-value pairs (ex. hash)
+// and false in case of value-only maps (ex. queue).
+func isKeyValueMap(m *Map) bool {


This would make more sense inverted, e.g. func (m *Map) isKeyless() bool since you're negating its only use below. Then the docstring can also become a bit more focused: isKeyless returns true if the map is value-only, like a [Queue].

ti-mo · 2024-12-18T13:13:02Z

map.go

+	return mi.nextIterate(keyOut, valueOut)
+}
+
+func (mi *MapIterator) nextDrain(keyOut, valueOut interface{}) bool {


Suggested change

func (mi *MapIterator) nextDrain(keyOut, valueOut interface{}) bool {

func (mi *MapIterator) pop(keyOut, valueOut interface{}) bool {

This function pops a single entry, it doesn't drain the whole map.

ti-mo · 2024-12-18T13:29:27Z

map.go

+	return false
+}
+
+func (mi *MapIterator) nextIterate(keyOut, valueOut interface{}) bool {


Suggested change

func (mi *MapIterator) nextIterate(keyOut, valueOut interface{}) bool {

func (mi *MapIterator) next(keyOut, valueOut interface{}) bool {

ti-mo · 2024-12-18T13:31:54Z

map_test.go

+
+			iter := m.Drain()
+			allocs := testing.AllocsPerRun(int(m.MaxEntries()-1), func() {
+				if !iter.Next(keyPtr, &value) {


Suggested change

if !iter.Next(keyPtr, &value) {

qt.Assert(t, qt.IsTrue(iter.Next(keyPtr, &value)))

Or maybe qt.Assert allocates? Feel free to ignore.

ti-mo · 2024-12-18T13:32:21Z

map_test.go

+				} else {
+					err = m.Put(uint32(i), uint32(i))
+				}
+				if err != nil {


Suggested change

if err != nil {

qt.Assert(t, qt.IsNil(err))

ti-mo · 2024-12-18T13:36:07Z

map.go

+	// Handle value-only maps (ex. queue).
+	if !isKeyValueMap(mi.target) {
+		if keyOut != nil {
+			mi.err = fmt.Errorf("non-nil keyOut provided for map without a key, must be nil instead")


Suggested change

mi.err = fmt.Errorf("non-nil keyOut provided for map without a key, must be nil instead")

mi.err = fmt.Errorf("non-nil keyOut provided for map without a key")

ti-mo · 2024-12-18T13:45:17Z

map.go

+		mi.cursor = make([]byte, mi.target.keySize)
+	}
+
+	// Always retrieve first key in the map. This should ensure that the whole map


I don't think this is the case. Consider the following scenario:

drainer A: NextKey(0) yields a

drainer B: NextKey(0) yields a

drainer B: LookupAndDelete(a) succeeds

drainer A: LookupAndDelete(a) fails

The L&D failing can happen at any point due to a concurrent delete, also when e.g. a bpf program deletes a key, which is fairly common. In the current implementation, I think iteration halts at that point, while there may still be keys left in the map.

ti-mo · 2025-02-25T14:14:52Z

👋 @smagnani96 I've converted this to a draft for now, please address the feedback and ping when you want another review.

smagnani96 · 2025-05-21T13:13:30Z

👋 @smagnani96 I've converted this to a draft for now, please address the feedback and ping when you want another review.

Closing this, will reopen in case I'll find some slots to finish it 🙏🏼

smagnani96 force-pushed the feature/queue_stack_support branch from d65df8e to 9a2089a Compare February 19, 2024 18:06

dylandreimerink requested changes Feb 26, 2024

View reviewed changes

examples/queue/main.go Outdated Show resolved Hide resolved

map.go Outdated Show resolved Hide resolved

map.go Outdated Show resolved Hide resolved

map_test.go Outdated Show resolved Hide resolved

smagnani96 force-pushed the feature/queue_stack_support branch 2 times, most recently from 11c813b to 2dfaef8 Compare February 29, 2024 13:52

smagnani96 force-pushed the feature/queue_stack_support branch from 2dfaef8 to 23c82af Compare March 13, 2024 19:42

smagnani96 changed the title ~~MapIterator support for Queues/Stacks + further testing + example~~ MapIterator support for Queues/Stacks + further testing Mar 13, 2024

smagnani96 marked this pull request as ready for review March 13, 2024 19:45

smagnani96 requested a review from a team as a code owner March 13, 2024 19:45

ti-mo requested changes Mar 18, 2024

View reviewed changes

map.go Outdated Show resolved Hide resolved

map.go Outdated Show resolved Hide resolved

map.go Outdated Show resolved Hide resolved

map.go Outdated Show resolved Hide resolved

types.go Outdated Show resolved Hide resolved

lmb reviewed Mar 21, 2024

View reviewed changes

map.go Outdated Show resolved Hide resolved

smagnani96 force-pushed the feature/queue_stack_support branch 3 times, most recently from c3c539c to 3acb798 Compare June 25, 2024 14:06

smagnani96 changed the title ~~MapIterator support for Queues/Stacks + further testing~~ MapIterator support for Draining map content (e.g., queues,stacks,hash) Jun 25, 2024

smagnani96 force-pushed the feature/queue_stack_support branch from 3acb798 to c620552 Compare June 26, 2024 16:24

lmb reviewed Jun 27, 2024

View reviewed changes

smagnani96 force-pushed the feature/queue_stack_support branch 2 times, most recently from 39b3749 to 3ac6523 Compare July 30, 2024 18:07

dylandreimerink requested changes Sep 2, 2024

View reviewed changes

map.go Outdated Show resolved Hide resolved

smagnani96 force-pushed the feature/queue_stack_support branch from 3ac6523 to 3d9f4e0 Compare September 3, 2024 18:22

smagnani96 changed the title ~~MapIterator support for Draining map content (e.g., queues,stacks,hash)~~ introducing Map.Drain API for traversing maps while removing entries Sep 3, 2024

smagnani96 force-pushed the feature/queue_stack_support branch 2 times, most recently from 0916768 to 318ab1b Compare September 5, 2024 17:32

smagnani96 force-pushed the feature/queue_stack_support branch 2 times, most recently from 2ae048d to d6678a8 Compare September 5, 2024 17:40

smagnani96 force-pushed the feature/queue_stack_support branch from d6678a8 to 47fdd29 Compare December 6, 2024 11:53

smagnani96 force-pushed the feature/queue_stack_support branch from 47fdd29 to 46f5343 Compare December 6, 2024 11:57

smagnani96 requested review from dylandreimerink and ti-mo December 6, 2024 12:10

dylandreimerink approved these changes Dec 9, 2024

View reviewed changes

ti-mo changed the title ~~introducing Map.Drain API for traversing maps while removing entries~~ map: Add Drain for traversing maps while removing entries Dec 17, 2024

ti-mo requested changes Dec 18, 2024

View reviewed changes

ti-mo marked this pull request as draft February 25, 2025 14:12

smagnani96 closed this May 21, 2025

	// Iterating a queue/stack map returns an error (NextKey invalid
	// Iterating a queue/stack map returns an error, use [Map.Drain] instead.

	func (mi *MapIterator) nextDrain(keyOut, valueOut interface{}) bool {
	func (mi *MapIterator) pop(keyOut, valueOut interface{}) bool {

	func (mi *MapIterator) nextIterate(keyOut, valueOut interface{}) bool {
	func (mi *MapIterator) next(keyOut, valueOut interface{}) bool {

	if !iter.Next(keyPtr, &value) {
	qt.Assert(t, qt.IsTrue(iter.Next(keyPtr, &value)))

	mi.err = fmt.Errorf("non-nil keyOut provided for map without a key, must be nil instead")
	mi.err = fmt.Errorf("non-nil keyOut provided for map without a key")

map: Add Drain for traversing maps while removing entries #1349

map: Add Drain for traversing maps while removing entries #1349

Uh oh!

Conversation

smagnani96 commented Feb 19, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dylandreimerink left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

smagnani96 commented Feb 29, 2024

Uh oh!

ti-mo left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

lmb left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

smagnani96 commented May 6, 2024

Uh oh!

lmb commented May 14, 2024

Uh oh!

smagnani96 commented May 15, 2024

Uh oh!

dylandreimerink commented May 16, 2024

Uh oh!

lmb commented May 17, 2024

Uh oh!

dylandreimerink commented May 17, 2024

Uh oh!

smagnani96 commented Jun 25, 2024

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

dylandreimerink left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

smagnani96 commented Sep 3, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

smagnani96 commented Sep 5, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

smagnani96 commented Dec 6, 2024

Uh oh!

dylandreimerink left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ti-mo Dec 9, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ti-mo left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

smagnani96 commented Feb 19, 2024 •

edited

Loading

smagnani96 commented Sep 3, 2024 •

edited

Loading

smagnani96 commented Sep 5, 2024 •

edited

Loading

ti-mo Dec 9, 2024 •

edited

Loading