Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String move() and String(String &&rval) breaks operation of reserve() #161

Open
drmpf opened this issue Apr 30, 2022 · 4 comments
Open

String move() and String(String &&rval) breaks operation of reserve() #161

drmpf opened this issue Apr 30, 2022 · 4 comments

Comments

@drmpf
Copy link

drmpf commented Apr 30, 2022

When #if __cplusplus >= 201103L || defined(GXX_EXPERIMENTAL_CXX0X)
operator = uses move() to just update the buffer pointer of the destination
This ignores any reserve() the user has made to ensure the memory is not unnecessarily fragmented.
String(String &&rval) has a similar problem

move() should first check the capacity of the destination and if there is sufficient space copy the source to the destination
String(String &&rval) should use move()

A suggested move() is

void String::move(String &rhs) {
	if (this != &rhs) {
           if (capacity > rhs.size) {
               copy(rhs.buffer,rhs.size);
             } else {
		free(buffer);
		buffer = rhs.buffer;
		len = rhs.len;
		capacity = rhs.capacity;
             }
		rhs.buffer = NULL;
		rhs.len = 0;
		rhs.capacity = 0;
	}
}
@matthijskooijman
Copy link
Collaborator

Interesting perspective. Normally, I would say that string moves should try to minimize copying for maximum speed, and if a move is available, just updating the pointer instead of copying data is the obvious best way.

But I can see your usecase where you reserve space for strings so you never need to realloc them and prevent fragmentation, which would ineed be broken by string moves.

However, these two things are conflicting. If your suggestion would be applied, then in cases where fragmentation is not a specific consideration (which I think is most of the time), your suggestion can significantly hurt performance, which is IMHO not what we'd want.

One compromise might be to apply the copying only when capacity > size && capcity > rhs.size, so if there is extra room available, which suggests that an explicit reserve() has happened and fragmentation might be a consideration. I'm not sure if that's enough, though.

Regarding your proposed implementation, I think there is an error in there. The rhs.buffer = NULL; and subsequent lines should, I think, be inside the else, since now this leaks memory be unsetting rhs.buffer without freeing it.

@drmpf
Copy link
Author

drmpf commented Apr 30, 2022

Yes there is memory leak in the proposed code that needs to be fixed, say

void String::move(String &rhs) {
	if (this != &rhs) {
           if (capacity > rhs.size) {
               copy(rhs.buffer,rhs.size);
             } else {
		free(buffer);
		buffer = rhs.buffer;
		len = rhs.len;
		capacity = rhs.capacity;
		rhs.buffer = NULL; // prevents invalidate() trying to free rhs.buffer which is now the destination buffer
             }
           rhs.invalidate();
	}
}

I did wonder if the original code had been 'optimized' for large assignments, as you might get when dealing with webpage Strings on ESP32 etc. On the other hand the ESP32 etc have much higher clock speeds so the copies will be faster.
In any case the previous code will impact the careful use of Strings on AVR boards where reserve() makes a difference.

"cases where fragmentation is not a specific consideration (which I think is most of the time)"

I don't disagree with you, but I have continual fights on Arduino forum whenever I propose Strings as the simple way to do things.
The pervasive response always is "but you will get fragmentation and there is not enough memory to waste any and you will get program errors" and the forum advice is almost always "you should never be using Strings"

I have looked into using Strings extensively in Taming Arduino Strings, Avoiding Fragementation and Out-of-Memory Issues and the solution for small memory uC's , if you have memory problems, depends on reserve() working.

Your heuristic 'capacity > size && capcity > rhs.size' seems a bit hit and miss to me. Once a String's capacity grows in use it is never reduce even is the size is reduced.

I did think perhaps someone would propose that the move code could be conditional on the processor, so that 'small' memory processors (Uno/Mega etc) would copy, if capacity allowed, while those processors with larger memories (ESP8266/ESP32) would use move().

@matthijskooijman
Copy link
Collaborator

Your heuristic 'capacity > size && capcity > rhs.size' seems a bit hit and miss to me. Once a String's capacity grows in use it is never reduce even is the size is reduced.

Yeah, it's probably not good enough.

Would an alternative be to remember whether reserve() has been called and to copy into those strings? Still a bit heuristically, of course...

I did think perhaps someone would propose that the move code could be conditional on the processor, so that 'small' memory processors (Uno/Mega etc) would copy, if capacity allowed, while those processors with larger memories (ESP8266/ESP32) would use move().

I wonder if small vs large memory is really the distinguishing factor here. Rather, I guess you really need to switch between both behaviors based on the expectation of the programmer (i.e. whether they wrote their code to expect copies to prevent fragmentation or not). This could be done using a board option selected at build time (a platform option would be better, if that is implemented). Or, maybe (given that libraries might also have different expectations) you'd need more fine-grained control (e.g. mark specific strings is non-move, maybe implied like a call to reserve() as suggested above).

Another way to look at this, is to ask in which cases do you need the move behavior instead of copying bytes? I guess this is mostly cases where the destination string has no buffer allocated at all yet (i.e. move constructor, not move assignment operator). Another case could be when you completely replace a string and both are local variables, e.g. something like:

String value = "none";
if (some_condition)
    value = String("x=") + String(int_value)

In this case, your suggested move operator would copy bytes, while it would definitely be better to just move the pointer. Doing the byte copy only after reserve() was called would also catch this case correctly.

I have looked into using Strings extensively in Taming Arduino Strings, Avoiding Fragementation and Out-of-Memory Issues and the solution for small memory uC's , if you have memory problems, depends on reserve() working.

Nice article you wrote, excellent and detailed investigation. One smalle remark: You say "due to the build in __malloc_margin of 128 bytes.", but AFAICS (based on docs and source), it seems to be 32 by default, not 128?

Also nice to see your SafeString class. I have been thinking about (but never finding time for anything else) generalizing the official String class (or maybe introducing a new class) that can wrap either an external char* buffer, a PROGMEM string, or dynamically allocated buffers and offer the same API on all of them (maybe split into a read-only base class and read-write subclass), so code using strings can be written generically. But that digresses greatly from this issue, and might deserve an issue on its own...

@drmpf
Copy link
Author

drmpf commented Jul 22, 2022

an alternative be to remember whether reserve() has been called and to copy into those strings?

I like that idea. Simple for the user. Most of the time Strings "just work"
When you need to "fix" String fragmentation, reserve() is the tool provided.

Re __malloc_margin

void setup() {
  Serial.begin(9600);
  Serial.print("extern size_t __malloc_margin:"); Serial.println(__malloc_margin);
}
void loop() {
}

prints
extern size_t __malloc_margin:128
on UNO compiled with Arduino 1.8.19
from this patch (I think)
https://github.com/arduino/toolchain-avr/blob/master/avr-libc-patches/01-arduino-malloc_margin.patch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants