
Use of serializedSize uses significant memory #760

Closed
fproske opened this issue Feb 3, 2025 · 5 comments

fproske commented Feb 3, 2025

Describe the bug

Upgrading from future version 1.33.2 to version 1.34.0 results in OOM issues when running in a container. Profiling the code showed that the serializedSize function allocates significant amounts of memory. It's hard for me to create a reproducible example, as some of the globals I'm providing to the future() call are nested R6 objects. Is it discouraged to send such complex data structures?

[Image: memory profiling output]

Expected behavior

Significantly lower memory usage, as in previous versions.

Session information

Occurs both on my macOS ARM machine and on Linux pods in a Kubernetes cluster. future_1.34.0, parallelly_1.42.0, globals_0.16.3

@HenrikBengtsson (Collaborator)

Thanks for reporting.

> Upgrading from future version 1.33.2 to version 1.34.0 results in OOM issues running in container.

Yes, the built-in size estimator of global variables was switched from an internal ad-hoc approach to parallelly::serializedSize() in future 1.34.0.

Before anything else, there is of course the possibility that other things were updated too when you upgraded the future package. Did you verify that the OOM problem goes away if you re-install future 1.33.2? You can do that quickly with `install.packages("https://cran.r-project.org/src/contrib/Archive/future/future_1.33.2.tar.gz")`. It would be useful to rule out other causes.

After having verified that it is indeed specific to future 1.34.0, please see if the problem goes away if you switch back to the previous estimator by setting:

```r
options(future.globals.objectSize.method = "objectSize")
```

An alternative is to set the environment variable

```sh
export R_FUTURE_GLOBALS_OBJECTSIZE_METHOD="objectSize"
```

before starting R. This will be detected and parsed by future when the package loads, and the corresponding R option will be set.

Your feedback on this will be critical to understanding and fixing the problem. Based on that, I will be able to suggest the best workaround until it is resolved.

PS. You might indeed be onto something, so I created futureverse/parallelly#126 to see if parallelly::serializedSize() can be optimized further.
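For anyone wanting to compare the two size-estimation approaches directly, here is a minimal sketch. It uses `utils::object.size()` only as a rough stand-in for the old internal estimator (the real one was ad hoc and is internal to future), and a made-up nested list in place of the actual globals:

```r
# A nested object standing in for the real globals.
x <- list(a = rnorm(1e6), b = list(c = rnorm(1e6)))

# Classic recursive estimator (proxy for the pre-1.34.0 approach):
utils::object.size(x)

# New approach: size of the object when serialized for transfer to a worker.
parallelly::serializedSize(x)
```

For plain atomic data like this, the two numbers should be close; the interesting differences show up with environment-backed objects.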

fproske added a commit to GAMS-dev/miro that referenced this issue Feb 10, 2025
With update to future version 1.34.0, memory usage increased significantly due to the use of `parallelly::serializedSize`. I created an [issue](futureverse/future#760) and downgraded future for now

fproske commented Feb 11, 2025

Sorry for the late response; it took a while to reproduce the issue and run some experiments. Our initial issue was that, after upgrading to future 1.34.0, we got a warning that the size of globals would exceed 500 MB. We had never seen this warning before, and downgrading only the future package also resolved it (as did staying on 1.34.0 and setting `options(future.globals.maxSize = Inf)`, which is our current workaround). Profiling the process led to the discovery mentioned in the initial post.

We have done some more experiments running our application in a memory-restricted environment (using Docker). Before initiating the future() (using multicore), our application requires around 400 MB of memory as reported by top, so we set the memory limit of the Docker container to 500m.
With globals size checking enabled (the default, `options(future.globals.maxSize = 500 * 1024^2)`), we get an error that the objects we are trying to allocate would exceed 500 MB. However, when we disable the check (`options(future.globals.maxSize = Inf)`) and run the exact same experiment, everything works flawlessly. The process is not OOM killed.
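For reference, our workaround amounts to the following (the 1 GiB alternative is only an example figure, not what we use):

```r
# Disable future's check on the total size of exported globals entirely:
options(future.globals.maxSize = Inf)

# ... or, less drastically, raise the limit to e.g. 1 GiB:
options(future.globals.maxSize = 1024^3)
```

Raising the limit instead of disabling it keeps some protection against accidentally shipping huge objects to remote workers.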

To conclude: I'm not sure why profvis reports such high memory usage for the serializedSize function, but I guess the more important question is whether the size reported by the new function is accurate.

For us, the issue is resolved, as we are happy disabling this check (we were simply not aware that such an artificial size restriction existed before the issue started appearing), but we are happy to run additional experiments if that helps with debugging. If you conclude that everything works as expected, feel free to close this.

PS: Thanks for all the work you do with the future package. It is very useful to us!


HenrikBengtsson commented Feb 11, 2025

So, the background for having that error on "the size of globals would exceed 500 MB" is to protect users from sending ginormous objects over the internet by mistake. Since parallelization is so easy these days, it is not unlikely that someone would end up sending tens or hundreds of GBs over the network without knowing. The only thing they would notice is that it takes a very long time. They might also get a notification from a sysadmin, and worse, if they're on a metered internet connection, which some people still are, they could get a hefty bill.

> We have never seen this warning ...

So, that's one critique of this built-in protection. No matter what the threshold is, there will always be users operating just below the limit, until one day they work with some data that pushes them just over it, and boom, they get a surprising error. That sends them down a troubleshooting path, and, worse, it might happen at the very end of a long-running pipeline. So, it wastes time and effort. This is what happened to you (although here it was due to a change of the internals and not the data per se).

I've been thinking about this since day one, and I'm considering removing this protection, i.e. making `options(future.globals.maxSize = +Inf)` the default, and then documenting the risk of parallelizing out over the internet. It's quite likely I'll do this in the next release.

> ... I guess the more important issue/question is whether the memory reported by the new function is accurate.

... or, equivalently, whether the old objectSize algorithm is accurate :p Joking aside, estimating the size of an object in R is tricky. The old approach was very ad hoc and did not go too deep into nested objects - it simply ignored deeply nested elements. The new serializedSize approach definitely includes all components. So, I'd say that serializedSize gives a more accurate estimate of the object size.
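The difference is easiest to see with environments, which is also relevant here since R6 objects are environment-backed. A hypothetical illustration, assuming the parallelly package is available:

```r
# utils::object.size() does not follow environments, so an environment
# holding ~8 MB of data is reported as little more than a small header:
e <- new.env()
e$big <- rnorm(1e6)
utils::object.size(e)

# Serialization-based sizing includes everything reachable from the
# environment, so the ~8 MB payload is counted:
parallelly::serializedSize(e)
```

This is why nested R6 objects can appear to jump in size after the switch: the new estimator counts data the old one never saw.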

Thanks for reaching out and sharing your use case here.

@HenrikBengtsson (Collaborator)

> The process is not OOM killed.

Forgot to say, this part still puzzles me. If nothing else changed, and based on the conclusion in futureverse/parallelly#126 that serializedSize() has a constant, minimal memory footprint, I still don't understand why the OOM was triggered with serializedSize and not with objectSize. If anything, I'd expect the latter to consume a tad more memory, because it walks all the globals. Oh well, I'll probably put this one on the maybe-some-day list.

@HenrikBengtsson (Collaborator)

I'm closing, but if you, or someone else, run into this, please share here or create a new issue.
