Mesh has a compile-time fixed max arena size of 64 GB #37
Hi @asl! I think there are ~3 things going on here. First, there appears to be a good amount of memory Mesh can reclaim on SPAdes (over a GB), neat! Second, you're right; we're hitting limits around:

```cpp
// if we have, e.g. a kernel-imposed max_map_count of 2^16 (65k) we
// can only safely have about 30k meshes before we are at risk of
// hitting the max_map_count limit.
static constexpr double kMeshesPerMap = .457;
```

BUT, we only check that at the start of. Third, we allocate our (sparse) arena on program startup, which gives us a lot of simplicity. Earlier in development, the Ubuntu system I was on had trouble coredumping the arena; it seemed to insist on filling the (mostly empty) virtual mapping of the arena with zeros on a crash. Before Friday, the arena size was 8 GB, which is too small for your ~30 GB working set. I've increased the arena to 64 GB in the latest commit.
How can I see it? Is it this "Meshed MB HWM" value?
Oh, well... this does not smell good :) SPAdes uses (file) memory maps here and there, though typically it's something around 10 × #threads, so it should be below 1000 for almost any sane system.
I believe it's quite important not to have a hard-coded arena size. We could easily utilize, say, 1 TB of RAM in hard cases ;) The actual working set should be around ~60 GB for this particular dataset, IIRC. Sadly, right now Mesh just fails to allocate anything and therefore throws std::bad_alloc.
More information about that std::bad_alloc: it seems Mesh failed to fulfill a request to allocate 28 GB as one piece. And indeed, in
We're seeing that Mesh is unable to allocate more than 2 GB of memory in a single chunk:
Really? :) I opened #40 to track this issue.
hah, yeah... Thanks for the separate tracking issue :)
And agreed on not requiring a fixed max; it's just that having a single range of virtual address space greatly simplifies the implementation. I know Go does (or used to do) something similar. This LWN article seems to describe this exact problem: https://lwn.net/Articles/428100/
Hey there. Are there any future plans for fixing this issue? This project has great potential, and I can see that a lot of effort has been put into solving this "fragmentation" nightmare we all battle against in long-running server jobs. Unfortunately, this issue seems like a show stopper that keeps one from using this library. Could you also explain why the arena size can't be set to "infinity" (2^64)? I mean, why is a constraint on the size even needed? I'm also not sure I understand why Ubuntu tries to dump memory that was never malloc'd, or that was freed.
bump @bpowers |
@brano543 can you describe what actual issues you are running into? The max heap size is now 64 GB, "which ought to be enough for anybody". Please let us know if you are running into issues with this in practice and we can prioritize working on it, but please don't avoid Mesh because of perceived limitations. There are two main reasons we didn't set it larger from the get-go. The first is that some tools (like the crash-reporting software on Ubuntu) choked on very large virtual memory mappings. The behavior we were seeing was: Mesh would allocate 64 GB of virtual address space; a program would allocate a few hundred MB and then crash; the core dump parser wasn't smart enough to understand that 63.5 GB of that virtual address space was never allocated or backed by real RAM, and would try to create a core dump file, filled with 63.5 GB worth of zeros, for sending to Ubuntu. The second reason is that we have some ancillary data structures we allocate (like lookup tables) whose sizes depend on the size of the arena. I think this is a smaller issue, as they will "just" use up some extra virtual address space.
Thanks for the clarification. This effectively rules out Mesh for SPAdes, as we routinely allocate more than 64 GB of RAM. No other memory allocator we are aware of has such a limitation.
How much memory do you allocate? I feel like this is something that could be made a build-time parameter. |
As much as necessary. It could be 0.5 TB, it could be 1 TB. It depends on the input.
And to be clear, you mean that the actual physical footprint of the app in RAM is ~1 TB, correct?
It might be 100 MB, it might be 4 GB, it might consume 1 TB. Everything depends on the input.
@asl if you increase this constant here: https://github.com/plasma-umass/Mesh/blob/master/src/common.h#L104 from 64 to 2000, that should bump the max heap up to 2 TB. I would be eager to hear how this works for you! If things seem to work fine, I can do some testing on some much smaller systems, and see about making that the default. |
I'll also talk to @emeryberger - my intuition is that having a single, non-growable heap makes parts of the implementation significantly easier, but maybe I'm overthinking it. |
Well, for us we'd need something like a run-time constant then, e.g. letting the user specify the maximum amount of memory they could use.
I also ran into this:
The system has 128 GB of RAM and the application uses more or less 128 GB of RAM for the given input. I have three questions:
For a few projects (actually, implementations of programming languages for HPC, etc.) I wanted to propose using Mesh. But any such compile-time limitation puts me off. It's simply impossible to use: many ordinary desktop systems have more than 64 GB of RAM nowadays, ordinary servers typically have several terabytes, and special systems tens or even small hundreds of terabytes. Are there any plans to remove this limitation, or at least make it a run-time setting?
@dumblob The project seems to be abandoned. We (SPAdes) are using mimalloc now. |
@asl aha, ok. That's a pity anyway. Btw. actually I wanted to test Mesh against mimalloc and I'm pleased to hear mimalloc serves your purpose well (my experience is also basically only very positive with mimalloc). |
The lead PhD student on this project, @bpowers , has moved to industry and recently had a child, so he has been otherwise quite occupied :). In any event, this particular issue got lost; sorry about that. |
I see - then I wish all the best to @bpowers et al. Should this project get resurrected at some point, I'll try to keep an eye on it. |
Hello
I'm trying to benchmark SPAdes (https://github.com/ablab/spades) with Mesh. Currently SPAdes runs fine with both tcmalloc and jemalloc (and embeds jemalloc for the sake of completeness). On a reasonably small dataset (with memory consumption around ~30 GB) I'm seeing:
Some quick debugging revealed that Mesh tried to mprotect lots of pages of 4 KB each. As a result, mprotect() at some point returns ENOMEM. On my system I have:
If I increase vm.max_map_count to 655300 (I'm lucky to have sudo access; the majority of users don't), then the assertion goes away and just std::bad_alloc is thrown. Here is the MALLOCSTATS=1 report, just in case:
But to me it looks like there is a huge design flaw somewhere, as the number of memory mappings is a limited resource and one simply cannot mmap / mprotect each page individually.