Sanity check for rebound #20
The traceback is
|
The failing function call in
putting a print statement and a sleep in the writing function gives (serial cpu btw):
so it's failing when writing |
And the full backtrace from
@Yurlungur you were recently in this file. Do you have any idea if the changes you made could have caused something like this? It's like some pointer got corrupted, or something. |
It looks like one of your params is not being properly output as an attribute. Do you know which variable it is failing on? In
template <typename T>
void Params::WriteToHDF5AllParamsOfType(const std::string &prefix,
                                        const HDF5::H5G &group) const {
  for (const auto &p : myParams_) {
    const auto &key = p.first;
    const auto type = myTypes_.at(key);
    if (type == std::type_index(typeid(T))) {
      auto typed_ptr = dynamic_cast<Params::object_t<T> *>((p.second).get());
      // Debug print: the last key printed before the segfault is the suspect.
      std::cout << "Writing param " << key << std::endl;
      HDF5::HDF5WriteAttribute(prefix + "/" + key, *typed_ptr->pValue, group);
    }
  }
}
this will produce a lot of output, but whatever param it prints right before the segfault should tell us which param is causing problems. (Unless, of course, the out-of-bounds memory access happened somewhere completely different and it just segfaults here because we're lucky, which is possible.) |
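For readers without the codebase at hand, the debugging idea above can be tried on a small stand-in. The sketch below is not Parthenon's `Params` class: `MiniParams`, `Add`, and `VisitAllOfType` are hypothetical names invented for illustration. It logs each key before touching the value (`std::endl` flushes, so the message survives a crash on the next line) and also checks the `dynamic_cast` result, so a type mismatch shows up as a message rather than a null dereference.

```cpp
#include <iostream>
#include <map>
#include <memory>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical, simplified type-erased parameter store (illustration only).
class MiniParams {
 public:
  template <typename T>
  void Add(const std::string &key, T value) {
    params_[key] = std::make_unique<Object<T>>(std::move(value));
    types_.insert_or_assign(key, std::type_index(typeid(T)));
  }

  // Visits every param whose recorded type is T and returns the visited keys
  // in map (alphabetical) order. Each key is logged before its value is used.
  template <typename T>
  std::vector<std::string> VisitAllOfType() const {
    std::vector<std::string> visited;
    for (const auto &p : params_) {
      const auto &key = p.first;
      if (types_.at(key) != std::type_index(typeid(T))) continue;
      std::cerr << "Writing param " << key << std::endl;  // flushed immediately
      auto *typed = dynamic_cast<const Object<T> *>(p.second.get());
      if (typed == nullptr) {
        // Recorded type and stored object disagree: report instead of crashing.
        std::cerr << "  type mismatch for " << key << std::endl;
        continue;
      }
      (void)typed->value;  // a real writer would serialize typed->value here
      visited.push_back(key);
    }
    return visited;
  }

 private:
  struct Base {
    virtual ~Base() = default;
  };
  template <typename T>
  struct Object : Base {
    explicit Object(T v) : value(std::move(v)) {}
    T value;
  };
  std::map<std::string, std::unique_ptr<Base>> params_;
  std::map<std::string, std::type_index> types_;
};
```

Calling `VisitAllOfType<int>()` on a store holding two `int` params and one `double` param visits only the two `int` keys. As noted above, a log like this only narrows things down if the corruption actually happens in the writer.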
it fails pretty reliably when writing |
Sorry somehow missed your reply... is |
Yeah, just a simple |
That's surprising... I don't see how the parthenon HDF5 machinery could cause that... I wonder if there's an out-of-bounds memory access elsewhere that's causing this? Might be worth doing an address sanitizer sweep. |
ASAN did find two things. One was a divide by zero and one was a use after free which I think was the cause of this problem. |
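The thread does not show the offending code, so the following is only a generic sketch of the use-after-free pattern and one common way to avoid it. `Particle`, `Simulation`, and `MakeSimulation` are hypothetical names, not REBOUND or Parthenon types. With GCC or Clang, AddressSanitizer is typically enabled by compiling and linking with `-fsanitize=address -g -fno-omit-frame-pointer`.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical particle record (illustration only).
struct Particle {
  double x, y, z;
};

// Hypothetical consumer that keeps reading the particles after construction.
class Simulation {
 public:
  // Buggy pattern, shown only as a comment: keeping a raw pointer lets the
  // caller free the buffer while the simulation still reads it, which is
  // what ASAN reports as heap-use-after-free.
  //   explicit Simulation(const std::vector<Particle> *p) : raw_(p) {}

  // Safer pattern: share ownership so the allocation outlives every user.
  explicit Simulation(std::shared_ptr<const std::vector<Particle>> p)
      : particles_(std::move(p)) {}

  double SumX() const {
    double s = 0.0;
    for (const auto &q : *particles_) s += q.x;
    return s;
  }

 private:
  std::shared_ptr<const std::vector<Particle>> particles_;
};

// The local handle goes out of scope on return, but the buffer stays alive
// because the returned Simulation still holds a reference to it.
Simulation MakeSimulation() {
  auto particles = std::make_shared<std::vector<Particle>>();
  particles->push_back({1.0, 0.0, 0.0});
  particles->push_back({2.0, 0.0, 0.0});
  return Simulation(particles);
}
```

Shared ownership is one option among several. Copying the data into the consumer, or making the owner's lifetime explicitly longer than the consumer's, achieves the same guarantee.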
Huh, I wonder why I didn't see those. What ASAN configure options/compiler/platform did you use, and what problem did you run? |
I was running |
looks like that fixed the problem with the serial build. There's still the issue of the parallel builds and the rebound params disagreeing across ranks |
I wonder if this params disagreement issue is always going to be present when we're storing pointers to structs (like |
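One way to see why stored pointers might always disagree: if the cross-rank check compares the raw pointer value (an assumption, since the thread does not show how the check works), two ranks holding identical structs would still differ, because each allocation has its own address. The sketch below uses a hypothetical `Settings` struct and helper names to contrast the two comparisons within a single process.

```cpp
#include <memory>

// Hypothetical settings struct standing in for whatever the param points to.
struct Settings {
  int n_particles;
  double dt;
};

// Compares the stored addresses. Two distinct live allocations never match,
// even when they hold the same data.
inline bool SameAddress(const Settings *a, const Settings *b) { return a == b; }

// Compares the pointed-to contents field by field. This is the comparison
// that is meaningful across separate allocations (or separate ranks).
inline bool SameContents(const Settings &a, const Settings &b) {
  return a.n_particles == b.n_particles && a.dt == b.dt;
}
```

For two separately allocated `Settings{8, 0.1}` objects, `SameAddress` is false and `SameContents` is true, which matches the suspicion that an address-based check would flag a disagreement regardless of the data.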
Nice catch! LGTM
Glad this is fixed! |
Checks that the particle allocation persists across the interface when we create the rebound simulation.