-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#2092: Runtime optimizations stage 3 #2093
base: develop
Are you sure you want to change the base?
Conversation
Pipelines resultsPR tests (gcc-12, ubuntu, mpich) Build for 85f8124 (2023-06-13 15:16:40 UTC)
PR tests (gcc-9, ubuntu, mpich, zoltan, json schema test) Build for 85f8124 (2023-06-13 15:16:40 UTC)
PR tests (nvidia cuda 11.0, ubuntu, mpich) Build for f823b5c (2023-04-26 18:49:37 UTC)
PR tests (intel icpc, ubuntu, mpich) Build for 85f8124 (2023-06-13 15:16:40 UTC)
PR tests (gcc-8, ubuntu, mpich, address sanitizer) Build for 85f8124 (2023-06-13 15:16:40 UTC)
PR tests (gcc-11, ubuntu, mpich, trace runtime, coverage) Build for 85f8124 (2023-06-13 15:16:40 UTC)
PR tests (nvidia cuda 11.2, ubuntu, mpich) Build for fed8a98 (2023-05-08 20:01:18 UTC)
PR tests (gcc-10, ubuntu, openmpi, no LB) Build for 85f8124 (2023-06-13 15:16:40 UTC)
PR tests (clang-12, ubuntu, mpich) Build for 85f8124 (2023-06-13 15:16:40 UTC)
PR tests (clang-9, ubuntu, mpich) Build for 85f8124 (2023-06-13 15:16:40 UTC)
PR tests (clang-13, ubuntu, mpich) Build for 85f8124 (2023-06-13 15:16:40 UTC)
PR tests (clang-11, ubuntu, mpich) Build for 85f8124 (2023-06-13 15:16:40 UTC)
PR tests (clang-14, ubuntu, mpich) Build for 85f8124 (2023-06-13 15:16:40 UTC)
PR tests (clang-10, ubuntu, mpich) Build for 85f8124 (2023-06-13 15:16:40 UTC)
PR tests (nvidia cuda 11.2, gcc-9, ubuntu, mpich) Build for 85f8124 (2023-06-13 15:16:40 UTC)
|
dd1c401
to
222c937
Compare
Are the first two commits clearing up a first/second handler kind of situation? Why shouldn't a runnable be made and called there? |
More deeply, do we need to maintain some notion of 'system mode' runnables that do all of the relevant instrumentation for occasions when we want to trace and account for low-level activity? |
This is for the particular case where the collection message send is completely local. It goes through the location manager and then needs to call back into the collection manager. Creating a runnable in this case isn't particularly useful because the context it preserved on the stack (the runnable should not be enqueued). Once the collection's runnable is created, it preserves the context. If the send is off node, the location manager will create runnables for its handlers to forward, etc. Do you think this case is fine to optimize out? |
@@ -963,7 +963,8 @@ void ActiveMessenger::prepareActiveMsgToRun( | |||
if (is_obj) { | |||
objgroup::dispatchObjGroup(base, handler, from_node, cont); | |||
} else { | |||
runnable::makeRunnable(base, not is_term, handler, from_node) | |||
auto m = base; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is base
declared as const&
here, necessitating the additional reference count bump and then move of the copy? Can ownership transfer flow further up the stack?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lifflander @stmcgovern Can one of you please respond to Phil's comment about this?
#if vt_check_enabled(trace_enabled) | ||
auto const han_type = HandlerManager::getHandlerRegistryType(handler); | ||
if (han_type == auto_registry::RegistryTypeEnum::RegVrt or | ||
han_type == auto_registry::RegistryTypeEnum::RegGeneral or | ||
han_type == auto_registry::RegistryTypeEnum::RegObjGroup) { | ||
r->addContextTrace(msg, handler, from); | ||
r->addContextTrace(r->getMsg(), handler, from); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If r
has the message pointer as a member now, then this method can just refer to it internally, can't it? I feel like we may have discussed that at some point in the past, and there was a reason not to back then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lifflander @stmcgovern Can one of you please respond to Phil's comment about this?
Please squash the 'fix' commits into the commit they're fixing, for atomicity of evolution and review. |
This generally looks good to me, even if only part of the code base is transformed to avoid copies and captures, and there's more to be done later. |
That sounds good then. We don't need to separately account those two bits different bits of overhead, since we already have a clear context in which it's occurring. |
db26edd
to
f823b5c
Compare
Not ready yet, but took draft off to get more testing. |
1ee5caa
to
fed8a98
Compare
a3cd2b8
to
ef593be
Compare
ef593be
to
5cb0060
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good but there are some open questions.
@@ -963,7 +963,8 @@ void ActiveMessenger::prepareActiveMsgToRun( | |||
if (is_obj) { | |||
objgroup::dispatchObjGroup(base, handler, from_node, cont); | |||
} else { | |||
runnable::makeRunnable(base, not is_term, handler, from_node) | |||
auto m = base; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lifflander @stmcgovern Can one of you please respond to Phil's comment about this?
#if vt_check_enabled(trace_enabled) | ||
auto const han_type = HandlerManager::getHandlerRegistryType(handler); | ||
if (han_type == auto_registry::RegistryTypeEnum::RegVrt or | ||
han_type == auto_registry::RegistryTypeEnum::RegGeneral or | ||
han_type == auto_registry::RegistryTypeEnum::RegObjGroup) { | ||
r->addContextTrace(msg, handler, from); | ||
r->addContextTrace(r->getMsg(), handler, from); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lifflander @stmcgovern Can one of you please respond to Phil's comment about this?
auto base_msg = user_msg.template to<BaseMsgType>(); | ||
return messaging::PendingSend(base_msg, [=](MsgPtr<BaseMsgType> in) { | ||
return messaging::PendingSend(std::move(base_msg), [=](MsgPtr<BaseMsgType>&& in) mutable { | ||
bool const is_obj = HandlerManager::isHandlerObjGroup(typed_handler); | ||
if (is_obj) { | ||
objgroup::dispatchObjGroup( | ||
user_msg.template to<BaseMsgType>(), typed_handler, node, nullptr | ||
); | ||
} else { | ||
runnable::makeRunnable(user_msg, true, typed_handler, node) | ||
runnable::makeRunnable(std::move(user_msg), true, typed_handler, node) | ||
.withTDEpochFromMsg() | ||
.enqueue(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this lambda keep using user_msg
(including to cast it to BaseMsgType
again) instead of using in
?
@@ -727,15 +730,17 @@ void EntityLocationCoord<EntityID>::routePreparedMsg( | |||
|
|||
if (use_eager) { | |||
theMsg()->pushEpoch(epoch); | |||
routeMsgEager<MessageT>(msg->getEntity(), msg->getHomeNode(), msg); | |||
routeMsgEager<MessageT>(msg->getEntity(), msg->getHomeNode(), std::move(msg)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to extract the entity and home node before this call in case the move is evaluated first?
theMsg()->pushEpoch(epoch); | ||
routeMsgNode<MessageT>( | ||
msg->getEntity(), msg->getHomeNode(), node, msg | ||
m->getEntity(), m->getHomeNode(), node, std::move(m) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to extract the entity and home node before this call in case the move is evaluated first?
82a360e
to
85f8124
Compare
This demonstrates performance improvement in the following performance test From Develop @ 6109deb
From this branch 2092
|
Fixes #2092