xmlrpc.XmlRpcNode consumes 100%+ cpu after being shutdown #2238
Comments
I've noticed the same issue on ROS Melodic/Ubuntu 18, but only with Python 3.6.9. It does not occur with Python 2.7.
Same for me. Unusable at the moment. We want to switch between two driving modes that use different sets of nodes, and we want to avoid running all of them when it is not necessary. For that we have a launch_handler with a service that starts the corresponding launch file. Each time one is killed, a new instance of my handler appears in my process list and generates 100% CPU load. Every on/off cycle adds another one.
@TobiMiller @fgrcar Here is how I worked around the problem:
Let me know if it works for you.
@drjsmith Thanks so much for posting what helped for you! I'm surprised this hasn't been addressed sooner, it's really a pain. I made a Qt user interface that uses the roslaunch API to launch and shut down the nodes of our stack. I'm seeing the same CPU problem you mentioned when I shut down nodes, which makes our Qt interface extremely slow and impossible to use while nodes are being shut down. Your proposal seems to work for me so far though! Based on it, I wrote a quick class that the Qt interface calls to launch and shut down nodes:
To launch a node from Qt:
and the calls are made like this:
Thank you so much for this, it really helped! I do hope they address the issue and find a proper fix so we can use the API just like on previous ROS versions.
@RodolpheCyber You're welcome!
Fixing the issue with 100% CPU usage after the shutdown ros#2238
Is this issue still present on melodic?
@adipotnis I discovered the problem when upgrading from kinetic to noetic, so I haven't personally tested it on melodic. @fgrcar reported encountering the problem on melodic but only when using python 3. I wouldn't be surprised if the root bug has been present for many versions and has only been exposed due to the transition to python 3, though I haven't verified this.
stop CPU usage spike when a rospy node receives a signals.SIGINT interrupt but before it terminates the process (e.g. pressing CTRL+C while the node is sleeping due to a rospy.sleep() call no longer sends CPU usage of a core to 100%) should also fix ros#2238
TLDR: calling shutdown() on an XmlRpcNode does not fully shut it down and leaves a thread running.
I am currently on Noetic/Ubuntu 20.04/python 3.8.
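The behaviour summarised in the TLDR can be explored with the standard library alone, without ROS. The sketch below is an illustration under stated assumptions, not code from rosgraph: it uses a plain SimpleXMLRPCServer in place of XmlRpcNode's ThreadingXMLRPCServer, and the function name demo() and the 0.3 s pauses are made up for the example. On a Linux/python 3 setup like the one described in this issue, the first returned value is expected to be True (the serving thread survives server_close()); other platforms may behave differently, so no specific output is claimed here.

```python
import threading
import time
from xmlrpc.server import SimpleXMLRPCServer


def demo():
    """Return (alive_after_close, alive_after_shutdown) for the serving thread."""
    # Port 0 asks the OS for any free port; logRequests=False keeps output quiet.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    time.sleep(0.3)  # give serve_forever() time to enter its loop

    # Mimic the shutdown path described in the issue: close the socket only.
    server.server_close()
    time.sleep(0.3)
    alive_after_close = thread.is_alive()

    # shutdown() sets the flag that the serve_forever() loop checks,
    # then blocks until the loop has actually exited.
    server.shutdown()
    thread.join(timeout=5)
    alive_after_shutdown = thread.is_alive()
    return alive_after_close, alive_after_shutdown


if __name__ == "__main__":
    print(demo())
```

Note that BaseServer.shutdown() blocks until serve_forever() returns, so it has to be called from a different thread than the one running the loop. In new code it is probably cleaner to call shutdown() before server_close(), so the loop never polls an already-closed socket, but that ordering is an assumption and has not been checked against ros_comm.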
What is going on:
start() results in a new thread running self.server.serve_forever(), where server is of type ThreadingXMLRPCServer, which has socketserver.BaseServer as a base class.
The serve_forever() loop in BaseServer (socketserver.py:215) will not exit until the loop's condition is satisfied: while not self.__shutdown_request:
self.__shutdown_request can be set by calling shutdown() on BaseServer.
XmlRpcNode's shutdown() method never calls self.server.shutdown(), only server.socket.close() and server.server_close(). I'm pretty sure those last 2 functions do the same thing, since:
The thread is then stuck in an infinite loop. Before shutting down XmlRpcNode, the thread does not use a significant amount of cpu; afterwards, however, the thread consumes 100% of a single core (as shown by top).
It seems to me that the solution may be as simple as adding self.server.shutdown() to XmlRpcNode's shutdown() function, though I'm not certain whether it should be before or after closing the socket or if there are other considerations.
Why this matters to me:
I'm guessing this isn't a common problem, since generally the process would be ending shortly afterwards. However, I am using roslaunch.parent.ROSLaunchParent to launch and shutdown nodes as part of a python-based robot navigation benchmarking codebase. I wrote the code and used it extensively on Kinetic/Ubuntu 16.04/python 2.7 and did not encounter this issue at that time: starting and stopping hundreds of instances of roslaunch.parent.ROSLaunchParent did not cause any noticeable issues. As soon as I switched to Noetic/Ubuntu 20.04/python 3.8, I noticed that after running a few experiments the python process was at times using ~166% cpu. Usage hits 100% if there is only 1 'stopped' instance; ~166% if there are 2 or more.
The cpu usage of the process drops very low as soon as I start a new instance; I'm not familiar enough with the Selector-related code to understand why this is so.
Minimum example:
The script starts a roscore and then repeatedly creates, starts, and stops roslaunch.parent.ROSLaunchParent instances; run top to see how the process' cpu usage changes.