[nvidia] Capture more nvidia commands #3777
Congratulations! One of the builds has completed. 🍾 You can install the built RPMs by following these steps:
Please note that the RPMs should be used only in a testing environment.
Sounds good to me. I restarted the tests that failed due to timeouts.
sos/report/plugins/nvidia.py
self.add_service_status("nvidia-persistenced")
self.add_service_status("nvidia-fabricmanager")
self.add_service_status("nvidia-toolkit-firstboot")
Curious: would it be good to add the journal for all of these too, e.g. for when someone wants to enable only the nvidia plugin and nothing else?
Good idea. I'll add the journals as well.
If these are added to the services tuple, it can be used for plugin enablement as well as for automatically getting the journal and service status.
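A minimal sketch of the suggestion above, using a stand-in class rather than the real sos Plugin base class. The SketchPlugin class, the command strings it records, and the add_default_collections loop are assumptions for illustration; they only approximate how a services tuple could drive both the service status and the journal collection.

```python
# Stand-in for illustration only; this is NOT the real sos Plugin class.
class SketchPlugin:
    # Subclasses list the services they care about here.
    services = ()

    def __init__(self):
        # Commands the plugin would ask to have collected.
        self.collected = []

    def add_service_status(self, service):
        self.collected.append(f"systemctl status {service}")

    def add_journal(self, units=None):
        cmd = "journalctl --no-pager"
        if units:
            cmd += f" --unit {units}"
        self.collected.append(cmd)

    def add_default_collections(self):
        # Assumed behaviour: every listed service gets status + journal.
        for service in self.services:
            self.add_service_status(service)
            self.add_journal(units=service)


class NvidiaSketch(SketchPlugin):
    services = (
        "nvidia-persistenced",
        "nvidia-fabricmanager",
        "nvidia-toolkit-firstboot",
    )


plugin = NvidiaSketch()
plugin.add_default_collections()
for command in plugin.collected:
    print(command)
```

With the three services listed once in the tuple, the per-service add_service_status() calls in setup() become unnecessary in this sketch, since the loop covers them.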
Good idea. Shall I add the three services to the plugin enablement? We currently capture the nvidia-persistenced journal a bit differently:
self.add_journal(boot=0, identifier='nvidia-persistenced')
But a quick test I did in a RHEL AI image showed that the nvidia-persistenced service is captured correctly. My only doubt is the 'boot=0' option.
5ca4480 to 7296934
lgtm
@@ -44,4 +53,5 @@ def setup(self):
)
self.add_journal(boot=0, identifier='nvidia-persistenced')

Nit: do we need the extra line here? Not overly precious about it, though.
Good point, I think we can remove it.
My only doubt then is what to do with the add_journal() call before it. The service is now in the services tuple, but the boot=0 makes me doubt whether it's safe to remove it. Any thoughts on this?
Currently, this will result in duplicate journal collections for nvidia-persistenced: once for the whole journal and then again for only the current boot. Since that service is now in the services tuple, we automatically get the journal.
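To illustrate the duplication described above, here is a small self-contained sketch. The add_journal function below is a hypothetical recorder, not the sos method, and matching a unit name against a syslog identifier by plain string equality is an assumption made for this example.

```python
from collections import Counter

# Services assumed to be listed in the plugin's services tuple.
services = (
    "nvidia-persistenced",
    "nvidia-fabricmanager",
    "nvidia-toolkit-firstboot",
)

# Hypothetical recorder standing in for journal collection requests.
journal_requests = []


def add_journal(units=None, boot=None, identifier=None):
    journal_requests.append(
        {"units": units, "boot": boot, "identifier": identifier}
    )


# Assumed automatic collection: one whole-journal request per service.
for service in services:
    add_journal(units=service)

# The explicit call still present in setup(): current boot only.
add_journal(boot=0, identifier="nvidia-persistenced")

# Count requests per service name, whether given as unit or identifier.
counts = Counter(r["units"] or r["identifier"] for r in journal_requests)
duplicated = sorted(name for name, n in counts.items() if n > 1)
print(duplicated)  # ['nvidia-persistenced']
```

Dropping the explicit add_journal() call leaves exactly one journal request per service in this sketch.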
Thank you for clarifying, Jake. I'll remove the extra capture and push again.
7296934 to db02bec
Capture commands related to nvidia container toolkit.

Related: RHEL-58172
Signed-off-by: Jose Castillo <[email protected]>
db02bec to 7e0a1fe
Capture commands related to nvidia container toolkit.
Related: RHEL-58172
Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines