Skip to content

[WIP] Preload disposables #660

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

ben-grande
Copy link
Contributor

For: QubesOS/qubes-issues#1512


Untested and incomplete.


:return:
"""
dispvm_template = self.arg
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.dest?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But also, who will call it? If something from inside qubesd, then it doesn't need to be an "API" method, just a function somewhere, probably next to the dispvm class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.dest?

WIll fix.

But also, who will call it? If something from inside qubesd, then it doesn't need to be an "API" method, just a function somewhere, probably next to the dispvm class.

Haven't determined yet who will call it and not sure it will be something inside qubesd, all I gethered so far is that it needs to run after the autostarted qubes starts, which I presume can be easily monitored.

dispvm_template = self.arg
preload_dispvm = dispvm_template.features.get("preload-dispvm", None)
## TODO: should this function receive disposable name as argument
## or call admin.vm.CreateDisposable?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO makes more sense to create here.

There should be also somewhere a counter how many preloaded disposables should be and if that limit (which may be 0) is reached already, do nothing here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix.


:return:
"""
used_dispvm = self.arg
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.dest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix.

Comment on lines 414 to 418
preload_dispvm = preload_dispvm.split(" ")
if use_first:
used_dispvm_name = preload_dispvm[1:]
else:
used_dispvm_name = used_dispvm.name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also check if the selected dispvm is actually one of preloaded ones.
But also, what is the use case of pointing at specific DispVM, instead of its template?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also check if the selected dispvm is actually one of preloaded ones.

Will fix.

But also, what is the use case of pointing at specific DispVM, instead of its template?

In case a specific preloaded dispvm is unpaused.

@qubes.api.method(
"internal.dispvm.preload.Use", scope="local", write=True
)
async def dispvm_preload_use(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be an utility function somewhere, not an API method exposed to external callers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also moved internal.dispvm.preload.Create as a utility function instead of API method.

Comment on lines 210 to 312
if self.is_domain_preloaded():
self.pause()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is easy thing to do, but requires testing if it isn't too early. Some services (especially gui-agent) may still be starting.
There is an qubes.WaitForSession service that may help (ensure to use async handler to not block qubesd while waiting on it).
And also, the event is called domain-start.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an qubes.WaitForSession service that may help (ensure to use async handler to not block qubesd while waiting on it).

What is preloaded dispvm has gui feature disabled in case the user is using the disposable to be qubes-builder-dvm for example, that doesn't require a GUI? What can be done in this case?

See also QubesOS/qubes-issues#9789 related to gui feature and +WaitForSession.

And also, the event is called domain-start.

Right... My mind was on domain-unpaused therefore domain-started.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is preloaded dispvm has gui feature disabled in case the user is using the disposable to be qubes-builder-dvm for example, that doesn't require a GUI? What can be done in this case?

Very good question. In that case I guess start even is all we have. Maybe with some (configurable?) delay?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a delay for both enabled gui and disabled gui. There is attribute/prefs qrexec_timeout, I used a delay lower than that, not sure it should be configurable as I don't know the benefits. The preloaded DispVM is paused when the timeout is reached.

def on_domain_unpaused(self):
"""Mark unpaused preloaded domains as used."""
if self.is_domain_preloaded():
self.app.qubesd_call(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't exist in core-admin - it's only in core-admin-client. Here, you are running inside qubesd already, you can simply call appropriate function directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix.

)
threshold = 1024 * 5
if memory >= (available_memory - threshold):
await DispVM.create_preloaded_dispvm(self)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong, should have called QubesAdminAPI.create_disposable with preload argument.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or rather DispVM.from_appvm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is tiny bit of code that I didn't want to duplicate from create_disposable():

        dispvm = await qubes.vm.dispvm.DispVM.from_appvm(
            dispvm_template, preload=preload
        )
        # TODO: move this to extension (in race-free fashion, better than here)
        dispvm.tags.add("created-by-" + str(self.src))
        dispvm.tags.add("disp-created-by-" + str(self.src))

The tags... although I see some tests not using it:

% rg from_appvm ../qubes-core-*
../qubes-core-admin/qubes/tests/vm/dispvm.py
90:    def test_000_from_appvm(self, mock_storage, mock_makedirs, mock_symlink):
104:                qubes.vm.dispvm.DispVM.from_appvm(self.appvm)
117:    def test_001_from_appvm_reject_not_allowed(self):
120:                qubes.vm.dispvm.DispVM.from_appvm(self.appvm)

../qubes-core-admin/qubes/tests/integ/dispvm.py
211:            qubes.vm.dispvm.DispVM.from_appvm(self.disp_base)
229:            qubes.vm.dispvm.DispVM.from_appvm(self.disp_base)

../qubes-core-admin/qubes/vm/dispvm.py
390:    async def from_appvm(cls, appvm, preload=False, **kwargs):
401:        >>> dispvm = qubes.vm.dispvm.DispVM.from_appvm(appvm).start()

../qubes-core-admin/qubes/api/admin.py
1291:        dispvm = await qubes.vm.dispvm.DispVM.from_appvm(

../qubes-core-admin-client/qubesadmin/tools/qvm_run.py
277:        dispvm = qubesadmin.vm.DispVM.from_appvm(args.app,

../qubes-core-admin-client/qubesadmin/tests/vm/dispvm.py
36:        vm = qubesadmin.vm.DispVM.from_appvm(self.app, None)
55:        vm = qubesadmin.vm.DispVM.from_appvm(self.app, 'test-vm')
66:        vm = qubesadmin.vm.DispVM.from_appvm(self.app, None)
72:        vm = qubesadmin.vm.DispVM.from_appvm(self.app, None)
82:        vm = qubesadmin.vm.DispVM.from_appvm(self.app, 'test-vm')
92:        vm = qubesadmin.vm.DispVM.from_appvm(self.app, None)

../qubes-core-admin-client/qubesadmin/vm/__init__.py
451:    def from_appvm(cls, app, appvm):

And you are correct, using create in the name of something that is not creating something, but marking, gave me the impression it would create the qube...

## before starting a qube?
memory = getattr(self, "memory", 0)
available_memory = (
psutil.virtual_memory().available / (1024 * 1024)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get what you mean, but this will not work. This looks only at free memory in dom0, not the whole system. And even if it would look more globally, qmemman tries to allocate available memory as much as possible. Only qmemman knows how much "free" memory you really have, and currently there is no API to query that...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will leave this for last, as it touches another component (qmmeman).

Copy link
Member

@marmarek marmarek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the overall structure is getting close, but there are still several details. And tests missing, but this you know.

preload_dispvm_max = int(
self.features.check_with_template("preload-dispvm-max", 0)
)
if preload_dispvm_max == 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is too late, no? I mean, the VM is created at this point already.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check probably should be in from_appvm when preload=True. And also similar check for >0, if there aren't enough preloaded dispvms already (and if there are, probably raise an exception, not just return).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is too late, no?

Yes, moved to from_appvm.

And also similar check for >0, if there aren't enough preloaded dispvms already

This case will be handled as an event instead of API method. On the next push, let's see what you think about it.

(and if there are, probably raise an exception, not just return).

Done.

dispvm_template = getattr(self, "template")
dispvm = self
elif (
self.klass == "AppVM"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is defined in a DispVM class, the klass is always DispVM here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing appvm clause.

self.klass == "AppVM"
and getattr(self, "template_for_dispvms", False)
):
dispvm_template = dispvm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides condition always being false, dispvm is not set anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by removing appvm clause.


preload_dispvm = dispvm_template.features.get("preload-dispvm", None)
if not preload_dispvm:
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks wrong - this shouldn't happen, as use_preloaded_dispvm should only be called on a preloaded dispvm, at this point there should be something in this feature. This should at least be a warning, if not even an exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed as everything that calls use_preloaded is checking if it is empty or even if a specific disposable is in it.

if dispvm.name not in preload_dispvm:
raise qubes.exc.QubesException("DispVM is not preloaded")
used_dispvm = dispvm.name
else:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above about the class

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by removing clause.

dispvm.features["internal"] = False
## TODO: confirm if this guess of event code is correct
self.app.fire_event_for_permission(dispvm_template=dispvm_template)
dispvm_template.fire_event_async("domain-preloaded-dispvm-used")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_async needs await. And also, it may be useful to add dispvm (name? object?) as an event parameter.
And please document the event (there is specific sphinx syntax for events, see others for example).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

)
threshold = 1024 * 5
if memory >= (available_memory - threshold):
await DispVM.create_preloaded_dispvm(self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or rather DispVM.from_appvm

async def on_domain_unpaused(self):
"""Mark unpaused preloaded domains as used."""
if self.is_domain_preloaded():
await DispVM.use_preloaded_dispvm(self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.use_preloaded_dispvm()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

dispvm = app.add_new_vm(
cls, template=appvm, auto_cleanup=True, **kwargs
)
await dispvm.create_on_disk()

if preload:
await DispVM.create_preloaded_dispvm(dispvm)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dispvm.create_preloaded_dispvm()
But also, I think this function should be named like mark_preloaded() or such, as it doesn't really create anything, it's called when the VM is already created.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

if preload_dispvm:
dispvm = preload_dispvm.split(" ")[0]
dispvm = app.domains[dispvm]
await DispVM.use_preloaded_dispvm(dispvm)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dispvm.use_preloaded_dispvm()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@ben-grande ben-grande force-pushed the preload-dispvm branch 3 times, most recently from 596c882 to a76ed3e Compare February 27, 2025 18:11
@@ -0,0 +1,17 @@
[Unit]
Description=Preload Qubes DispVMs
[email protected]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure it will work this way? I don't think it will cover all instances of the qubes-vm@ units...
But maybe adding Before=qubes-preload-dispvm.service to [email protected] file will work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed my option failed. I did test without rebooting a qube enabling on runtime, not a valid test.
I did not want to touch qubes-vm@ but ordering after a template seems undocumented. Tested the way you said and it is best.


[Service]
Type=oneshot
Environment=DISPLAY=:0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is display really needed here? Xorg may not be running yet at this stage (and even when it is, user may not be logged in yet).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied from [email protected], if it is not needed here, it is not needed there also. Maybe it is needed for the gui agent to be able to create the sys-net Network Manager tray icon for example

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested the following:

DISPLAY= qvm-start QUBE
qvm-run -- QUBE xterm

Also tested the service with

Environment=DISPLAY=
ExecStart=/usr/bin/qvm-start %i

And both worked.

It opened the terminal. It seems to not be needed. But in 2013 it was?

commit 59b9e43060c3b9bd717e9cc59979153f86f1b9b7 (tag: mm_59b9e430)
Author: Marek Marczykowski-Górecki <[email protected]>
Date:   Tue Nov 26 16:53:26 2013

    Fix VM autostart - set $DISPLAY env variable

    Without this, started qrexec-daemon would not have access to GUI,
    especially can't display Qubes RPC confirmation dialogs.

diff --git a/linux/systemd/[email protected] b/linux/systemd/[email protected]
index 1770d6d9..ffa61763 100644
--- a/linux/systemd/[email protected]
+++ b/linux/systemd/[email protected]
@@ -4,6 +4,7 @@ After=qubes-netvm.service

 [Service]
 Type=oneshot
+Environment=DISPLAY=:0
 ExecStart=/usr/bin/qvm-start --no-guid %i
 Group=qubes
 RemainAfterExit=yes

There is no option --no-guid anymore. For qvm_run.py, the default is to use args.gui if DISPLAY is set. I have found nothing special on qvm_start.py

Seems "safe" to remove from both services.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, things have changed since 2013, now qvm-start only sends an API call to qubesd, it doesn't start those services as its own children anymore.

b"admin.vm.CreateDisposable", b"dom0", arg="preload-autostart"
)
# TODO: doesn't return any value, so how to check if it was preloaded?
#dispvm_preload = self.vm.features.get("preload-dispvm", "").split(" ")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check if there is (exactly) one entry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:) Yes... will fix.

Comment on lines 193 to 208
def get_feat_preload(self, feature):
if feature not in ["preload-dispvm", "preload-dispvm-max"]:
raise qubes.exc.QubesException("Invalid feature provided")

if feature == "preload-dispvm":
default = ""
elif feature == "preload-dispvm-max":
default = 0

value = self.features.check_with_template(feature, default)

if feature == "preload-dispvm":
return value.split(" ")
if feature == "preload-dispvm-max":
return int(value)
return None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to have two functions instead of the conditions in one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Is shorter, better.

@qubes.events.handler(
"domain-preloaded-dispvm-used", "domain-preloaded-dispvm-autostart"
)
async def on_domain_preloaded_dispvm_used(self, event, delay=5, **kwargs): # pylint: disable=unused-argument
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The event is fired on disposable template, so it should be attached there, not to DispVM class. See DVMTemplateMixin.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix.

:returns:
"""
await asyncio.sleep(delay)
while True:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a check for preload-dispvm-max early. Right now, it looks like it will preload always at least one dispvm even if preload-dispvm-max is 0. I know preload-dispvm has a check for that, but it's still open for races, plus admin.vm.CreateDisposable method can be called by others too (it's a public API, for those with appropriate permission).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Comment on lines 282 to 286
# TODO: what to do if the maximum is never reached on autostart
# as there is not enough memory, and then a preloaded DispVM is
# used, calling for the creation of another one, while the
# autostart will also try to create one. Is this a race
# condition?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very good question. Yes, I think this race condition is there. Maybe wrap preloading dispvm in a lock? The section under lock should do all checks (max count, free memory) + actual creating. And when the max is reached (regardless if it's autostart or not), simply exit the loop. This way, even if two instances of this function runs in parallel, still correct number of dispvms will be preloaded, and both instances will finish eventually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, an async lock that exits when max is reached independent of the event seems like a great idea.

# W0236 (invalid-overridden-method) Method 'on_domain_started' was expected
# to be 'non-async', found it instead as 'async'
# TODO: Seems to conflict with qubes.vm.mix.net, which is pretty strange.
# Larger bug? qubes.vm.qubesvm.QubesVM has NetVMMixin... which conflicts...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to rename this function to not override one from the mixin...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


proc = None
try:
proc = await asyncio.wait_for(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run_service only starts the service. You need to wait for it to complete with proc.communicate() for example. Or simply use run_service_for_stdio which will do that for you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

if preload_dispvm:
dispvm = app.domains[preload_dispvm[0]]
await dispvm.use_preloaded()
return dispvm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unpause needs to happen at some point. Normally caller expect to receive not running VM, so it will call start() on it. But this won't unpause it... One option would be to modify start() to unpause VM if it's paused. Sounds like a big change, but maybe nothing will explode?
Alternatively, unpause here, and rely on the fact that start() on a running VM is no-op, so all should work (I hope).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both options seems to have different blast ranges. I will choose the smaller blast which is to unpause it here. Also in case it is not accompanied by .start(). Although I don't see any problem with unpausing when asking to start a qube, as running any Qrexec service targeting a qube already unpauses it.

@ben-grande ben-grande force-pushed the preload-dispvm branch 2 times, most recently from a1fe380 to e6cffff Compare March 13, 2025 21:15
@@ -1,12 +1,12 @@
[Unit]
Description=Start Qubes VM %i
After=qubesd.service qubes-meminfo-writer-dom0.service
[email protected]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Before=qubes-vm@.service
Before=qubes-preload-dispvm.service

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed....

self.app.domains[self.vm].fire_event = self.emitter.fire_event
self.vm.features["preload-dispvm-max"] = "1"
self.app.default_dispvm = self.vm
# TODO: how to mock start of a qube that we don't have its object?
Copy link
Member

@marmarek marmarek Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can also patch the start method at the class level, like unittest.mock.patch("qubes.vm.dispvm.DispVM.start"). A heavier hammer, but should be fine in this test context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this. Thanks.

# on it).
# TODO:
# Test if pause isn't too late, what if application autostarts, will
# it open before the qube is paused?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will. Theoretically there is an "invisible" mode for gui-daemon for situation like this (it was used for very old implementation of DispVM that also kinda preloaded it). But there is no support for flipping it in runtime, gui-daemon needs to be restarted for that, so that's a broader change to use it in this version. Maybe later, I'd say it's okay to ignore this issue for now.

Copy link
Contributor Author

@ben-grande ben-grande Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add this comment there. Nothing to do about this now.

@ben-grande ben-grande marked this pull request as ready for review March 14, 2025 19:18
@ben-grande
Copy link
Contributor Author

Removed draft but still not ready yet, but Gitlab CI fails to fetch amended and force pushed commits when using the Github API.

Having problems with events not being fired, trying to understand it.

@ben-grande ben-grande changed the title Preload disposables [WIP] Preload disposables Mar 14, 2025
]
method = "admin.vm.CreateDisposable"
for qube in appvms:
qube.qubesd_call("dom0", method, "preload-autostart")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks wrong, as the first argument is the call target. So, it calls the method on dom0 several times, instead of each disposable template...
See how qubesd_call is used in qubesadmin.vm.QubesVM. But since that uses private attribute, I guess it needs a wrapper (and then make all self.qubesd_call(self._method_dest, ...) use that new wrapper?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/QubesOS/qubes-core-admin-client/blob/main/qubesadmin/vm/__init__.py#L436-L441

            if self._method_dest.startswith('$dispvm:'):
                method_dest = self._method_dest[len('$dispvm:'):]
            else:
                method_dest = 'dom0'
            dispvm = self.app.qubesd_call(method_dest,
                'admin.vm.CreateDisposable')

I guess it can always be the disposable template in this case?

    for qube in appvms:
        qube.qubesd_call(qube.name, method, "preload-autostart")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the latter should work too.

@marmarek
Copy link
Member

This didn't went far:

Mar 19 17:31:15.017096 dom0 qubesd[4569]: unhandled exception while calling src=b'dom0' meth=b'admin.vm.Start' dest=b'sys-firewall' arg=b'' len(untrusted_payload)=0
Mar 19 17:31:15.017096 dom0 qubesd[4569]: Traceback (most recent call last):
Mar 19 17:31:15.017096 dom0 qubesd[4569]:   File "/usr/lib/python3.13/site-packages/qubes/api/__init__.py", line 333, in respond
Mar 19 17:31:15.017096 dom0 qubesd[4569]:     response = await self.mgmt.execute(
Mar 19 17:31:15.017096 dom0 qubesd[4569]:                ^^^^^^^^^^^^^^^^^^^^^^^^
Mar 19 17:31:15.017096 dom0 qubesd[4569]:         untrusted_payload=untrusted_payload
Mar 19 17:31:15.017096 dom0 qubesd[4569]:         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Mar 19 17:31:15.017096 dom0 qubesd[4569]:     )
Mar 19 17:31:15.017096 dom0 qubesd[4569]:     ^
Mar 19 17:31:15.017096 dom0 qubesd[4569]:   File "/usr/lib/python3.13/site-packages/qubes/api/admin.py", line 947, in vm_start
Mar 19 17:31:15.017096 dom0 qubesd[4569]:     await self.dest.start()
Mar 19 17:31:15.017096 dom0 qubesd[4569]:   File "/usr/lib/python3.13/site-packages/qubes/vm/dispvm.py", line 426, in start
Mar 19 17:31:15.017096 dom0 qubesd[4569]:     await super().start(**kwargs)
Mar 19 17:31:15.017096 dom0 qubesd[4569]:   File "/usr/lib/python3.13/site-packages/qubes/vm/qubesvm.py", line 1522, in start
Mar 19 17:31:15.017096 dom0 qubesd[4569]:     await self.fire_event_async(
Mar 19 17:31:15.017096 dom0 qubesd[4569]:         "domain-start", start_guid=start_guid
Mar 19 17:31:15.017096 dom0 qubesd[4569]:     )
Mar 19 17:31:15.017096 dom0 qubesd[4569]:   File "/usr/lib/python3.13/site-packages/qubes/events.py", line 234, in fire_event_async
Mar 19 17:31:15.017096 dom0 qubesd[4569]:     sync_effects, async_effects = self._fire_event(
Mar 19 17:31:15.017096 dom0 qubesd[4569]:                                   ~~~~~~~~~~~~~~~~^
Mar 19 17:31:15.017096 dom0 qubesd[4569]:         event, kwargs, pre_event=pre_event
Mar 19 17:31:15.017096 dom0 qubesd[4569]:         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Mar 19 17:31:15.017096 dom0 qubesd[4569]:     )
Mar 19 17:31:15.017096 dom0 qubesd[4569]:     ^
Mar 19 17:31:15.017096 dom0 qubesd[4569]:   File "/usr/lib/python3.13/site-packages/qubes/events.py", line 169, in _fire_event
Mar 19 17:31:15.017096 dom0 qubesd[4569]:     effect = func(self, event, **kwargs)
Mar 19 17:31:15.017096 dom0 qubesd[4569]: TypeError: DispVM.on_domain_started_dispvm() got an unexpected keyword argument 'start_guid'

@marmarek
Copy link
Member

And this:

Mar 19 17:31:05.021673 dom0 qubesd[4569]: vm.sys-firewall: Activating the sys-firewall VM
Mar 19 17:31:05.040131 dom0 qubesd[4569]: Uncaught exception from domain-unpaused handler for domain sys-firewall
Mar 19 17:31:05.040131 dom0 qubesd[4569]: Traceback (most recent call last):
Mar 19 17:31:05.040131 dom0 qubesd[4569]:   File "/usr/lib/python3.13/site-packages/qubes/app.py", line 1624, in _domain_event_callback
Mar 19 17:31:05.040131 dom0 qubesd[4569]:     vm.fire_event("domain-unpaused")
Mar 19 17:31:05.040131 dom0 qubesd[4569]:     ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
Mar 19 17:31:05.040131 dom0 qubesd[4569]:   File "/usr/lib/python3.13/site-packages/qubes/events.py", line 200, in fire_event
Mar 19 17:31:05.040131 dom0 qubesd[4569]:     sync_effects, async_effects = self._fire_event(
Mar 19 17:31:05.040131 dom0 qubesd[4569]:                                   ~~~~~~~~~~~~~~~~^
Mar 19 17:31:05.040131 dom0 qubesd[4569]:         event, kwargs, pre_event=pre_event
Mar 19 17:31:05.040131 dom0 qubesd[4569]:         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Mar 19 17:31:05.040131 dom0 qubesd[4569]:     )
Mar 19 17:31:05.040131 dom0 qubesd[4569]:     ^
Mar 19 17:31:05.040131 dom0 qubesd[4569]:   File "/usr/lib/python3.13/site-packages/qubes/events.py", line 169, in _fire_event
Mar 19 17:31:05.040131 dom0 qubesd[4569]:     effect = func(self, event, **kwargs)
Mar 19 17:31:05.040131 dom0 qubesd[4569]: TypeError: DispVM.on_domain_unpaused() takes 1 positional argument but 2 were given

@ben-grande ben-grande force-pushed the preload-dispvm branch 2 times, most recently from 6c641cc to 67e48e9 Compare April 14, 2025 19:02
@marmarek
Copy link
Member

Looks like unit tests hanged, specifically qubes.tests.api_admin/TC_00_VMs/test_643_vm_create_disposable_preload_autostart ... (or the one after?)

@ben-grande
Copy link
Contributor Author

Looks like unit tests hanged, specifically qubes.tests.api_admin/TC_00_VMs/test_643_vm_create_disposable_preload_autostart ... (or the one after?)

qubes.tests.api_admin/TC_00_VMs/test_643_vm_create_disposable_preload_autostart ... ok
started
Domain 'runner-32522443-project-22468673-concurrent-0-job-9718764351' destroyed
Domain 'runner-32522443-project-22468673-concurrent-0-job-9718764351' has been undefined

Strange, as they don't hang locally, only fail (as I haven't got them to work yet).

% ./run-tests -f qubes.tests.api_admin/TC_00_VMs/test_643_vm_create_disposable_preload_autostart
...
qubes.tests.api_admin/TC_00_VMs/test_643_vm_create_disposable_preload_autostart ... FAIL (AssertionError: AssertionError("event 'domain-preloaded-dispvm-autostart' did not fire on <qubes.tests.TestEmitter object at 0x7019b78e0dd0>"))
FAIL

======================================================================
FAIL: qubes.tests.api_admin/TC_00_VMs/test_643_vm_create_disposable_preload_autostart
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/unittest/mock.py", line 1369, in patched
    return func(*newargs, **newkeywargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/src/contrib/qubes-core-admin/qubes/tests/api_admin.py", line 3780, in test_643_vm_create_disposable_preload_autostart
    self.assertEventFired(self.emitter, "domain-preloaded-dispvm-autostart")
  File "/home/user/src/contrib/qubes-core-admin/qubes/tests/__init__.py", line 709, in assertEventFired
    self.fail(
AssertionError: event 'domain-preloaded-dispvm-autostart' did not fire on <qubes.tests.TestEmitter object at 0x7019b78e0dd0>

----------------------------------------------------------------------
Ran 1 test in 5.410s

FAILED (failures=1)
% ./run-tests -f qubes.tests.api_admin/TC_00_VMs/test_643_vm_create_disposable_preload_use
...
qubes.tests.api_admin/TC_00_VMs/test_643_vm_create_disposable_preload_use ... ERROR (RuntimeError: RuntimeError('\n{message : RuntimeWarning("coroutine \'TC_00_VMs.dummy_coro\' was never awaited"), category : \'RuntimeWarning\', filename : \'/home/user/src/contrib/qubes-core-admin/qubes/vm/dispvm.py\', lineno : 396, line : None}'))
ERROR (RuntimeError:
{message : RuntimeWarning("coroutine 'TC_00_VMs.dummy_coro' was never awaited"), category : 'RuntimeWarning', filename : '/home/user/src/contrib/qubes-core-admin/qubes/vm/dispvm.py', lineno : 396, line : None})

======================================================================
ERROR: qubes.tests.api_admin/TC_00_VMs/test_643_vm_create_disposable_preload_use
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/unittest/mock.py", line 1369, in patched
    return func(*newargs, **newkeywargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/src/contrib/qubes-core-admin/qubes/tests/api_admin.py", line 3828, in test_643_vm_create_disposable_preload_use
    self.assertEqual(dispvm.features.get("internal", False), False)
AssertionError: '1' != False

During handling of the above exception, another exception occurred:

RuntimeError:
{message : RuntimeWarning("coroutine 'TC_00_VMs.dummy_coro' was never awaited"), category : 'RuntimeWarning', filename : '/home/user/src/contrib/qubes-core-admin/qubes/vm/dispvm.py', lineno : 396, line : None}

----------------------------------------------------------------------
Ran 1 test in 0.477s

FAILED (errors=1)

@ben-grande ben-grande force-pushed the preload-dispvm branch 2 times, most recently from 3949e63 to 3eeae8b Compare April 15, 2025 16:18
MODULE+"/"+CLASS+"/"+FUNCTION.
Example: qubes.tests.basic/TC_00_Basic/test_000_create

Some integration tests require the environment variable QUBES_TEST_TEMPLATES to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not true, none requires it (and if something requires this variable set, it's a bug that should never hit the main branch). You can use this variable to limit on what templates test are run (but when listing specific tests, it isn't really necessary as test names will include template name already, so you can select template this way instead).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests fails to load and be listed when QUBES_TEST_TEMPLATES is not set:

$ grep _FailedTest /tmp/out
cli (unittest.loader._FailedTest.cli)
qubes.tests.integ.basic (unittest.loader._FailedTest.qubes.tests.integ.basic)
qubes.tests.integ.grub (unittest.loader._FailedTest.qubes.tests.integ.grub)
qubes.tests.integ.devices_block (unittest.loader._FailedTest.qubes.tests.integ.devices_block)
qubes.tests.integ.qrexec (unittest.loader._FailedTest.qubes.tests.integ.qrexec)
qubes.tests.integ.qrexec_perf (unittest.loader._FailedTest.qubes.tests.integ.qrexec_perf)
qubes.tests.integ.dom0_update (unittest.loader._FailedTest.qubes.tests.integ.dom0_update)
qubes.tests.integ.vm_update (unittest.loader._FailedTest.qubes.tests.integ.vm_update)
qubes.tests.integ.network (unittest.loader._FailedTest.qubes.tests.integ.network)
qubes.tests.integ.network_ipv6 (unittest.loader._FailedTest.qubes.tests.integ.network_ipv6)
qubes.tests.integ.dispvm (unittest.loader._FailedTest.qubes.tests.integ.dispvm)
qubes.tests.integ.vm_qrexec_gui (unittest.loader._FailedTest.qubes.tests.integ.vm_qrexec_gui)
qubes.tests.integ.audio (unittest.loader._FailedTest.qubes.tests.integ.audio)
qubes.tests.integ.mime (unittest.loader._FailedTest.qubes.tests.integ.mime)
qubes.tests.integ.salt (unittest.loader._FailedTest.qubes.tests.integ.salt)
qubes.tests.integ.backup (unittest.loader._FailedTest.qubes.tests.integ.backup)
qubes.tests.integ.backupdispvm (unittest.loader._FailedTest.qubes.tests.integ.backupdispvm)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And that is a problem to fix, not to workaround by setting this variable. Tests that fails to load this way is a blocking failure, I won't merge a PR that introduces such regression.

Try to import one of those modules in python interpreter manually and see what exactly fails.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be fixed now, will remove the comment.

@@ -113,6 +115,7 @@ def get_system_info(cls, app):
system_info = {
"domains": {
domain.name: {
"features": dict(domain.features.items()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You setup events only for "internal" feature, not all of them, so changing others will lead to an outdated cache. I know others are not used right now, but it's easy to miss this when somebody would want to use it. I see two solutions:

  • add events to invalidate cache on other feature changes too
  • include only internal feature here (maybe even as a single entry, not a features dict, to not mislead that the list is complete; or maybe some-feature or something)

I slightly prefer the second option, as it will invalidate cache less often, but I'm okay with the first option too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants