spike: uki image revocation and rollback #2429
Comments
SBAT was made for this I think: https://github.com/rhboot/shim/blob/main/SBAT.md |
From reading it, it looks like https://0pointer.de/blog/brave-new-trusted-boot-world.html took some inspiration from it; at least it similarly requires the concept of an incremental counter. What is not clear is how that ties to the sealing process. In Lennart's article he mentions that when sealing against a TPM PCR bank, not only the PCR policy but also the counter is used, which I'm afraid will make it incompatible with the current installs. (I also can't find a reference to that in systemd-*, so it looks like something that can't be used yet?) |
Hello! I'm a bot, an experiment of @mudler and @jimmykarily. Thank you for opening this issue. It seems that your issue, titled "spike: uki image revocation and rollback" (#2429) in the "kairos" repository owned by "kairos-io", is not completely clear regarding the relationship with the sealing process and the compatibility with current installations. As you provided some additional context and potential alternatives like SBAT and the concept of using TPM counters, we can better understand the problem you're trying to address. However, more details are needed to meet the project's requirements. Please provide steps to reproduce (if it's a bug), and mention the versions of the relevant artifacts being used. Once you provide the necessary information, we will be able to classify your issue more accurately and assign the appropriate label. Thank you for your patience and understanding, and please remember that I'm just a bot. Best regards! |
@ci-robbot hello there, how do I get the sbat in python? |
Hello! I'm a bot, an experiment of @mudler and @jimmykarily. Thank you for the response. To get the
Remember that this solution is valid as of the current state of the artifacts. Make sure to specify the versions of the relevant artifacts in your issue as required by the project's guidelines. If you have any further questions or need more clarification, feel free to ask. Best regards! This conversation has been automatically audited by bot. The audit has identified that the information provided is sufficient to meet the project's requirements. Therefore, the issue can be labeled appropriately. The required details include a description of the issue, steps to reproduce (if it's a bug), and the versions of the relevant artifacts being used. Since the issue is clear and meets the requirements, I will assign the following labels to the issue: [question, triage]. |
sbat is only understood and used by the shim (https://uapi-group.org/specifications/specs/unified_kernel_image/#uki-components). We don't use the shim so we can't rely on sbat for revocation. |
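Although SBAT is not usable here without the shim, the Python question above can still be answered with a minimal sketch. It assumes the SBAT data lives in a `.sbat` PE section as CSV rows of the form `component,generation,vendor fields...`, as described in SBAT.md. The function names `read_pe_section` and `parse_sbat` are made up for this illustration and use only the standard library; this is not how the shim itself does it.

```python
import csv
import io
import struct
from typing import List, NamedTuple, Optional


class SbatEntry(NamedTuple):
    component: str
    generation: int
    vendor_fields: tuple  # vendor name, package, version, URL (free-form)


def read_pe_section(image: bytes, name: str) -> Optional[bytes]:
    """Return the raw bytes of the named PE section, or None if absent."""
    if image[:2] != b"MZ":
        raise ValueError("not a PE image (missing MZ header)")
    # e_lfanew at offset 0x3C points at the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image (missing PE signature)")
    coff = pe_offset + 4
    (num_sections,) = struct.unpack_from("<H", image, coff + 2)
    (opt_header_size,) = struct.unpack_from("<H", image, coff + 16)
    # The section table follows the 20-byte COFF header and optional header.
    table = coff + 20 + opt_header_size
    for index in range(num_sections):
        header = table + 40 * index
        section_name = image[header:header + 8].rstrip(b"\0").decode("ascii", "replace")
        virtual_size, _vaddr, raw_size, raw_offset = struct.unpack_from(
            "<IIII", image, header + 8
        )
        if section_name == name:
            size = min(virtual_size, raw_size) if virtual_size else raw_size
            return image[raw_offset:raw_offset + size]
    return None


def parse_sbat(raw: bytes) -> List[SbatEntry]:
    """Parse SBAT CSV rows; rows with fewer than two fields are skipped."""
    text = raw.rstrip(b"\0").decode("utf-8")
    entries = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue
        entries.append(SbatEntry(row[0], int(row[1]), tuple(row[2:])))
    return entries
```

With an EFI binary loaded as `image`, `parse_sbat(read_pe_section(image, ".sbat"))` would list each component with its generation number, which is the incremental counter SBAT revocation compares against.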
2 things:
|
dbx revocation would be good enough, I guess? But it means we need to generate the dbx, or have a way of updating it (AFAIK you can update it from userspace somehow?), so people can generate those from their own built efi files. What happens when you don't have access to those efi files anymore? How do you generate the hash for them?
Now it happens that v1 has a CVE and you want to block it. You release Custom v4 with the dbx updated.
I mean, it sounds good to me to use the actual mechanism in place in the firmware for this, but it entails laying down the exact supported way of doing it. On my machine I had updates to the dbx provided directly from https://github.com/fwupd/fwupd (https://fwupd.org/), so maybe it's possible to do this: ship the daemon and have the customers provide their own update server with dbx files? For other use cases (no internet), maybe just an upgrade to a new version is good enough. Or the fwupdate can be used to also update it via local files somehow? |
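On the question of generating dbx entries without the original efi files: a hash entry only needs the digest, so recording the digest of every released image at build time would be enough. Below is a minimal sketch of packing SHA-256 digests into an `EFI_SIGNATURE_LIST` (the `.esl` format). The helper name `sha256_esl` and the owner GUID are placeholders. Two caveats, as far as I understand it: the firmware compares against the Authenticode (PE/COFF) hash of the image, not a plain `sha256sum` of the file, and the resulting list still has to be signed before the firmware will accept it.

```python
import struct
import uuid
from typing import Iterable

# Signature type GUID for SHA-256 hash entries, as defined in the UEFI spec.
EFI_CERT_SHA256_GUID = uuid.UUID("c1c41626-504c-4092-aca9-41f936934328")
ESL_HEADER_SIZE = 16 + 4 + 4 + 4  # type GUID + list size + header size + entry size


def sha256_esl(digests: Iterable[bytes], owner: uuid.UUID) -> bytes:
    """Pack 32-byte digests into one unsigned EFI_SIGNATURE_LIST."""
    body = b""
    for digest in digests:
        if len(digest) != 32:
            raise ValueError("each digest must be exactly 32 bytes")
        # Each entry is the owner GUID followed by the digest.
        body += owner.bytes_le + digest
    header = EFI_CERT_SHA256_GUID.bytes_le + struct.pack(
        "<III",
        ESL_HEADER_SIZE + len(body),  # SignatureListSize
        0,                            # SignatureHeaderSize (unused for hashes)
        16 + 32,                      # SignatureSize: owner GUID + digest
    )
    return header + body
```

GUIDs are stored in the mixed-endian layout that `uuid.UUID.bytes_le` produces, which is why that attribute is used instead of `bytes`.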
Yep, seems to be possible: https://github.com/fwupd/fwupd/tree/main/plugins/uefi-dbx
|
I think I have a preference for my second suggestion (keys rotation) which blacklists every past image by enrolling a new key. Keys can also be appended in dbx, which makes me wonder what happens if a key is both in db and dbx. I guess dbx wins and the key is rejected (?). Let us play a bit manually in qemu before we start bricking devices :D. |
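To make the "dbx wins" guess concrete, here is a deliberately simplified model of the check as I read the UEFI spec: anything matching dbx is rejected first, and db is only consulted afterwards. It ignores certificate chains and timestamp-based revocation, and uses plain strings as stand-ins for hashes and certificates, so it only illustrates the precedence and does not replace the qemu experiment.

```python
from typing import Iterable, Set


def image_allowed(
    image_hash: str,
    signer_certs: Iterable[str],
    db: Set[str],
    dbx: Set[str],
) -> bool:
    """Simplified Secure Boot decision: dbx is checked before db."""
    identities = {image_hash, *signer_certs}
    # Any match in the forbidden database rejects the image outright.
    if identities & dbx:
        return False
    # Otherwise the image needs at least one match in the allowed database.
    return bool(identities & db)
```

Under this model a certificate present in both db and dbx never authorizes an image, which is what key rotation by appending the old key to dbx would rely on.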
First approach to use sbctl here: Foxboron/sbctl#296 |
Also suggested some preparation work here: Foxboron/sbctl#297 |
In order to get a cert or an image blacklisted in dbx, on thousands of machines with no physical access, we'll need to run commands in user mode (e.g. using the upgrade controller). With some experimentation in qemu I verified that it's possible to enroll things in dbx as long as the binary used is signed with a valid (enrolled) key. NOTE: In qemu one has to use a 4m version of the secure boot firmware, otherwise it's not possible to enroll things in dbx. These are the commands that worked for me (with a 4m firmware):
the dbx.auth file was an esl file (
We now need to figure out the commands to use in order to create a valid
If this works, it could be a revocation solution for KEK and db keys. If the PK key is compromised, I'm not sure the same process would work. I think it's possible to replace the PK key (some references here) but this would probably render the KEK and db keys invalid (unless they are replaced too). Anyway, one thing at a time, let's see if this flow works for at least db keys. This is my TODO list:
|
Another thing to notice, in qemu, after enrolling once, it's not possible to enroll again (not even with
I'm not sure why. I can redo it if I reset the OVMF_VARS file:
|
In qemu I get strange results. I tried the following sequence of commands with all three PK, KEK and db in qemu with different results:
$ export UUID=`uuidgen`
$ cert-to-efi-sig-list -g "Kairos-$UUID" keys/PK.pem PK-dbx.esl
$ sign-efi-sig-list -c keys/PK.pem -k keys/PK.key dbx PK-dbx.esl PK-dbx.auth
$ scp PK-dbx.auth [email protected]: # copy to the VM
# Inside the VM now:
[kairos@fedora ~]$ sudo efi-updatevar -f PK-dbx.auth dbx
[kairos@fedora ~]$ efi-readvar | grep dbx
Variable dbx, length 819
dbx: List 0, type X509
In the case of PK, the enrollment succeeds and even go-uefi lists the cert from dbx (using a modified sbctl).
In the case of KEK, it enrolls but go-uefi panics when trying to list dbx entries.
Even when it seems to successfully enroll the certificate in dbx (e.g. in the PK case), when I reboot the VM it still boots the livecd with secureboot enabled 🤷 ? Isn't the livecd efi signed with the same key? |
I even added the certificate's signature hash to ensure the one in dbx is the PK one:
Maybe dbx is poorly implemented in the qemu firmware (if at all)? I don't dare try such things on my Asus :D |
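To cross-check what actually ends up in dbx independently of go-uefi, the variable contents can be decoded by hand. This is a minimal sketch of walking the `EFI_SIGNATURE_LIST` structures in a raw blob. `parse_esl` and `SignatureList` are names invented for this example, only the X509 and SHA256 type GUIDs are given friendly names, and malformed input simply raises `ValueError`.

```python
import struct
import uuid
from typing import List, NamedTuple, Tuple

# Signature type GUIDs from the UEFI spec.
EFI_CERT_X509_GUID = uuid.UUID("a5c059a1-94e4-4aa7-87b5-ab155c2bf072")
EFI_CERT_SHA256_GUID = uuid.UUID("c1c41626-504c-4092-aca9-41f936934328")
KNOWN_TYPES = {EFI_CERT_X509_GUID: "X509", EFI_CERT_SHA256_GUID: "SHA256"}
HEADER_SIZE = 28  # type GUID (16) + three little-endian uint32 fields


class SignatureList(NamedTuple):
    type_name: str
    entries: List[Tuple[uuid.UUID, bytes]]  # (owner GUID, signature data)


def parse_esl(blob: bytes) -> List[SignatureList]:
    """Decode concatenated EFI_SIGNATURE_LIST structures."""
    lists = []
    offset = 0
    while offset < len(blob):
        if len(blob) - offset < HEADER_SIZE:
            raise ValueError("truncated signature list header")
        sig_type = uuid.UUID(bytes_le=blob[offset:offset + 16])
        list_size, sig_header_size, sig_size = struct.unpack_from(
            "<III", blob, offset + 16
        )
        end = offset + list_size
        if list_size < HEADER_SIZE or end > len(blob) or sig_size <= 16:
            raise ValueError("malformed signature list")
        entries = []
        pos = offset + HEADER_SIZE + sig_header_size
        # Every entry in one list has the same size: owner GUID + data.
        while pos + sig_size <= end:
            owner = uuid.UUID(bytes_le=blob[pos:pos + 16])
            entries.append((owner, blob[pos + 16:pos + sig_size]))
            pos += sig_size
        lists.append(SignatureList(KNOWN_TYPES.get(sig_type, str(sig_type)), entries))
        offset = end
    return lists
```

On Linux I would expect the raw variable to be readable from efivarfs, where the first 4 bytes of the file are the variable attributes and need to be skipped before parsing; treat that as an assumption to verify on the target system. For an X509 list, the entry data should be the DER certificate, so it can be compared byte for byte with the output of the `openssl x509 -outform der` conversion.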
I correct myself. It works fine for db (the problem was that in qemu I cannot run the command a second time) and actually that's the only one that prevents the image from booting again. It seems that the "chain of trust" is somehow broken. I think the way we sign the PK, KEK and db files is wrong. Each one should be signing the other. After all, in the end, we only sign the efi file with the db one. I tried to fix the signing here: https://github.com/kairos-io/enki/compare/fix-signing?expand=1 but it doesn't seem to make a difference. This is what I see in qemu when the db cert is put in dbx: (not a very clear message but I assume the verification didn't pass) |
The question is why have the UEFI firmwares so far allowed us to enroll these badly signed keys? I think the answer is one of:
I think the last one is true. It's supported by the fact that systemd-boot auto-enrolls in this order: db -> dbx -> KEK -> PK, and also by the comment in this link: If this explains why our badly signed certs were accepted for enrollment, it doesn't explain why my attempted fix to sign them properly doesn't fix the "chain of trust" (see previous comment). |
I tried on my Asus PN64 just to check if it's only happening on qemu. I was able to enroll the badly signed certificates (not signing each other) in the order: PK -> KEK -> db and then I managed to boot the livecd just fine.
Should this happen? This "chain of trust" seems to be a very "loose" chain. (I made a change in enki genkey to give unique CN fields to each cert by appending the "type") |
From the UEFI spec:
The way I understand this is that even in setup mode, if you try to enroll something in db (or dbx) that is not signed by a cert in the KEK or the PK itself, the firmware should refuse to add it. This is not what my ASUS PN64 does 🤷 . |
and then on page 1424:
:D |
I did the following:
$ openssl x509 -outform der -in db.pem -out db.der
$ openssl x509 -outform der -in db.crt -out db.der
$ openssl x509 -inform DER -outform PEM -in db.der -out db.pem
$ cp blacklisted-keys/tpm2-pcr-private.pem keys/
$ export UUID=`uuidgen`
$ cert-to-efi-sig-list -g "Kairos-$UUID" keys/db.pem db-dbx.esl
$ sign-efi-sig-list -c keys/KEK.pem -k keys/KEK.key dbx db-dbx.esl db-dbx.auth
The result is that the fallback (passive) and recovery images are not bootable anymore proving that the old db is no longer accepted (because it's not even enrolled but it's blacklisted too, we shouldn't need to do both). The active image is bootable but generates lots of errors and login is not possible. Maybe what @mudler suggested in a call is true. Maybe the decryption of the encrypted partitions didn't happen for some reason. Need to investigate more. |
I hardcoded my ssh key in the image so that I can ssh after upgrade to collect logs. full journalctl logs: journal.txt |
tpm can't decrypt anymore? |
same command after upgrading to the image signed by a different key:
enrolled keys (after upgrade):
(TODO: update the comment with the values after upgrade) |
Tested without an upgrade, just on an installed system, and I seem to hit the same thing.
If I remove the Kairos2 key then it works again. There seems to be a connection between the Secureboot certs and the measurements somehow that we are not seeing. EDIT: This is on Ubuntu 24.04 |
Opened a ticket upstream on systemd to see if they can clarify systemd/systemd#32946 |
Updating here in case the other ticket goes nowhere. There are two ways of binding to a PCR when enrolling a luks partition/disk:
The docs are kind of confusing here, as the two seem to be mutually exclusive but are not. When we bind to the public-key-pcr 11, cryptenroll silently also binds to PCR7, a single measurement (Secure Boot state and certs). The idea would be to set
So it needs to be fixed upstream so we can skip binding to PCR7 automatically. There is a workaround for this: skip checking the TPM directly and use the TPM public SRK key to calculate the values. From systemd 255 upwards, the TPM key is automatically extracted on boot and can be used to calculate the values to lock the luks device without ever going to the TPM directly, by using
This is now available in kcrypt v0.11.0, but it makes the minimum systemd version 255 (Ubuntu 24.04 and Fedora 40), and it works perfectly. What does this mean?
|
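The link between the Secure Boot certs and the failed unlock follows from how a PCR is computed. Here is a minimal sketch of the extend operation for the SHA-256 bank, `new = SHA256(old || SHA256(event))`, replayed over a list of events. The event contents below are placeholders (real PCR7 events are structured variable data, not short strings), and `pcr_extend` and `replay_pcr` are illustrative names.

```python
import hashlib
from typing import Iterable


def pcr_extend(pcr: bytes, event_data: bytes) -> bytes:
    """One TPM extend in the SHA-256 bank: hash(old value || event digest)."""
    event_digest = hashlib.sha256(event_data).digest()
    return hashlib.sha256(pcr + event_digest).digest()


def replay_pcr(events: Iterable[bytes]) -> bytes:
    """Replay measurements from the all-zero value a PCR has after reset."""
    pcr = bytes(32)
    for event in events:
        pcr = pcr_extend(pcr, event)
    return pcr


# Placeholder events: the db contents before and after enrolling a second key.
before = replay_pcr([b"SecureBoot=1", b"PK", b"KEK", b"db=Kairos"])
after = replay_pcr([b"SecureBoot=1", b"PK", b"KEK", b"db=Kairos,Kairos2"])
print(before.hex())
print(after.hex())
```

Because every extend hashes the previous value, any change to what is measured (such as adding the Kairos2 key) gives a different final value, and a LUKS token bound to the old PCR7 value no longer unseals. That is consistent with removing the key making it work again.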
Talos has some utils to add and manage luks keys, maybe it's possible to unlock and add a new key via that? |
CRYPTSETUP CAN ADD NEW KEYS!!!
That seems to use the tpm2 to update the keys, without asking for a password or anything!! So we could probably leverage that to sync a new tpm key if needed. Even if it's a manual action, we could do the following in the upgrade:
And that may even work. Now if we were able to use the tpm token to update the same tpm token it would be even better |
Is your feature request related to a problem? Please describe.
If a vulnerability is found in older images, we might want to disable access to the encrypted portion of the disk for certain images. Similarly, if encryption keys are leaked, we would like to have a mechanism that allows us either to update the system to use a new key (for instance by using the old keys to update to new ones), or to just invalidate the portion of the stack that is responsible for decrypting the disk.
Describe the solution you'd like
A way to use old keys to generate an upgrade image that installs the new ones. Alternatively, a mechanism that allows an upgrade image to invalidate older images.
Describe alternatives you've considered
Additional context