Add OATU2 specification draft [WIP] #69
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
# Over The Air Unattended Updates (OTAU2) | ||
|
||
This document enumerates some of the approaches and FLOSS software that could | ||
be used to deploy OTAU2 to the Lepidopter distribution. | ||
|
||
## Requirements | ||
|
||
A list of Lepidopter's OTAU2 requirements: | ||
|
||
* Atomic software release update | ||
|
||
* On failure, deploy previous working bootloader, kernel | ||
configuration, and filesystems | ||
|
||
* On success, deploy newest working bootloader, kernel | ||
configuration, filesystems and reboot (if needed) for the changes to take | ||
effect | ||
I'd say reboot (always) for simplicity, as
Yes mostly because by rebooting an updated SD card in RasPi you never know if it will come back online. :)
That's exactly the point! Rebooting the RasPi you know that everything went OK / NOK as soon as possible. |
||
|
||
* Update of bootloader, kernel and configuration data, and filesystems | ||
We can update /boot & root for sure, but RPi bootloader is fused. |
||
|
||
* Support for signing of images and verification of images on | ||
installation | ||
|
||
* Support for a self-hosted deployment server | ||
Wouldn't a self-hosted deployment server increase the fingerprint-ability of the software we deploy?
@bassosimone good point, we can perhaps use a cloud-fronted server where we'll hand out the updated images.
AFAIK, China is OK with blocking Google Cloud. So we may want to use more than one cloud. |
||
|
||
* Enable/disable a specific feature and apply/rollback system updates | ||
IMHO, distributing toggle-flags should be a part of the orchestration platform, as OTAU is not universal: it does not match the needs of hardcore gentoo SysOps wanting to run ooni-probe in their dom0.
Yes, I agree with this. I think all configuration specific to a particular installation should be stored inside the permanent storage of the device and handled by the ooniprobe software itself.
How can this be implemented in the orchestration platform? |
||
incrementally rather than through a complete OS update that | ||
replaces the filesystem | ||
Is there any reason to have incremental updates besides bandwidth?
In general it's good to do incremental updates: if for any reason the OS image grows beyond a certain size, remote updates will be quite hard in places where bandwidth is limited and network outages are frequent.
There is another gotcha: your OS image may grow larger than the temporary storage available to download it. That's the reason to stream it from the network straight to the standby-root partition. |
||
|
||
* [OPTIONAL] Support for different host roles with a specific configuration set | ||
applicable only to specific hosts or groups (eg: partner probes) | ||
As mentioned above configuration should not be handled by the update mechanism and it should be part of the software itself. Managing the lifecycle of multiple differently configured images is going to be imho too complex to manage in the long run.
@hellais By software you mean lepidopter or ooniprobe? I think you will encounter cases of customization so it's better to have a plan rather than implementing an OTAU system with no roles in mind. |
||
|
||
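The image verification requirement and the streaming suggestion from the review comments above can be sketched together. This is only a minimal illustration, assuming the expected SHA-256 digest comes from a manifest whose signature has already been checked by some other tool; the signature check itself is not shown. The name `stream_and_verify` and the chunk size are hypothetical and not part of the draft.

```python
import hashlib
import hmac
import io
from typing import BinaryIO

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary, assumed value


def stream_and_verify(source: BinaryIO, target: BinaryIO,
                      expected_sha256: str,
                      chunk_size: int = CHUNK_SIZE) -> bool:
    """Copy source to target chunk by chunk while hashing the data.

    Returns True only if the SHA-256 of the copied bytes matches
    expected_sha256. The caller is expected to mark the standby
    partition as bootable only when this returns True.
    """
    digest = hashlib.sha256()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        target.write(chunk)  # no temporary copy of the whole image is kept
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(digest.hexdigest(), expected_sha256.lower())


# Usage with in-memory streams standing in for the network and the partition.
image = b"fake image contents" * 1000
standby = io.BytesIO()
ok = stream_and_verify(io.BytesIO(image), standby,
                       hashlib.sha256(image).hexdigest(), chunk_size=4096)
```

Because the data is written before the digest is known, a failed check leaves a bad image on the standby partition; that is acceptable only if the active partition is left untouched until verification succeeds.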
## Available tools | ||
|
||
Before reading any further you should go through the excellent study of | ||
software update management, [device-side software update strategies for automotive grade linux](https://lists.linuxfoundation.org/pipermail/automotive-discussions/2016-May/002061.html), | ||
and the related discussion in the [OSTree manual](https://ostree.readthedocs.io/en/latest/manual/related-projects/). | ||
|
||
The following software could potentially be used to implement and deploy OTAU2 | ||
updates. | ||
|
||
### OSTree | ||
|
||
[WIP EVALUATION] | ||
|
||
### SWUpdate | ||
|
||
[WIP EVALUATION] | ||
|
||
### fwup | ||
|
||
An image-based "firmware" tool that uses a dual partition update pattern. | ||
Upon a successful image update the MBR is updated to make the bootloader | ||
boot from the 2nd (updated) partition. Update failures are detected | ||
during the firmware update process. | ||
|
||
#### Pros | ||
|
||
* Can be integrated into Lepidopter with minimal effort. | ||
|
||
* Uncomplicated implementation. | ||
|
||
#### Cons | ||
|
||
* There is no support for automatic (or unattended) updates. | ||
|
||
* There is no support for incremental updates; every update results in a new (big) | ||
image. | ||
|
||
* There is no native support for ext filesystems. | ||
|
||
* There is no fallback mode, so in case of software bugs in an updated image | ||
the system will be unable to boot and user intervention (i.e. copying a working | ||
image to an SD card) will be required. |
SD cards are rather big, so we can afford having previous boot & root untouched, we don't need to update them in-place. SWUpdate calls it Double copy with fall-back, Chromium OS exploits same idea.
Though most SD card image burners do not support archived image copying to an SD card. Having +16G free disk space is not always that feasible.
As we've discussed on IRC, it may be quite safe to burn partition table, `/boot` and `/`. These three blobs take first ~1Gb of the card and remaining data may be left uninitialised. Moreover, initialization of the `/data` on boot may be part of wipe-on-failure strategy.
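
The double-copy idea discussed in these comments could look roughly like the sketch below. It is an assumption-laden illustration, not how fwup or SWUpdate implement it: the slot names, `MAX_BOOT_ATTEMPTS`, and all function names are hypothetical, and on real hardware the state would have to be persisted somewhere outside both root copies.

```python
from dataclasses import dataclass
from typing import Optional

MAX_BOOT_ATTEMPTS = 3  # assumed limit, not taken from the specification


@dataclass
class SlotState:
    active: str = "A"              # slot that is known to boot
    pending: Optional[str] = None  # freshly written slot awaiting confirmation
    attempts: int = 0              # boots tried on the pending slot so far


def other_slot(slot: str) -> str:
    return "B" if slot == "A" else "A"


def stage_update(state: SlotState) -> None:
    """Call after a verified image has been written to the standby slot."""
    state.pending = other_slot(state.active)
    state.attempts = 0


def select_boot_slot(state: SlotState) -> str:
    """Call early at boot to decide which slot to start."""
    if state.pending is None:
        return state.active
    if state.attempts >= MAX_BOOT_ATTEMPTS:
        # The update never confirmed itself: fall back to the old copy.
        state.pending = None
        state.attempts = 0
        return state.active
    state.attempts += 1
    return state.pending


def confirm_boot(state: SlotState, booted_slot: str) -> None:
    """Call once the updated system is up and passes its health checks."""
    if booted_slot == state.pending:
        state.active = booted_slot
        state.pending = None
        state.attempts = 0
```

The previous boot and root copies are never modified, so a release that fails to confirm itself simply stops being selected, which corresponds to the "On failure, deploy previous working" requirement.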