Update Ubuntu VM images #1909

Merged: 15 commits, Jun 28, 2024

Contributor

@craddm craddm commented May 20, 2024

✅ Checklist

  • You have given your pull request a meaningful title (e.g. Enable foobar integration rather than 515 foobar).
  • You are targeting the appropriate branch. If you're not certain which one this is, it should be develop.
  • Your branch is up-to-date with the target branch (it probably was when you started, but it may have changed since then).

🚦 Depends on

⤴️ Summary

Updates the Linux VM to a Gen2 VM.

WIP: updates to release xx.04 LTS of Ubuntu

🌂 Related issues

Closes #1550

🔬 Tests

Unable to test if the deployed VMs are fully working, as cannot currently login with a user due to #1908


github-actions bot commented May 20, 2024

Coverage report

File: data_safe_haven/types/enums.py
Columns: Statements, Missing, Coverage, Coverage (new stmts), Lines missing (values not captured)
Project Total

This report was generated by python-coverage-comment-action

Member

@JimMadge JimMadge left a comment


While we are doing this, we might as well update to a more recent release.

@craddm
Contributor Author

craddm commented May 22, 2024

Updating to Gen 2 worked fine.

Updating to Jammy is slightly trickier:

2024-05-22 15:05:17 [   ERROR] Diagnostics: cli.py:104
2024-05-22 15:05:17 [   ERROR]   pulumi:pulumi:Stack (data-safe-haven-shm-lincolnshire-sre-morcilla): cli.py:104
2024-05-22 15:05:17 [   ERROR]     error: update failed cli.py:104
2024-05-22 15:05:17 [   ERROR] cli.py:104
2024-05-22 15:05:17 [   ERROR]   azure-native:compute:VirtualMachineExtension (sre_workspaces_vm_workspace_01_log_analytics_extension): cli.py:104
2024-05-22 15:05:17 [   ERROR]     error: 1 error occurred: cli.py:104
2024-05-22 15:05:17 [   ERROR]          * Code="VMExtensionHandlerNonTransientError" Message="The handler for VM extension type 'Microsoft.EnterpriseCloud.Monitoring.OmsAgentForLinux' has reported terminal failure for VM extension 'OmsAgentForLinux' with error message: '[ExtensionOperationError] Non-zero exit code: 55, /var/lib/waagent/Microsoft.EnterpriseCloud.Monitoring.OmsAgentForLinux-1.19.0/omsagent_shim.sh -install\n\n2024/05/22 15:0Info: Falling back to /etc/os-release distribution parsing\n-1.19.0] Install,failed,55,Install failed with exit code 55 because the package manager on the VM is currently locked: please wait and try again\n\n\n\n'.\r\n    \r\n'Install handler failed for the extension. More information on troubleshooting is available at https://aka.ms/VMExtensionOMSAgentLinuxTroubleshoot'" cli.py:104

This seems like more incentive to move off omsagent, if anything.

  • It seemed to work when redeployed, but xrdp etc. wasn't running. It seemed that cloud-init was stalling while attempting to install the snap version of Firefox. Trying to install snapd manually seemed to trigger installation of a whole bunch of additional packages including xrdp, even though snapd is already installed.

Jammy forces some packages to be installed as snaps (e.g. Firefox), which isn't currently possible.

Edit: well, it may be possible. I couldn't connect to the workspace VM through xrdp until I manually installed snapd via the serial console...

@jemrobinson
Member

  1. Can we switch to noble (24.04)?
  2. @JimMadge: what are your thoughts about snaps?
  3. We could consider installing from a ppa repository if we want to use the .deb version, but it looks like some manual fiddling around with priorities is needed

@craddm
Contributor Author

craddm commented May 22, 2024

  1. Can we switch to noble (24.04)?
  2. @JimMadge: what are your thoughts about snaps?
  3. We could consider installing from a ppa repository if we want to use the .deb version, but it looks like some manual fiddling around with priorities is needed

There is a 24.04 image on the Marketplace. I'd imagine that it too wants to install snaps. Omsagent definitely doesn't support 24.04, but we don't really need it to. I'm not sure if Azure Update Manager or Monitor Agent can handle 24.04 yet either.

@craddm craddm changed the title [WIP] Update workspace VMs to gen2 [WIP] Update Ubuntu VM images May 22, 2024
@JimMadge
Member

Hmm, yes I'd forgotten about that but probably should have brought it up before. It feels like Ubuntu is moving towards distributing more packages as snaps. Firefox is now a snap by default.

That will be difficult to support:

  • I'm a bit suspicious of snaps now given recent security problems 1 2
  • Using a PPA feels like the same risk: a lower level of trust in the maintainers

My feeling is the drive towards snaps won't change.
Now might be the opportunity to move to another distro. Fedora maybe.

@JimMadge JimMadge mentioned this pull request May 28, 2024
3 tasks
@JimMadge
Member

Discussion of snap endpoints in #1220

@JimMadge
Member

Do we still need domains and IP addresses for the endpoints we want to reach (@jemrobinson)?

Using the Snap Store Proxy like we proxy apt/pip/cran could be a good solution.

@craddm
Contributor Author

craddm commented May 29, 2024

[  117.275201] cloud-init[1864]: Selecting previously unselected package libavahi-common-data:amd64.
(Reading database ... 62064 files and directories currently installed.)
[  117.350445] cloud-init[1864]: Preparing to unpack .../000-libavahi-common-data_0.8-5ubuntu5.2_amd64.deb ...
[  117.354326] cloud-init[1864]: Unpacking libavahi-common-data:amd64 (0.8-5ubuntu5.2) ...
[  117.386055] cloud-init[1864]: Selecting previously unselected package libavahi-common3:amd64.
[  117.410072] cloud-init[1864]: Preparing to unpack .../001-libavahi-common3_0.8-5ubuntu5.2_amd64.deb ...
[  117.413341] cloud-init[1864]: Unpacking libavahi-common3:amd64 (0.8-5ubuntu5.2) ...
[  117.461784] cloud-init[1864]: Selecting previously unselected package libavahi-core7:amd64.
[  117.477839] cloud-init[1864]: Preparing to unpack .../002-libavahi-core7_0.8-5ubuntu5.2_amd64.deb ...
[  117.481412] cloud-init[1864]: Unpacking libavahi-core7:amd64 (0.8-5ubuntu5.2) ...
[  117.523809] cloud-init[1864]: Selecting previously unselected package libdaemon0:amd64.
[  117.547648] cloud-init[1864]: Preparing to unpack .../003-libdaemon0_0.14-7.1ubuntu3_amd64.deb ...
[  117.558369] cloud-init[1864]: Unpacking libdaemon0:amd64 (0.14-7.1ubuntu3) ...
[  117.589616] cloud-init[1864]: Selecting previously unselected package avahi-daemon.
[  117.605491] cloud-init[1864]: Preparing to unpack .../004-avahi-daemon_0.8-5ubuntu5.2_amd64.deb ...
[  117.630232] cloud-init[1864]: Unpacking avahi-daemon (0.8-5ubuntu5.2) ...
[  117.675521] cloud-init[1864]: Selecting previously unselected package firefox.
[  117.689962] cloud-init[1864]: Preparing to unpack .../005-firefox_1%3a1snap1-0ubuntu2_amd64.deb ...
[  117.877015] cloud-init[1864]: => Installing the firefox snap
[  117.881777] cloud-init[1864]: ==> Checking connectivity with the snap store
[  117.927838] cloud-init[1864]: ===> Unable to contact the store, trying every minute for the next 30 minutes

@craddm
Contributor Author

craddm commented May 29, 2024

Looking at loosening network rules to allow snapcraft. However, DNS is still not allowing snapcraft domain names to be resolved:

…admin@shm-lincolnshire-sre-morcilla-vm-workspace-01:~$ nslookup api.snapcraft.io
Server:         127.0.0.53
Address:        127.0.0.53#53

** server can't find api.snapcraft.io: NXDOMAIN

…admin@shm-lincolnshire-sre-morcilla-vm-workspace-01:~$ nslookup cran.r-project.org
Server:         127.0.0.53
Address:        127.0.0.53#53

Non-authoritative answer:
cran.r-project.org      canonical name = cran.wu-wien.ac.at.
Name:   cran.wu-wien.ac.at
Address: 137.208.57.37
** server can't find cran.wu-wien.ac.at: NXDOMAIN

…admin@shm-lincolnshire-sre-morcilla-vm-workspace-01:~$ nslookup login.ubuntu.com
Server:         127.0.0.53
Address:        127.0.0.53#53

** server can't find login.ubuntu.com: NXDOMAIN

…admin@shm-lincolnshire-sre-morcilla-vm-workspace-01:~$ nslookup storage.snapcraftcontent.com
Server:         127.0.0.53
Address:        127.0.0.53#53

** server can't find storage.snapcraftcontent.com: NXDOMAIN
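The same lookups can be repeated from Python when checking several names at once. This is a minimal sketch, not part of the PR: `check_resolution` and `default_resolver` are hypothetical helper names, and the resolver is injectable so the logic can be exercised without network access.

```python
from __future__ import annotations

import socket
from typing import Callable, Iterable

# Hostnames queried with nslookup in the comment above.
HOSTNAMES = (
    "api.snapcraft.io",
    "cran.r-project.org",
    "login.ubuntu.com",
    "storage.snapcraftcontent.com",
)


def default_resolver(hostname: str) -> list[str]:
    """Resolve a hostname to a sorted list of unique addresses via the system resolver."""
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})


def check_resolution(
    hostnames: Iterable[str],
    resolver: Callable[[str], list[str]] = default_resolver,
) -> dict[str, list[str]]:
    """Map each hostname to its addresses; an empty list means resolution failed."""
    results: dict[str, list[str]] = {}
    for hostname in hostnames:
        try:
            results[hostname] = resolver(hostname)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[hostname] = []
    return results
```

Calling `check_resolution(HOSTNAMES)` on a workspace VM would then report an empty list for each name the DNS filter refuses to resolve.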

@craddm
Contributor Author

craddm commented May 29, 2024

Snapcraft is blacklisted by AdGuard:

2024/05/29 09:24:31.318180 44#153000 [debug] filtering: found rule "*.*" for host "api.snapcraft.io.3hmqzriazs3edasxc10qfi4p5b.zx.internal.cloudapp.net", filter list id: 0
2024/05/29 09:24:31.318377 44#153000 [debug] dnsforward: host "api.snapcraft.io.3hmqzriazs3edasxc10qfi4p5b.zx.internal.cloudapp.net" is filtered, reason: "FilteredBlackList"; rule: "*.*"
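The log shows a catch-all `*.*` rule filtering the lookup. One way such a block-everything list with per-domain exceptions might be generated is sketched below. It assumes AdGuard's adblock-style syntax, in which `@@||domain^` marks an exception for a domain and its subdomains; `build_filter_rules` is a hypothetical helper and is not confirmed to match how this project writes its AdGuard configuration.

```python
from __future__ import annotations

from typing import Iterable

BLOCK_ALL_RULE = "*.*"  # the catch-all rule reported in the log above


def build_filter_rules(permitted_domains: Iterable[str]) -> list[str]:
    """Return a block-everything rule followed by one exception per permitted domain."""
    # Assumption: an "@@||<domain>^" exception also covers subdomains,
    # so a leading "*." wildcard is dropped before the rule is written.
    stripped = {
        domain[2:] if domain.startswith("*.") else domain
        for domain in permitted_domains
    }
    return [BLOCK_ALL_RULE] + [f"@@||{domain}^" for domain in sorted(stripped)]
```

With this shape, permitting Snapcraft downloads would mean adding its domains to the permitted list rather than editing the filter rules by hand.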

@JimMadge
Member

Permitted domains

@verify(UNIQUE)
class PermittedDomains(tuple[str, ...], Enum):
    """Permitted domains for outbound connections."""
    APT_REPOSITORIES = (
        # "apt.postgresql.org",
        "archive.ubuntu.com",
        "azure.archive.ubuntu.com",
        "changelogs.ubuntu.com",
        "cloudapp.azure.com",  # this is where azure.archive.ubuntu.com is hosted

@JimMadge
Member

Having looked at Snap Store Proxy, it looks like it isn't possible to get this working without creating an Ubuntu SSO account (and there may be limits on how many clients can connect without having a Canonical support contract).

docs

@craddm
Contributor Author

craddm commented May 29, 2024

Ok, allowing the VMs to directly contact snapcraft now works. So finding a way to allow that allows us to use Jammy.

Currently, this branch creates a new application rule to allow Snapcraft through the firewall.

If Snap Store Proxy won't work, maybe we can use another proxy ourselves:

https://snapcraft.io/docs/system-options#heading--proxy

@JimMadge
Member

As I understand it, we are talking about different methods of proxying here.

The Snap Store Proxy is much more like a snap store instance which has an upstream provider (quite like how we use Nexus).

The snapd proxy configuration is a general-purpose HTTP/HTTPS proxy setting, in the same way we might route all internet traffic through gateway.example.lan.
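For the second kind of proxying, the snapd system options linked earlier are `proxy.http` and `proxy.https`, set with `snap set system`. The sketch below only builds the command lines. `snapd_proxy_commands` is a hypothetical helper, and the proxy URL and port in the usage note are placeholders rather than anything deployed by this project.

```python
from __future__ import annotations


def snapd_proxy_commands(proxy_url: str) -> list[list[str]]:
    """Build the `snap set system` invocations that point snapd at an HTTP(S) proxy."""
    return [
        ["snap", "set", "system", f"proxy.http={proxy_url}"],
        ["snap", "set", "system", f"proxy.https={proxy_url}"],
    ]
```

For example, `snapd_proxy_commands("http://gateway.example.lan:3128")` returns two argument lists that could each be passed to `subprocess.run(..., check=True)` as root. This routes snapd traffic through a generic proxy and is unrelated to running a Snap Store Proxy instance.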

@craddm
Contributor Author

craddm commented May 29, 2024

Ok, but we're just using a squid proxy for apt, so wouldn't something similar work for snapd?

@jemrobinson
Member

My feeling is the drive towards snaps won't change. Now might be the opportunity to move to another distro. Fedora maybe.

Is Fedora supported on Azure? I have no problem with switching to another distro, but we want to avoid maintaining our own OS if possible. I remember you were interested in NixOS a few years ago @JimMadge - is that another possibility?

@jemrobinson
Member

Ok, but we're just using a squid proxy for apt, so wouldn't something similar work for snapd?

We're using squid-deb-proxy which can act as a proxy to any .deb repository. There may be a similar thing for snap packages, or this may just work - I haven't looked into it.

@JimMadge
Member

Is Fedora supported on Azure? I have no problem with switching to another distro, but we want to avoid maintaining our own OS if possible. I remember you were interested in NixOS a few years ago @JimMadge - is that another possibility?

I assumed there would be an official Fedora image, but it looks like there isn't.
Debian is endorsed by Microsoft, so that should be an easier switch as it is an apt/deb-based distro.

NixOS would be a much more complex change. Moving to an immutable distro would involve changing how we think about configuration. I think it would be a good long term goal. There is an argument that immutable provides better security and reproducibility. There is also no official image on the Marketplace so it would involve building that ourselves.

@JimMadge
Member

We're using squid-deb-proxy which can act as a proxy to any .deb repository. There may be a similar thing for snap packages, or this may just work - I haven't looked into it.

Yes. I think I wrote something like this in Slack and didn't comment here 🤦.

The Snap Store Proxy is like a dedicated proxy for snapd. I don't think it would be a good solution for us though.
We could proxy the http/https traffic, but I'm not sure if that gives us anything or improves security in a meaningful way.

@JimMadge
Member

If this is working, let's get this in now for rc3 and open an issue to track the security question.

@craddm
Contributor Author

craddm commented Jun 27, 2024

Latest update from testing with VM behind an Azure firewall.

Started with a fully locked down VM with nothing allowed out, then progressively allowed more and more traffic.

The list of required endpoints for snaps can be found on snapcraft.io.

Only api.snapcraft.io and *.snapcraftcontent.com are required to be able to download and install snaps.

You also need to allow cloud-images.ubuntu.com to allow snapcraft to download base images and build snaps.

Allowing dashboard.snapcraft.io and login.ubuntu.com allows you to log in to the snap store and upload snaps you have built.

upload.apps.ubuntu.com does not appear to be needed - download, install, and upload of snaps worked without it being allowed.

Blocking dashboard.snapcraft.io prevented built snaps from being uploaded.

Blocking login.ubuntu.com prevents the user from logging in to the snap store, and since you need to be signed in to upload snaps, also stops snaps from being uploaded if it is blocked before the user can sign in. Blocking after signing in does not stop uploads.

Ultimately, it seems api.snapcraft.io and *.snapcraftcontent.com are sufficient, and uploads aren't possible with those alone.

@craddm craddm changed the title [WIP] Update Ubuntu VM images Update Ubuntu VM images Jun 27, 2024
@JimMadge
Member

@craddm Is this ready for review?

@craddm
Contributor Author

craddm commented Jun 27, 2024

Yes

@craddm craddm marked this pull request as ready for review June 27, 2024 15:04
@craddm craddm requested a review from a team as a code owner June 27, 2024 15:04
Member

@JimMadge JimMadge left a comment


Looks good.

I think we might want to add a rule to explicitly disallow the endpoints dashboard.snapcraft.io and login.ubuntu.com with a high priority.

It would be good to have a summary of what you've found out about the endpoints and the consequences, i.e. we must block two endpoints because a user with an authorisation token (presumably?) can upload to dashboard.snapcraft.io.

@jemrobinson is there precedent for that? I can't see any disallow rules in a quick look through firewall.py.
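The split between endpoints to allow and endpoints to block could be recorded in the same style as the existing `PermittedDomains` enum. This is a sketch under assumptions: the member contents are taken from the testing notes in this conversation, not from the merged code, and `conflicting_domains` is a hypothetical helper. It uses `@unique` where the project uses `@verify(UNIQUE)`, so that it also runs on Python versions before 3.11.

```python
from __future__ import annotations

from enum import Enum, unique


@unique
class PermittedDomains(tuple[str, ...], Enum):
    """Domains workspaces may reach (illustrative subset)."""

    # Reported as sufficient to download and install snaps.
    UBUNTU_SNAPCRAFT = (
        "api.snapcraft.io",
        "*.snapcraftcontent.com",
    )


@unique
class ForbiddenDomains(tuple[str, ...], Enum):
    """Domains that should always be blocked (assumed contents)."""

    # Reported as enabling snap store login and snap upload.
    UBUNTU_SNAPCRAFT = (
        "dashboard.snapcraft.io",
        "login.ubuntu.com",
    )


def conflicting_domains() -> set[str]:
    """Return any domain that appears in both the permitted and forbidden sets."""
    permitted = {domain for group in PermittedDomains for domain in group}
    forbidden = {domain for group in ForbiddenDomains for domain in group}
    return permitted & forbidden
```

Because each member is itself a tuple, it can be passed directly wherever a sequence of FQDNs is expected, and a check such as `conflicting_domains()` could guard against a domain being listed on both sides.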

data_safe_haven/types/enums.py (outdated, resolved)
@JimMadge JimMadge self-assigned this Jun 28, 2024
@JimMadge
Member

I think we might want to add a rule to explicitly disallow the endpoints dashboard.snapcraft.io and login.ubuntu.com with a high priority.

I've tried this in 9a40c62 and 9ba6657.

@JimMadge
Member

@craddm Are you happy with this?

@craddm
Contributor Author

craddm commented Jun 28, 2024

Yes, LGTM

Member

@jemrobinson jemrobinson left a comment


Mostly LGTM. One question about forbidden domains.

@JimMadge JimMadge merged commit 4d52013 into alan-turing-institute:develop Jun 28, 2024
11 checks passed
Comment on lines +242 to +264
network.AzureFirewallApplicationRuleCollectionArgs(
    action=network.AzureFirewallRCActionArgs(
        type=network.AzureFirewallRCActionType.DENY
    ),
    name="workspaces-deny",
    priority=FirewallPriorities.SRE_WORKSPACES,
    rules=[
        network.AzureFirewallApplicationRuleArgs(
            description="Deny external Ubuntu Snap Store upload and login access",
            name="DenyUbuntuSnapcraft",
            protocols=[
                network.AzureFirewallApplicationRuleProtocolArgs(
                    port=int(Ports.HTTP),
                    protocol_type=network.AzureFirewallApplicationRuleProtocolType.HTTP,
                ),
                network.AzureFirewallApplicationRuleProtocolArgs(
                    port=int(Ports.HTTPS),
                    protocol_type=network.AzureFirewallApplicationRuleProtocolType.HTTPS,
                ),
            ],
            source_addresses=props.subnet_workspaces_prefixes,
            target_fqdns=ForbiddenDomains.UBUNTU_SNAPCRAFT,
        ),
Member


Thinking further - I'm happy with this rule, but I think Azure Firewall behaviour defaults to DENY, so it might not be needed.

Member


Yeah, I'm in two minds.

I know we block everything we don't explicitly allow. It feels like we should capture the fact that these domains in particular should always be blocked though. You might be tempted to allow them when debugging for example.

Contributor Author


I tend to agree, feels like it's worth being explicit here
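The argument for keeping the explicit rule can be illustrated with a toy model of rule evaluation. This is a simplified sketch, not Azure Firewall's actual engine: it assumes collections are evaluated in ascending priority number, that the first matching collection decides, and that unmatched traffic is denied. The priority numbers and the `debug-allow` collection are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class RuleCollection:
    name: str
    priority: int  # assumption: a lower number is evaluated first
    action: str  # "Allow" or "Deny"
    target_fqdns: tuple[str, ...]


def evaluate(collections: list[RuleCollection], fqdn: str) -> str:
    """Return the action of the first matching collection, or "Deny" if none match."""
    for collection in sorted(collections, key=lambda c: c.priority):
        if any(fnmatch(fqdn, pattern) for pattern in collection.target_fqdns):
            return collection.action
    return "Deny"  # default deny when nothing matches


COLLECTIONS = [
    RuleCollection("workspaces-deny", 100, "Deny", ("dashboard.snapcraft.io", "login.ubuntu.com")),
    RuleCollection("workspaces-allow", 200, "Allow", ("api.snapcraft.io", "*.snapcraftcontent.com")),
    # A rule someone might be tempted to add while debugging.
    RuleCollection("debug-allow", 300, "Allow", ("login.ubuntu.com",)),
]
```

In this model `evaluate(COLLECTIONS, "login.ubuntu.com")` still yields a deny despite the later allow, which is the behaviour the explicit rule is meant to capture. Without the deny collection, the debugging rule would open the login endpoint.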

@craddm craddm deleted the update-to-gen2 branch July 16, 2024 09:53

Successfully merging this pull request may close these issues.

Update Ubuntu images to gen2
3 participants