Confirm that a third party has carried out a full penetration test evaluating:
external attack surface
ability to exfiltrate data from the system
ability to transfer data between SREs
ability to escalate privileges on the SRD.
Update documentation
Update supported versions in SECURITY.md
Update pen test results in VERSIONING.md
Making the release
Merge release branch into latest
Create a tag of the form v0.0.1 pointing to the most recent commit on latest (the merge that you just made)
Publish your draft GitHub release using this tag
Ensure docs for the latest version are built and deployed on ReadTheDocs
Push a build to PyPI
Announce release on communications channels
Create a PR from latest into develop to ensure that release-specific changes are not lost
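The tag format in the steps above can be checked before publishing. This is a minimal sketch, assuming tags are always `v` followed by three dot-separated integers as in the `v0.0.1` example; `parse_release_tag` is a hypothetical helper, not part of the project.

```python
import re

# Assumption: release tags look like "v<major>.<minor>.<patch>", e.g. "v0.0.1".
TAG_PATTERN = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def parse_release_tag(tag: str) -> tuple[int, int, int]:
    """Return (major, minor, patch) for a well-formed tag, else raise ValueError."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"Tag {tag!r} is not of the form v0.0.1")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)
```

Running a check like this before creating the tag would catch a missing `v` prefix or a two-part version early.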
🌳 Deployment problems
Initial deployment of Tier 2 SRE
SSL certificate creation failed, but was successful on second attempt
Firewall failed to deploy
2nd attempt also failed with new error:
Pulumi error: + azure-native:network:Route sre_firewall_route_via_firewall creating (9s) error: Code="MissingNextHopIpAddress" Message="NextHopIpAddress cannot be Null or Empty in route ViaFirewall when NextHopType is VirtualAppliance."
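The error message states the constraint directly: a route whose next hop type is `VirtualAppliance` must carry a non-empty next hop IP address. A pre-flight check along these lines could surface the problem before the route is submitted. This is only a sketch: `RouteSpec` and `validate_route` are hypothetical names and do not come from Pulumi or the project code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RouteSpec:
    """Hypothetical, simplified description of a route table entry."""

    name: str
    next_hop_type: str
    next_hop_ip_address: Optional[str] = None


def validate_route(route: RouteSpec) -> None:
    """Raise ValueError if a VirtualAppliance route has no next hop IP address."""
    if route.next_hop_type == "VirtualAppliance" and not route.next_hop_ip_address:
        raise ValueError(
            f"Route {route.name!r}: next hop IP address must be set "
            "when next hop type is VirtualAppliance"
        )
```

Since the firewall itself failed to deploy, one plausible reading is that its IP address was never available to the route, but the issue does not confirm this.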
Initial deployment of Tier 3 SRE failed
Diagnostics:
azure-native:network:AzureFirewall (sre_firewall_firewall):
error: 1 error occurred:
* GET https://management.azure.com/subscriptions/3f1a8e26-eae2-4539-952a-0a6184ec248a/providers/Microsoft.Network/locations/uksouth/operations/c9c47b66-593c-430d-819b-1433ef1d12ab
--------------------------------------------------------------------------------
RESPONSE 200: 200 OK
ERROR CODE: GatewayAllocationFailed
--------------------------------------------------------------------------------
{
"status": "Failed",
"error": {
"code": "GatewayAllocationFailed",
"message": "Compute allocation failed. Please retry later.",
"details": []
}
}
--------------------------------------------------------------------------------
azure-native:network:PrivateEndpoint (sre_data_storage_account_data_configuration_private_endpoint):
error: 1 error occurred:
* GET https://management.azure.com/subscriptions/3f1a8e26-eae2-4539-952a-0a6184ec248a/providers/Microsoft.Network/locations/uksouth/operations/d0b31274-dd1f-4557-bd42-47bee6642b72
--------------------------------------------------------------------------------
RESPONSE 200: 200 OK
ERROR CODE: RetryableError
--------------------------------------------------------------------------------
{
"status": "Failed",
"error": {
"code": "RetryableError",
"message": "A retryable error occurred.",
"details": [
{
"code": "ReferencedResourceNotProvisioned",
"message": "Cannot proceed with operation because resource /subscriptions/3f1a8e26-eae2-4539-952a-0a6184ec248a/resourceGroups/shm-muppets-sre-rizzo-rg/providers/Microsoft.Network/virtualNetworks/shm-muppets-sre-rizzo-vnet/subnets/DataConfigurationSubnet used by resource /subscriptions/3f1a8e26-eae2-4539-952a-0a6184ec248a/resourceGroups/shm-muppets-sre-rizzo-rg/providers/Microsoft.Network/networkInterfaces/shm-muppets-sre-rizzo-pep-storage-account-d.nic.59b9ffb7-8570-4b22-8836-f80344b66c35 is not in Succeeded state. Resource is in Updating state and the last operation that updated/is updating the resource is PutSubnetOperation."
}
]
}
}
--------------------------------------------------------------------------------
These failures may have been caused by a transient Azure problem, as the deployment is now working.
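Both failures carry error codes (`GatewayAllocationFailed`, `RetryableError`) that suggest retrying later. A retry wrapper with exponential backoff is one way to handle this. The sketch below is not how the project deploys: `retry_transient` and `is_transient` are hypothetical helpers, and treating these two codes as transient is an assumption based only on the diagnostics above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Error codes seen in the diagnostics above; treating them as transient is an assumption.
TRANSIENT_CODES = ("GatewayAllocationFailed", "RetryableError")


def is_transient(error: Exception) -> bool:
    """Return True if the error text mentions one of the assumed-transient codes."""
    return any(code in str(error) for code in TRANSIENT_CODES)


def retry_transient(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying with exponential backoff on transient errors only."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as error:
            # Give up on the final attempt or on any non-transient error.
            if attempt == attempts or not is_transient(error):
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the backoff schedule can be exercised without waiting.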
A user in the correct group was unable to log in to the workspace
A DNS issue is a possible cause. When a login attempt is made, the DNS logs show the following:
2025/01/16 16:47:21.538364 42#29868 [debug] filtering: found rule "*.*" for host "record.bauhqxgcxjnudchmhtgwk1hvig.zx.internal.cloudapp.net", filter list id: 0
2025/01/16 16:47:21.538380 42#29868 [debug] dnsforward: host "record.bauhqxgcxjnudchmhtgwk1hvig.zx.internal.cloudapp.net" is filtered, reason: "FilteredBlackList"
These problems were not replicated on fresh deployments of Tier 2 and Tier 3 SREs.
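When checking DNS logs for blocked lookups like the one above, it can help to pull out just the filtered hosts and reasons. This is a rough sketch that assumes every relevant line has the same `dnsforward: host "..." is filtered, reason: "..."` shape as the quoted log; `filtered_hosts` is a hypothetical helper.

```python
import re
from typing import Iterable

# Assumption: filtered lookups are logged in the same shape as the line quoted above.
FILTERED_PATTERN = re.compile(
    r'dnsforward: host "(?P<host>[^"]+)" is filtered, reason: "(?P<reason>[^"]+)"'
)


def filtered_hosts(log_lines: Iterable[str]) -> list[tuple[str, str]]:
    """Return (host, reason) pairs for every log line reporting a filtered host."""
    results = []
    for line in log_lines:
        match = FILTERED_PATTERN.search(line)
        if match:
            results.append((match["host"], match["reason"]))
    return results
```

Lines that only report a matching rule (the `filtering: found rule` entries) are skipped, so the output lists only lookups that were actually blocked.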
The Tier 3 SRE suffered a series of problems during deployment. Diagnostic settings for a variety of resources (e.g. storage accounts and the firewall) were repeatedly reported as already existing and had to be deleted before deployment could finish. After deployment, logging in showed no connections for a registered user. On inspection, this was because the guacamole-user-sync server could not contact the LDAP server (apricot): the DNS entry for apricot was incorrect.
Manually correcting the DNS record allowed guacamole-user-sync to contact the LDAP server successfully, and it appeared to sync. However, still no connections appeared for the registered user.
Tearing down and redeploying the Tier 3 SRE left everything functional.
2025/01/16 16:47:21.538364 42#29868 [debug] filtering: found rule "*.*" for host "record.bauhqxgcxjnudchmhtgwk1hvig.zx.internal.cloudapp.net", filter list id: 0
I would be surprised if this was a problem. There has always been a filter rule for *.* and the permitted domains shouldn't have changed, so I think this should be the same behaviour as before.