Once the component is deployed, the TRE administrator needs to add a specific tag to workspace configurations to make backup-enabled workspaces available to researchers.
Note that a backup-enabled workspace costs more than one without backups, because the storage used by the backups is charged in addition.
The steps below illustrate how to enable backups for a workspace.
- Log in to the SWB (Service Workbench) Web App as a TRE admin.
- Navigate to Workspace Types.
- Click Edit for one of the approved workspace types for which you want to enable backups.
- Click the Configurations tab and clone an existing workspace configuration. Update the id, name and description of the workspace configuration.
- In the last step, add the backup tag as shown in the screenshot. If you changed the backup tag value in cdk.json, use the same value here.
- Click Done. The workspace with backup configuration should now be available for researchers to create.
This section is intended for TRE admins, to enable them to restore workspace backups when requested by a researcher.
Time to restore: Approximately 15 minutes
EBS volume backups are created for Workspaces backed by EC2 compute. There are multiple ways to restore files from an EBS volume; this guide only explains restoring by replacing the volume attached to the Workspace.
- The workspace that needs to be restored must be in the stopped state. A researcher or admin can stop it from the SWB Web App.
- The TRE admin should have the workspace instance ID to hand for completing these steps.
- Log in to the AWS Management Console of the TRE account with admin privileges.
- Navigate to the EC2 console, filter Instances by the instance ID for which the restore needs to be carried out, and click the instance ID.
- Navigate to the Networking tab and note the Availability zone. The EBS volume needs to be restored to the same Availability Zone in which the EC2 instance is placed.
- Navigate to the Storage tab and note the Volume ID, Device name, and Volume size (GiB).
- Click through on the Volume ID and note the Volume type.
- Click the Volume ID, which should take you to the Volumes page. Select the volume, click Actions -> Detach volume, and confirm with Detach.
  Note: At this point, if the detached volume is no longer required, it should be deleted.
- Navigate to the AWS Backup console, go to Backup vaults, and click the vault name.
- Use the Volume ID to search for backups. Depending on the backup frequency, there will be multiple recovery points. Select the most appropriate recovery point.
- After identifying the recovery point from which to restore the EBS volume, click the recovery point ARN and select the Restore button.
- This brings you to a Restore backup screen that shows the snapshot ID and other configuration. Fill in the details as per the table below and click the Restore backup button.

  | Parameter | Value |
  | --- | --- |
  | Resource Type | Specify EBS volume. |
  | Volume type | Select either the original volume type as noted earlier or a more appropriate type based on cost and performance requirements. |
  | Size | Select the equivalent size of the backed-up EBS volume as noted earlier. |
  | IOPS | 300/3000 - Baseline of 3 IOPS per GiB with a minimum of 100 IOPS, burstable to 3000 IOPS. |
  | Availability Zone | Select the Availability Zone of the EC2 instance as noted in the previous step. |
  | Restore role | Select Default role. |

- This takes you to the Restore jobs screen. The restore job will appear under Restore jobs in the AWS Backup console. Once the job status shows as Completed, note the volume ID of the restored volume.
- Navigate to the Amazon EC2 console and select Volumes under Elastic Block Store to see the restored EBS volume.
- Select the volume, click Actions -> Attach volume, select the correct Instance from the drop-down, and provide the Device name noted in the earlier step.
- The restored volume should now be attached to the EC2 instance. The researcher or admin should now be able to start the Workspace.
- Once the EC2 instance is in the Running state, the TRE admin should verify that a tag with name backupVolume and value true has been added to the newly restored volume.
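For admins who prefer the command line, the detach, attach, and tag steps above can be sketched with the AWS CLI. This is a minimal sketch and not part of the backup component: the run helper, the DRY_RUN switch, and the function names are assumptions introduced here for illustration, and the restore itself is still performed in the AWS Backup console as described above. By default the functions only print the commands so they can be reviewed before anything is changed.

```shell
#!/usr/bin/env bash
# Sketch: CLI equivalents of the detach / attach / tag steps.
# DRY_RUN and run() are illustrative helpers, not part of the TRE tooling.
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Detach the original volume (the instance must already be stopped).
# $1 = volume ID noted on the Storage tab
detach_original_volume() {
  run aws ec2 detach-volume --volume-id "$1"
}

# Attach the restored volume using the device name noted earlier.
# $1 = restored volume ID, $2 = instance ID, $3 = device name
attach_restored_volume() {
  run aws ec2 attach-volume --volume-id "$1" --instance-id "$2" --device "$3"
}

# Show the value of the backupVolume tag on a volume, if present.
show_backup_tag() {
  run aws ec2 describe-volumes --volume-ids "$1" \
    --query "Volumes[0].Tags[?Key=='backupVolume'].Value" --output text
}

# Add the backupVolume=true tag if it is missing from the restored volume.
tag_restored_volume() {
  run aws ec2 create-tags --resources "$1" --tags "Key=backupVolume,Value=true"
}
```

To execute the commands rather than print them, set DRY_RUN=0 before calling a function, and pass the volume ID, instance ID, and device name noted in the console steps.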
SageMaker notebook files are backed up to an Amazon S3 bucket. A prefix is created in the backup bucket for each workspace, and the files for each workspace are uploaded to its corresponding prefix.
The bucket has versioning enabled, so a user can restore a specific version of a file if required.
Time to restore: Approximately 15 minutes
A TRE user can follow these steps to restore files if the TRE administrator enabled self-service restore while installing the backup component.
- The TRE user or admin logs in to the SageMaker notebook. Keep the notebook instance name to hand.
- The TRE user or admin opens a Terminal using File -> New -> Terminal.
- Use the list-objects AWS CLI command to list the objects available to restore.
  Example:
      aws s3api list-objects --bucket sagemaker-backup-bucket-AWSACCOUNTNUMBER-eu-west-2 --prefix NOTEBOOK_NAME/
- Identify the object that you want to restore from the list. Use the get-object AWS CLI command to restore the file.
  Example:
      aws s3api get-object --bucket sagemaker-backup-bucket-AWSACCOUNTNUMBER-eu-west-2 --key NOTEBOOK_NAME/FILE_NAME FILE_NAME
  By default, the above command restores the latest version of the file. Optionally, the user can pass the --version-id parameter to restore a specific version of the file.
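To pick a value for --version-id, the stored versions of a file have to be listed first. The following is a minimal sketch under stated assumptions: the function names are hypothetical, BUCKET and NOTEBOOK_NAME are placeholders to be replaced with real values, and the notebook's IAM role is assumed to be allowed to call s3:ListBucketVersions (if it is not, the listing call will be denied).

```shell
#!/usr/bin/env bash
# Sketch: restore a specific version of a backed-up notebook file.
# BUCKET and NOTEBOOK_NAME are placeholders; set them to your own values.
BUCKET="${BUCKET:-sagemaker-backup-bucket-AWSACCOUNTNUMBER-eu-west-2}"
NOTEBOOK_NAME="${NOTEBOOK_NAME:-NOTEBOOK_NAME}"

# Build the S3 key of a file under the notebook's prefix.
# $1 = file path relative to the notebook prefix
object_key() {
  printf '%s/%s\n' "$NOTEBOOK_NAME" "$1"
}

# List the stored versions of one file: version ID, timestamp, latest flag.
list_versions() {
  aws s3api list-object-versions --bucket "$BUCKET" \
    --prefix "$(object_key "$1")" \
    --query 'Versions[].[VersionId,LastModified,IsLatest]' --output text
}

# Download one version of a file into the current directory.
# $1 = file path relative to the notebook prefix, $2 = version ID
restore_version() {
  aws s3api get-object --bucket "$BUCKET" --key "$(object_key "$1")" \
    --version-id "$2" "$(basename "$1")"
}
```

A typical flow is to call list_versions with the file name, choose a version ID from the output, and pass both to restore_version. The restored file overwrites any file of the same name in the current directory, so it may be safer to run it from an empty directory.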
Time to restore: Approximately 15 minutes
If the TRE admin has not enabled self-service restore, the restoration must be carried out by the TRE admin.
- Log in to the AWS Management Console of your TRE account with administrative privileges.
- Navigate to the Amazon SageMaker console. Click the name of the notebook on which files need to be restored.
- Under Permissions and encryption, click the IAM role ARN.
- This redirects to the IAM console. Click Add permissions -> Create inline policy.
- Switch to the JSON view and paste the policy below to grant permissions to restore files from the S3 bucket. Replace the BACKUP_BUCKET and NOTEBOOK_NAME placeholders with the actual values. Because s3:ListBucket is a bucket-level action, it is granted on the bucket ARN and limited to the notebook's prefix by the condition.

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": ["arn:aws:s3:::BACKUP_BUCKET"],
            "Condition": {
              "StringLike": { "s3:prefix": ["NOTEBOOK_NAME/*"] }
            }
          },
          {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:GetObjectVersion"],
            "Resource": ["arn:aws:s3:::BACKUP_BUCKET/NOTEBOOK_NAME/*"]
          }
        ]
      }

- Click Review policy. Provide the policy name sagemaker-restore-policy-for-NOTEBOOK_NAME, replacing NOTEBOOK_NAME with the actual value. Click Create policy.
The steps above grant the IAM permissions needed to perform the restore. Follow the self-service steps to restore the necessary files.
Once the restoration is complete, the TRE admin should delete the inline policy added earlier. Failure to do so will cause errors when terminating the SageMaker notebook workspace, because CloudFormation will not be able to delete the IAM role associated with the notebook.
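The cleanup can also be done from the command line. The sketch below assumes the inline policy was named as described in the steps above; the function names are hypothetical, and the role name argument is the name (not the ARN) of the IAM role shown on the notebook's Permissions and encryption panel.

```shell
#!/usr/bin/env bash
# Sketch: remove the temporary restore policy from the notebook's IAM role.

# Derive the inline policy name used in the steps above.
# $1 = notebook name
restore_policy_name() {
  printf 'sagemaker-restore-policy-for-%s\n' "$1"
}

# Delete the inline policy from the role.
# $1 = IAM role name, $2 = notebook name
delete_restore_policy() {
  aws iam delete-role-policy --role-name "$1" \
    --policy-name "$(restore_policy_name "$2")"
}

# List the inline policies left on the role to confirm the deletion.
# $1 = IAM role name
list_inline_policies() {
  aws iam list-role-policies --role-name "$1" \
    --query 'PolicyNames' --output text
}
```

After delete_restore_policy succeeds, list_inline_policies should no longer show the restore policy for that notebook.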