Feature/epic 1 us 2 #64

Merged
merged 27 commits into from
May 23, 2022
b5eaf1a
Merge pull request #62 from HicResearch/main
awskaran May 19, 2022
e0fab2e
Updated Arch documentation
awskaran May 19, 2022
9668f50
Design considerations updated
awskaran May 19, 2022
d45de1c
Costs for TRE
awskaran May 19, 2022
c7fd3ac
Update doc/architecture/Design-Considerations.md
awskaran May 23, 2022
e26b6cf
Update doc/architecture/Design-Considerations.md
awskaran May 23, 2022
2721da4
Update doc/architecture/Design-Considerations.md
awskaran May 23, 2022
508316c
Update doc/architecture/Design-Considerations.md
awskaran May 23, 2022
5a4daf0
Update doc/architecture/Design-Considerations.md
awskaran May 23, 2022
b05c68f
Update doc/architecture/Architecture.md
awskaran May 23, 2022
d9ddee1
Update doc/architecture/Architecture.md
awskaran May 23, 2022
a3221ab
Update doc/architecture/Architecture.md
awskaran May 23, 2022
8ab1cd7
Update doc/architecture/Architecture.md
awskaran May 23, 2022
5f07fad
Update doc/architecture/Design-Considerations.md
awskaran May 23, 2022
7781be2
Update doc/architecture/Cost.md
awskaran May 23, 2022
072772d
Update Cost.md
awskaran May 23, 2022
44d06e6
Update Architecture.md
awskaran May 23, 2022
3a70647
Update doc/architecture/Architecture.md
awskaran May 23, 2022
27a4657
Update doc/architecture/Architecture.md
awskaran May 23, 2022
008b286
Update doc/architecture/Architecture.md
awskaran May 23, 2022
d5cfc3a
Update doc/architecture/Architecture.md
awskaran May 23, 2022
90e05ba
Update doc/architecture/Architecture.md
awskaran May 23, 2022
ed0f9d3
Update doc/architecture/Architecture.md
awskaran May 23, 2022
7da1e31
Apply suggestions from code review
awskaran May 23, 2022
67ec5eb
Update Architecture.md
awskaran May 23, 2022
94e56cf
Updated Arch diagram
awskaran May 23, 2022
1fb52b9
update diagram for org structure
awskaran May 23, 2022
70 changes: 52 additions & 18 deletions doc/architecture/Architecture.md
Expand Up @@ -2,18 +2,12 @@

---

## TREEHOOSE (TRE)
This document explains the high level architecture of
the Trusted Research Environment that would be deployed
on AWS Cloud following the installation
steps in this repository.

---

TREEHOOSE is the Trusted Research Environment (TRE) implementation
that will be deployed for each research project.
Deploying the solution with the **default parameters**
builds the following environment in the AWS Cloud.

![TREEHOOSE Architecture](../../res/images/TREEHOOSE-architecture.png)

### Overview
## Overview

---

Expand All @@ -31,11 +25,51 @@ provides optional add-on components to enable
- Workspace backups
- Budget controls

### Solution Overview
TREEHOOSE is the Trusted Research Environment (TRE) implementation
that will be deployed for each research project.
Deploying the solution with the **default parameters**
builds the following environment in AWS Cloud.

![TREEHOOSE Architecture](../../res/images/TREEHOOSE-architecture.png)

The solution uses Infrastructure as Code for deployment.
Later sections of this document describe each component in more detail. Below is a brief explanation
of the numbered legends in the diagram.

1. TRE Data Managers use the AWS Management Console to upload
data to the TRE Data Lake to be used for research.
1. IT Administrators use the Service Workbench web application
to administer resources in the TRE environment.
1. The budgets add-on is used to set budget limits for the TRE
environment. IT Administrators can set the budget and any actions
to be taken when the budget thresholds are breached.
1. Backups can also be
enabled for researcher workspaces. IT Administrators can monitor
these through AWS Backup.
1. Data Managers can give researchers access to relevant
data sets from the data lake.
1. Researchers can create approved workspaces through the Service Workbench web application.
They get secure access to compute resources using
Amazon AppStream. Details of connecting to workspaces are available within the web application.
1. On research completion, the researcher can request egress of
research results.
1. The egress request is processed through an add-on app
with a review process involving multiple approvers
before the data is available to be downloaded.
1. Approved egress requests result in the creation of a
downloadable version of the data that the Data Egress Manager
can share with the Researcher.
1. Audit & Compliance teams get full visibility into all
user activities resulting in AWS API calls through centralised
CloudTrail logs. Additionally, they get break-glass
access to all TRE projects/accounts in the TRE through
a Lambda function role in the Audit account.
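
The egress steps in the list above (request, review by multiple approvers, download only after approval) can be pictured as a small state model. This is a minimal sketch for illustration only: the class, field, and approver names are hypothetical and are not taken from the egress add-on's actual implementation.

```python
from dataclasses import dataclass, field
from enum import Enum


class EgressStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class EgressRequest:
    """Hypothetical model of one egress request (not the add-on's real schema)."""

    researcher: str
    required_approvers: frozenset
    approvals: set = field(default_factory=set)
    rejected: bool = False

    def _check(self, approver: str) -> None:
        # Only the named approvers may act on the request.
        if approver not in self.required_approvers:
            raise PermissionError(f"{approver} is not an approver for this request")

    def approve(self, approver: str) -> None:
        self._check(approver)
        self.approvals.add(approver)

    def reject(self, approver: str) -> None:
        self._check(approver)
        self.rejected = True

    @property
    def status(self) -> EgressStatus:
        # A single rejection blocks the request; approval needs every approver.
        if self.rejected:
            return EgressStatus.REJECTED
        if self.approvals >= self.required_approvers:
            return EgressStatus.APPROVED
        return EgressStatus.PENDING

    def can_download(self) -> bool:
        # Data becomes downloadable only once the request is fully approved.
        return self.status is EgressStatus.APPROVED


request = EgressRequest("researcher_1", frozenset({"approver_a", "approver_b"}))
request.approve("approver_a")
print(request.status.value, request.can_download())
request.approve("approver_b")
print(request.status.value, request.can_download())
```

The point of the sketch is the ordering constraint: one approval leaves the request pending, and the download step is reachable only after every required approver has signed off.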

## Component Overview

---

#### *AWS Control Tower*
### *AWS Control Tower*

---

Expand All @@ -53,7 +87,7 @@ that will be setup by using the TREEHOOSE solution.

![Multi-account structure](../../res/images/multi-account-setup.png)

#### *Service Workbench on AWS Solution*
### *Service Workbench on AWS Solution*

---

Expand All @@ -73,7 +107,7 @@ Key Components :
(more services as desired; this is customisable by providing Service Catalog templates).
- For the secure access environment: Amazon AppStream 2.0

#### *Datalake*
### *Datalake*

---

Expand All @@ -90,7 +124,7 @@ Key Components :

- AWS Lake Formation

#### *Data Egress Application*
### *Data Egress Application*

---

Expand All @@ -117,7 +151,7 @@ Key Components :
- For the backend: AWS Step Functions, Amazon EFS,
AWS Lambda, Amazon DynamoDB, Amazon SES, Amazon S3, Amazon SNS, Amazon Cognito

#### *Workspace backup*
### *Workspace backup*

---

Expand Down Expand Up @@ -152,7 +186,7 @@ Key Components:
- For the backend: AWS Step Functions,
AWS Lambda, Amazon CloudWatch Events, AWS CloudFormation, AWS Backup, Amazon S3

#### *Budget controls*
### *Budget controls*

---

93 changes: 92 additions & 1 deletion doc/architecture/Cost.md
Expand Up @@ -4,9 +4,100 @@

You are responsible for the cost of the AWS services used to run this solution.
As of January 2022, the cost for running this solution with the default settings
in the EU West (Ireland) AWS Region is approximately $X for TRE account with all add-ons.
in the EU West (Ireland) AWS Region is approximately **$30** per month for a TRE account with all add-ons.
Review comment (Contributor):

"EU West 1", do you think there will be a difference in costs between Ireland and London (since we're using London everywhere in this repo)?

Prices are subject to change.
For full details, see the pricing page for each AWS service used in this solution.

> **_NOTE:_** Many AWS Services include a Free Tier – a baseline amount of the service that customers can use at no charge.
> Actual costs may be more or less than the pricing examples provided.

The baseline cost covers only the deployed infrastructure, before any workspaces or users are added.
As the solution is based on a serverless architecture, you pay only
for what you use, when you use it.

The following factors contribute to incremental costs for an actively used deployment or TRE account:

- Compute resources used by researchers in the form
of EC2 instances
- Volume of data stored in S3 buckets in the Data Lake account
- AppStream resources used by researchers to interact
with their research workspace
- Volume of backup data stored by AWS Backup

The cost of using and maintaining an AWS Control Tower
environment is described on the [AWS Control Tower pricing page](https://aws.amazon.com/controltower/pricing/).

The best way to estimate the cost of this solution
is to enter your expected usage into the [AWS Pricing Calculator](https://calculator.aws/#/).

## Example cost table

---

The following table provides an example cost breakdown for deploying this
solution with the default settings in EU West (Ireland) AWS Region.

### Base Installation

An installation of TRE without any workspaces and users.

|AWS Service|Monthly cost|
|----|----|
|Networking services|$11|
|KMS|$6|
|Config|$4|
|CloudTrail|$3.5|
|EC2-other|$1.5|
|DynamoDB|$6|
|Service Catalog|$1|
|Step Functions|$0.09|
|Lambdas|$0.003|
|CloudFront|$0.0002|
|CloudWatch|$0.0003|
|Total|$33.0935|
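
As a quick arithmetic check, the line items in the table above can be summed directly. The sketch below copies the figures from the table; the dictionary keys are just labels, and `Decimal` is used so that the very small amounts are not distorted by floating-point rounding.

```python
from decimal import Decimal

# Monthly base-installation costs in USD, copied from the table above.
base_costs = {
    "Networking services": Decimal("11"),
    "KMS": Decimal("6"),
    "Config": Decimal("4"),
    "CloudTrail": Decimal("3.5"),
    "EC2-other": Decimal("1.5"),
    "DynamoDB": Decimal("6"),
    "Service Catalog": Decimal("1"),
    "Step Functions": Decimal("0.09"),
    "Lambdas": Decimal("0.003"),
    "CloudFront": Decimal("0.0002"),
    "CloudWatch": Decimal("0.0003"),
}

total = sum(base_costs.values(), Decimal("0"))
print(total)  # 33.0935
```

The sum matches the Total row of the table.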

### EC2 Usage

The examples below are based on On-Demand pricing.
A researcher uses a workspace for 730 hours (one month of continuous use).

Example - 1

|AWS Service|Monthly cost|
|----|----|
|EC2 - t3.large|$66.58|
|EBS - 10GB|$1.10|
|Total|$67.68|

Example - 2

|AWS Service|Monthly cost|
|----|----|
|EC2 - m6g.8xlarge|$1,004.48|
|EBS - 80GB|$8.80|
|Total|$1,013.28|
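
The two EC2 examples follow the same formula: hourly instance rate times hours used, plus EBS volume size times a per-GB monthly rate. The sketch below reproduces the table figures under stated assumptions: the hourly rates ($0.0912 and $1.376) and the EBS rate ($0.11 per GB-month) are not given in the source; they are back-calculated from the table by dividing by 730 hours and by the volume size, and real prices vary by Region and over time. The function name is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
HOURS_PER_MONTH = 730  # usage assumed in the examples above


def monthly_workspace_cost(hourly_rate, ebs_gb, ebs_rate_per_gb, hours=HOURS_PER_MONTH):
    """Return compute, storage and total monthly cost in USD, rounded to cents."""
    compute = (Decimal(hourly_rate) * hours).quantize(CENT, rounding=ROUND_HALF_UP)
    storage = (Decimal(ebs_rate_per_gb) * ebs_gb).quantize(CENT, rounding=ROUND_HALF_UP)
    return {"compute": compute, "storage": storage, "total": compute + storage}


# Rates below are assumptions derived from the table, not quoted AWS prices.
example_1 = monthly_workspace_cost("0.0912", ebs_gb=10, ebs_rate_per_gb="0.11")
example_2 = monthly_workspace_cost("1.376", ebs_gb=80, ebs_rate_per_gb="0.11")

print(example_1["total"])  # 67.68
print(example_2["total"])  # 1013.28
```

Passing a smaller `hours` value shows how much a workspace costs when it is stopped outside working hours, which is usually the largest lever on the compute line.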

### SageMaker Usage

A researcher uses a SageMaker notebook
for 730 hours on a project.

Example

|AWS Service|Monthly cost|
|----|----|
|SageMaker - notebook - ml.c5.large|$82.80|
|Total|$82.80|

### S3 storage

A researcher works on a 1 TB data study
and produces a 10 GB output to download.

Example

|AWS Service|Monthly cost|
|----|----|
|S3 - study data|$23.58|
|Data Egress|$0.90|
|Total|$24.48|

All cost examples provided above are indicative.
27 changes: 25 additions & 2 deletions doc/architecture/Design-Considerations.md
Expand Up @@ -8,8 +8,8 @@ Below are the key design considerations for TREEHOOSE

---

- All the core infrastructure is deployed using IaC (Infrastructure as Code).
- Maximised the use of AWS Serverless services for ease of operability and scalability.
- All the core infrastructure is deployed using [IaC (Infrastructure as Code)](https://docs.aws.amazon.com/whitepapers/latest/introduction-devops-aws/infrastructure-as-code.html).
- The solution is based on Serverless Architecture for ease of operability and scalability.

## Audit

Expand All @@ -19,6 +19,8 @@ Below are the key design considerations for TREEHOOSE
and the logs centralised for Auditing.
- [AWS Config](https://aws.amazon.com/config/) is enabled in all AWS accounts
and the config records centralised for Auditing.
- [Amazon CloudWatch](https://aws.amazon.com/cloudwatch/) is used for
log aggregation and metrics for each TRE project/AWS account.

## Security

Expand All @@ -28,3 +30,24 @@ Below are the key design considerations for TREEHOOSE
- Encryption in-transit is enabled for all AWS services where applicable
and also enabled for all API calls.
- All [AWS IAM](https://aws.amazon.com/iam/) policies follow the principle of least privilege.
- An [AWS account](https://aws.amazon.com/account/) provides a true billing and security boundary.
Hence each research project should be hosted in a separate AWS account.
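
To make the least-privilege point in the list above concrete, here is a minimal sketch of a read-only IAM policy scoped to a single study prefix in one S3 bucket. It is an illustration, not a policy taken from TREEHOOSE: the function name, bucket name, and prefix are hypothetical, and the solution's own components may grant study access differently.

```python
import json


def read_only_study_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy document allowing read-only access to one prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is limited to keys under the study prefix.
                "Sid": "ListStudyPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Objects can be read but not written or deleted.
                "Sid": "ReadStudyObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


# Hypothetical bucket and prefix names, for illustration only.
policy = read_only_study_policy("example-datalake-bucket", "studies/study-001")
print(json.dumps(policy, indent=2))
```

The policy grants only the two actions a reader needs and names the exact resources, rather than using wildcards such as `s3:*` or `*`.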

## Considerations for End Users

---

These are some additional decisions that the end user of
TREEHOOSE should make based on their functional and
non-functional requirements.

- Centralise and enable AWS security services such as:
  - [AWS Security Hub](https://aws.amazon.com/security-hub/)
  - [Amazon GuardDuty](https://aws.amazon.com/guardduty/)
  - [Amazon Macie](https://aws.amazon.com/macie/)
  - [AWS IAM Access Analyzer](https://docs.aws.amazon.com/IAM/latest/UserGuide/what-is-access-analyzer.html)

- Enable [AWS WAF (Web Application Firewall)](https://aws.amazon.com/waf/) for web applications.
- Enable additional [Control Tower Guardrails](https://docs.aws.amazon.com/controltower/latest/userguide/guardrails.html).
- Use [Amazon EC2 Reserved Instances](https://aws.amazon.com/ec2/pricing/reserved-instances/).
- [Optimize](https://docs.aws.amazon.com/whitepapers/latest/best-practices-for-deploying-amazon-appstream-2/cost-optimization.html) how you use AppStream.
Binary file modified res/images/TREEHOOSE-architecture.png