Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
ramadu
committed
Dec 15, 2022
1 parent
6b913a7
commit 7fbdf5b
Showing 86 changed files with 4,346 additions and 8 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
.PHONY: install-cdk-requirements cdk-list cdk-diff cdk-deploy-infra cdk-deploy-to-bucket cdk-setup-eks-role help | ||
|
||
install-cdk-requirements: ## install the python dependencies needed to run cdk IaC commands | ||
@pip install -r infra/cdk/requirements.txt | ||
|
||
cdk-list: ## list all the stacks. due to SDK dependencies, this fails if run prior to S3 bucket creation | ||
@$(MAKE) -C infra/cdk list | ||
|
||
cdk-diff: ## list the local changes in cdk compared to the previously installed infrastructure | ||
@$(MAKE) -C infra/cdk diff | ||
|
||
cdk-deploy-infra: ## deploy the infrastructure using CDK (runs the infra target with S3_FLAG=False) | ||
@S3_FLAG=False $(MAKE) -C infra/cdk infra | ||
|
||
cdk-deploy-to-bucket: ## run the s3-deploy target in infra/cdk to deploy to the S3 bucket | ||
@$(MAKE) -C infra/cdk s3-deploy | ||
|
||
cdk-setup-eks-role: ## setup the infrastructure dependencies for EKS cluster (eg: IAM Role) | ||
@$(MAKE) -C infra/cdk eks-role | ||
|
||
help: | ||
@grep -E '^[a-zA-Z_-]+:.*?#.*$$' $(MAKEFILE_LIST) | sort | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}' |
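The `help` target above pairs `grep` with `awk` to print every target that carries a `## ` comment. As a rough illustration of that convention, here is a minimal Python sketch. The function name `list_documented_targets` and the sample Makefile text are hypothetical, and the regular expression only approximates the `grep`/`awk` pair (it requires `## `, whereas the `grep` filter accepts any `#`).

```python
import re

# Approximates the help recipe: a target name, a colon, then a "## " description.
HELP_LINE = re.compile(r"^([a-zA-Z_-]+):.*?## (.*)$")


def list_documented_targets(makefile_text):
    """Return sorted (target, description) pairs for targets with a '## ' comment."""
    entries = []
    for line in makefile_text.splitlines():
        match = HELP_LINE.match(line)
        if match:
            entries.append((match.group(1), match.group(2).strip()))
    return sorted(entries)


# Hypothetical Makefile content, used only to exercise the parser.
SAMPLE = """\
documented-target: ## explain what this target does
undocumented-target:
\t@echo "recipe lines start with a tab and are ignored"
"""

for target, description in list_documented_targets(SAMPLE):
    # Same column layout as the awk printf, minus the ANSI colour codes.
    print(f"{target:<30} {description}")
```

A target without a `## ` comment is skipped by this sketch, so a description has to be present for a target to show up in the listing.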
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
# mwaa-blueprints | ||
|
||
## Description | ||
|
||
This is a collection of getting-started blueprints for using Amazon Managed Workflows for Apache Airflow (MWAA). Below | ||
is the high-level structure and the key files: | ||
|
||
```sh | ||
├── examples | ||
│ ├── AWSGlue | ||
│ │ ├── README.md | ||
│ │ ├── dags | ||
│ │ ├── infra | ||
│ │ └── scripts | ||
│ ├── EKS | ||
│ │ ├── dags | ||
│ │ ├── requirements.txt | ||
│ │ └── infra | ||
│ ├── EMR | ||
│ │ ├── dags | ||
│ │ └── spark | ||
│ ├── EMR_on_EKS | ||
│ │ ├── infra | ||
│ │ ├── dags | ||
│ │ └── spark | ||
│ ├── Lambda | ||
│ │ ├── dags | ||
│ │ └── image | ||
└── infra | ||
├── cdk | ||
├── cloudformation | ||
└── terraform | ||
``` | ||
|
||
### Folder Structure Details | ||
|
||
- **README.md:** This file, with instructions on how to use the blueprints. | ||
|
||
- **Makefile:** A collection of make targets that run the various commands to set up infrastructure. To get detailed | ||
information about the make targets, run ```make help``` from the root folder. | ||
|
||
- **examples:** This folder has a collection of technology-specific DAGs organized into subfolders. Review the | ||
subfolders for details. | ||
|
||
- **infra:** This folder has the infrastructure setup needed to run the examples. The infrastructure is defined using | ||
```cloudformation```, ```cdk``` and ```terraform```. | ||
|
||
## Badges | ||
|
||
## Installation | ||
|
||
### CDK | ||
This example creates an MWAA environment and includes the DAGs that create an EKS cluster. | ||
To set up the environment and run the examples, see [cdk](examples/EKS/README.md). | ||
|
||
### Terraform | ||
|
||
Access [terraform](infra/terraform/README.md) | ||
|
||
#### Examples | ||
|
||
Access [Examples](examples/) | ||
|
||
## Support | ||
|
||
Tell people where they can go to for help. It can be any combination of an issue tracker, a chat room, an email address, | ||
etc. | ||
|
||
## Roadmap | ||
|
||
If you have ideas for releases in the future, it is a good idea to list them in the README. | ||
|
||
## Contributing | ||
|
||
State if you are open to contributions and what your requirements are for accepting them. | ||
|
||
For people who want to make changes to your project, it's helpful to have some documentation on how to get started. | ||
Perhaps there is a script that they should run or some environment variables that they need to set. Make these steps | ||
explicit. These instructions could also be useful to your future self. | ||
|
||
You can also document commands to lint the code or run tests. These steps help to ensure high code quality and reduce | ||
the likelihood that the changes inadvertently break something. Having instructions for running tests is especially | ||
helpful if it requires external setup, such as starting a Selenium server for testing in a browser. | ||
|
||
## Authors and acknowledgment | ||
|
||
Show your appreciation to those who have contributed to the project. | ||
|
||
## License | ||
|
||
For open source projects, say how it is licensed. | ||
|
||
## Project status | ||
|
||
If you have run out of energy or time for your project, put a note at the top of the README saying that development has | ||
slowed down or stopped completely. Someone may choose to fork your project or volunteer to step in as a maintainer or | ||
owner, allowing your project to keep going. You can also make an explicit request for maintainers. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
|
||
.PHONY: deploy post-provision undeploy | ||
deploy: ## terraform | ||
terraform -chdir="./infra/terraform" init | ||
terraform -chdir="./infra/terraform" plan | ||
terraform -chdir="./infra/terraform" apply | ||
$(MAKE) post-provision | ||
|
||
post-provision: | ||
chmod 700 ./post_provision.sh | ||
./post_provision.sh $(mwaa_bucket) $(mwaa_execution_role_name) $(mwaa_env_name) | ||
|
||
undeploy: | ||
chmod 700 ./pre_termination.sh | ||
./pre_termination.sh $(mwaa_bucket) $(mwaa_execution_role_name) | ||
terraform -chdir="./infra/terraform" destroy | ||
|
||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,94 @@ | ||
# Glue with MWAA | ||
|
||
This example is a quick start for orchestrating an AWS Glue crawler and an AWS Glue job with MWAA. | ||
The example uses [NOAA Climatology data](https://docs.opendata.aws/noaa-ghcn-pds/readme.html). | ||
|
||
## Prerequisites: | ||
|
||
Ensure that you have installed the following tools on your machine. | ||
|
||
1. [aws cli](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) | ||
2. [terraform](https://learn.hashicorp.com/tutorials/terraform/install-cli) | ||
3. [Amazon MWAA](https://aws.amazon.com/managed-workflows-for-apache-airflow/) | ||
|
||
|
||
_Note: If you do not have a running MWAA environment, deploy one from the root of the project using terraform or CDK._ | ||
|
||
## Deploy the AWS Glue example | ||
|
||
Clone the repository | ||
|
||
```sh | ||
git clone https://github.com/aws-samples/amazon-mwaa-examples.git | ||
|
||
``` | ||
|
||
Navigate into the example directory and run `make`, passing the MWAA environment arguments: | ||
|
||
```sh | ||
cd blueprints/examples/AWSGlue | ||
make deploy mwaa_bucket={MWAA_BUCKET} mwaa_execution_role_name={MWAA_EXEC_ROLE} mwaa_env_name={MWAA_ENV_NAME} | ||
``` | ||
|
||
## Login to MWAA | ||
|
||
Log in to your Amazon MWAA environment. You should see a DAG named 'emr_eks_weatherstation_job'. | ||
|
||
Unpause the DAG and run it from the console. | ||
|
||
## What does the makefile do? | ||
1. Creates the infrastructure | ||
- IAM service role for AWS Glue, and an IAM policy with AWS Glue permissions that will be attached to the MWAA execution role | ||
- S3 buckets for Spark scripts and data | ||
2. Attaches the AWS Glue access permissions (IAM policy) to the MWAA execution role | ||
3. Copies the DAGs and scripts to the S3 buckets | ||
4. Updates the MWAA environment with the variables needed for the DAGs | ||
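Steps 2 and 3 of the list above map onto two AWS CLI commands, `aws iam attach-role-policy` and `aws s3 cp --recursive`. The following dry-run sketch only builds and prints those commands; it does not reproduce the repository's `post_provision.sh`, whose contents are not shown here. The function name `plan_post_provision`, the `policy_arn` and `scripts_bucket` parameters, the local directory names and the S3 key prefixes are all assumptions made for illustration. Step 4 is left out because the source does not say how the variables are set.

```python
import shlex


def plan_post_provision(mwaa_bucket, mwaa_execution_role_name, policy_arn,
                        scripts_bucket, dags_dir="dags", scripts_dir="scripts"):
    """Build (but do not run) AWS CLI commands for steps 2 and 3.

    All parameter names other than mwaa_bucket and mwaa_execution_role_name
    are hypothetical; adjust them to the real provisioning script.
    """
    return [
        # Step 2: attach the Glue access policy to the MWAA execution role.
        ["aws", "iam", "attach-role-policy",
         "--role-name", mwaa_execution_role_name,
         "--policy-arn", policy_arn],
        # Step 3: copy DAGs to the MWAA bucket and scripts to the scripts bucket.
        ["aws", "s3", "cp", dags_dir, f"s3://{mwaa_bucket}/dags/", "--recursive"],
        ["aws", "s3", "cp", scripts_dir, f"s3://{scripts_bucket}/scripts/", "--recursive"],
    ]


# Placeholder values only; nothing is executed against AWS.
commands = plan_post_provision(
    mwaa_bucket="example-mwaa-bucket",
    mwaa_execution_role_name="example-mwaa-exec-role",
    policy_arn="arn:aws:iam::123456789012:policy/example-glue-access",
    scripts_bucket="example-scripts-bucket",
)
for command in commands:
    print(shlex.join(command))
```

Printing the commands before running anything makes it easy to review exactly which role and buckets would be touched.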
|
||
## What's needed for MWAA to access AWS Glue | ||
|
||
- The MWAA execution role needs permissions to create and run the AWS Glue crawler and job. | ||
|
||
```json | ||
{ | ||
"Statement": [ | ||
{ | ||
"Action": [ | ||
"glue:CreateJob", | ||
"glue:ListCrawlers", | ||
"glue:ListJobs", | ||
"glue:CreateCrawler", | ||
"glue:GetCrawlerMetrics", | ||
"glue:GetCrawler", | ||
"glue:StartCrawler", | ||
"glue:UpdateCrawler", | ||
"glue:StartJobRun", | ||
"glue:GetJobRun", | ||
"glue:UpdateJob", | ||
"glue:GetJob" | ||
], | ||
"Effect": "Allow", | ||
"Resource": "*", | ||
"Sid": "Glue" | ||
}, | ||
{ | ||
"Action": [ | ||
"iam:PassRole", | ||
"iam:GetRole" | ||
], | ||
"Effect": "Allow", | ||
"Resource": [ | ||
"arn:aws:iam::{account}:role/{glue_service_role}" | ||
], | ||
"Sid": "Gluepassrole" | ||
} | ||
], | ||
"Version": "2012-10-17" | ||
} | ||
``` | ||
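Hand-edited policy JSON is easy to break with a missing or trailing comma. One way to avoid that is to generate the document from a Python dictionary and let `json.dumps` handle the syntax. This is a minimal sketch under that assumption; the function name `build_glue_policy` and the account and role values are placeholders, and the action list mirrors the policy shown above.

```python
import json

# Glue actions taken from the policy in this README.
GLUE_ACTIONS = [
    "glue:CreateJob",
    "glue:ListCrawlers",
    "glue:ListJobs",
    "glue:CreateCrawler",
    "glue:GetCrawlerMetrics",
    "glue:GetCrawler",
    "glue:StartCrawler",
    "glue:UpdateCrawler",
    "glue:StartJobRun",
    "glue:GetJobRun",
    "glue:UpdateJob",
    "glue:GetJob",
]


def build_glue_policy(account, glue_service_role):
    """Return the policy as a dict, filling in the two placeholders."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": "Glue", "Effect": "Allow",
             "Action": GLUE_ACTIONS, "Resource": "*"},
            {"Sid": "Gluepassrole", "Effect": "Allow",
             "Action": ["iam:PassRole", "iam:GetRole"],
             "Resource": [f"arn:aws:iam::{account}:role/{glue_service_role}"]},
        ],
    }


# Placeholder account ID and role name, for illustration only.
policy = build_glue_policy("123456789012", "example-glue-service-role")
print(json.dumps(policy, indent=2))
```

Because the output is produced by a serializer, it always parses as JSON, which the IAM API requires of a policy document.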
|
||
## Clean up | ||
```sh | ||
cd blueprints/examples/AWSGlue | ||
make undeploy mwaa_bucket={MWAA_BUCKET} mwaa_execution_role_name={MWAA_EXEC_ROLE} mwaa_env_name={MWAA_ENV_NAME} | ||
``` | ||
- Log in to the AWS account and delete the AWS Glue tables starting with `year_`, the AWS Glue crawler named `noaa-weather-station-data`, and the AWS Glue job `noaa_weatherdata_transform`. |