feat: Upgrade Airflow 2.7.1 (#440)
jagpk authored Feb 21, 2024
1 parent b16c4c5 commit 72b62ac
Showing 9 changed files with 78 additions and 400 deletions.
6 changes: 4 additions & 2 deletions schedulers/terraform/self-managed-airflow/README.md
@@ -33,7 +33,7 @@ Checkout the [documentation website](https://awslabs.github.io/data-on-eks/docs/
| <a name="module_airflow_s3_bucket"></a> [airflow\_s3\_bucket](#module\_airflow\_s3\_bucket) | terraform-aws-modules/s3-bucket/aws | ~> 3.0 |
| <a name="module_amp_ingest_irsa"></a> [amp\_ingest\_irsa](#module\_amp\_ingest\_irsa) | aws-ia/eks-blueprints-addon/aws | ~> 1.0 |
| <a name="module_db"></a> [db](#module\_db) | terraform-aws-modules/rds/aws | ~> 5.0 |
| <a name="module_ebs_csi_driver_irsa"></a> [ebs\_csi\_driver\_irsa](#module\_ebs\_csi\_driver\_irsa) | terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks | ~> 5.20 |
| <a name="module_ebs_csi_driver_irsa"></a> [ebs\_csi\_driver\_irsa](#module\_ebs\_csi\_driver\_irsa) | terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks | ~> 5.34 |
| <a name="module_eks"></a> [eks](#module\_eks) | terraform-aws-modules/eks/aws | ~> 19.15 |
| <a name="module_eks_blueprints_addons"></a> [eks\_blueprints\_addons](#module\_eks\_blueprints\_addons) | aws-ia/eks-blueprints-addons/aws | ~> 1.2 |
| <a name="module_eks_data_addons"></a> [eks\_data\_addons](#module\_eks\_data\_addons) | aws-ia/eks-data-addons/aws | ~> 1.2.9 |
@@ -52,6 +52,7 @@ Checkout the [documentation website](https://awslabs.github.io/data-on-eks/docs/
| [aws_iam_policy.airflow_scheduler](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_iam_policy.airflow_webserver](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_iam_policy.airflow_worker](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_iam_policy.fluentbit](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_iam_policy.grafana](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_iam_policy.spark](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_prometheus_workspace.amp](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/prometheus_workspace) | resource |
@@ -87,6 +88,7 @@ Checkout the [documentation website](https://awslabs.github.io/data-on-eks/docs/
| [aws_ecrpublic_authorization_token.token](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/ecrpublic_authorization_token) | data source |
| [aws_eks_cluster_auth.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/eks_cluster_auth) | data source |
| [aws_iam_policy_document.airflow_s3_logs](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_policy_document.fluent_bit](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_policy_document.grafana](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_policy_document.spark_operator](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_partition.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/partition) | data source |
@@ -98,7 +100,7 @@ Checkout the [documentation website](https://awslabs.github.io/data-on-eks/docs/
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_db_private_subnets"></a> [db\_private\_subnets](#input\_db\_private\_subnets) | Private Subnets CIDRs. 254 IPs per Subnet/AZ for Airflow DB. | `list(string)` | <pre>[<br> "10.0.20.0/26",<br> "10.0.21.0/26"<br>]</pre> | no |
| <a name="input_eks_cluster_version"></a> [eks\_cluster\_version](#input\_eks\_cluster\_version) | EKS Cluster version | `string` | `"1.26"` | no |
| <a name="input_eks_cluster_version"></a> [eks\_cluster\_version](#input\_eks\_cluster\_version) | EKS Cluster version | `string` | `"1.29"` | no |
| <a name="input_eks_data_plane_subnet_secondary_cidr"></a> [eks\_data\_plane\_subnet\_secondary\_cidr](#input\_eks\_data\_plane\_subnet\_secondary\_cidr) | Secondary CIDR blocks. 32766 IPs per Subnet per Subnet/AZ for EKS Node and Pods | `list(string)` | <pre>[<br> "100.64.0.0/17",<br> "100.64.128.0/17"<br>]</pre> | no |
| <a name="input_enable_airflow"></a> [enable\_airflow](#input\_enable\_airflow) | Enable Apache Airflow | `bool` | `true` | no |
| <a name="input_enable_airflow_spark_example"></a> [enable\_airflow\_spark\_example](#input\_enable\_airflow\_spark\_example) | Enable Apache Airflow and Spark Operator example | `bool` | `false` | no |
86 changes: 49 additions & 37 deletions schedulers/terraform/self-managed-airflow/addons.tf
@@ -3,7 +3,7 @@
#---------------------------------------------------------------
module "ebs_csi_driver_irsa" {
source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
version = "~> 5.20"
version = "~> 5.34"
role_name_prefix = format("%s-%s-", local.name, "ebs-csi-driver")
attach_ebs_csi_policy = true
oidc_providers = {
@@ -117,16 +117,16 @@ module "eks_blueprints_addons" {
#---------------------------------------
enable_aws_for_fluentbit = true
aws_for_fluentbit_cw_log_group = {
create = true
use_name_prefix = false
name = "/${local.name}/aws-fluentbit-logs" # Add-on creates this log group
retention_in_days = 30
}
# Additional IRSA policies for FluentBit add-on to access AWS services(e.g., CW Logs, S3 etc.)
aws_for_fluentbit = {
s3_bucket_arns = [
module.fluentbit_s3_bucket.s3_bucket_arn,
"${module.fluentbit_s3_bucket.s3_bucket_arn}/*}"
]
create_namespace = true
namespace = "aws-for-fluentbit"
create_role = true
role_policies = { "policy1" = aws_iam_policy.fluentbit.arn }
values = [templatefile("${path.module}/helm-values/aws-for-fluentbit-values.yaml", {
region = local.region,
cloudwatch_log_group = "/${local.name}/aws-fluentbit-logs"
@@ -155,7 +155,7 @@ module "eks_blueprints_addons" {
amp_url = "https://aps-workspaces.${local.region}.amazonaws.com/workspaces/${aws_prometheus_workspace.amp[0].id}"
}) : templatefile("${path.module}/helm-values/kube-prometheus.yaml", {})
]
chart_version = "48.1.1"
chart_version = "48.2.3"
set_sensitive = [
{
name = "grafana.adminPassword"
@@ -167,6 +167,7 @@ module "eks_blueprints_addons" {
tags = local.tags
}


#---------------------------------------------------------------
# Data on EKS Kubernetes Addons
#---------------------------------------------------------------
@@ -181,47 +182,23 @@ module "eks_data_addons" {
#---------------------------------------------------------------
enable_airflow = true
airflow_helm_config = {
airflow_namespace = try(kubernetes_namespace_v1.airflow[0].metadata[0].name, local.airflow_namespace)

namespace = try(kubernetes_namespace_v1.airflow[0].metadata[0].name, local.airflow_namespace)
version = "1.11.0"
values = [templatefile("${path.module}/helm-values/airflow-values.yaml", {
# Airflow Postgres RDS Config
airflow_version = local.airflow_version
airflow_db_user = local.airflow_name
airflow_db_pass = try(sensitive(aws_secretsmanager_secret_version.postgres[0].secret_string), "")
airflow_db_name = try(module.db[0].db_instance_name, "")
airflow_db_host = try(element(split(":", module.db[0].db_instance_endpoint), 0), "")
#Service Accounts
worker_service_account = try(kubernetes_service_account_v1.airflow_worker[0].metadata[0].name, local.airflow_workers_service_account)
scheduler_service_account = try(kubernetes_service_account_v1.airflow_scheduler[0].metadata[0].name, local.airflow_scheduler_service_account)
webserver_service_account = try(kubernetes_service_account_v1.airflow_webserver[0].metadata[0].name, local.airflow_webserver_service_account)
# S3 bucket config for Logs
s3_bucket_name = try(module.airflow_s3_bucket[0].s3_bucket_id, "")
webserver_secret_name = local.airflow_webserver_secret_name
efs_pvc = local.efs_pvc
})]
# Use only when Apache Airflow is enabled with `airflow-core.tf` resources
set = var.enable_amazon_prometheus ? [
{
name = "scheduler.serviceAccount.create"
value = false
},
{
name = "scheduler.serviceAccount.name"
value = try(kubernetes_service_account_v1.airflow_scheduler[0].metadata[0].name, local.airflow_scheduler_service_account)
},
{
name = "webserver.serviceAccount.create"
value = false
},
{
name = "webserver.serviceAccount.name"
value = try(kubernetes_service_account_v1.airflow_webserver[0].metadata[0].name, local.airflow_webserver_service_account)
},
{
name = "workers.serviceAccount.create"
value = false
},
{
name = "workers.serviceAccount.name"
value = try(kubernetes_service_account_v1.airflow_worker[0].metadata[0].name, local.airflow_workers_service_account)
}
] : []
}

#---------------------------------------------------------------
@@ -243,6 +220,11 @@ module "eks_data_addons" {
EOT
]
}

#---------------------------------------------------------------
# Enable Karpenter Resources for Spark team A
#---------------------------------------------------------------

enable_karpenter_resources = true
karpenter_resources_helm_config = {
spark-compute-optimized = {
@@ -353,6 +335,15 @@ resource "aws_secretsmanager_secret_version" "grafana" {
secret_string = random_password.grafana.result
}

#---------------------------------------------------------------
# IAM Policy for FluentBit Add-on
#---------------------------------------------------------------
resource "aws_iam_policy" "fluentbit" {
description = "IAM policy for FluentBit"
name = "${local.name}-fluentbit-additional"
policy = data.aws_iam_policy_document.fluent_bit.json
}

#---------------------------------------------------------------
# S3 log bucket for FluentBit
#---------------------------------------------------------------
@@ -374,3 +365,24 @@ module "fluentbit_s3_bucket" {

tags = local.tags
}

#---------------------------------------------------------------
# IAM policy for FluentBit
#---------------------------------------------------------------
data "aws_iam_policy_document" "fluent_bit" {
statement {
sid = ""
effect = "Allow"
resources = ["arn:${data.aws_partition.current.partition}:s3:::${module.fluentbit_s3_bucket.s3_bucket_id}/*"]

actions = [
"s3:ListBucket",
"s3:PutObject",
"s3:PutObjectAcl",
"s3:GetObject",
"s3:GetObjectAcl",
"s3:DeleteObject",
"s3:DeleteObjectVersion"
]
}
}
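
One detail in the `fluent_bit` policy document above may be worth a second look: `s3:ListBucket` is a bucket-level action, so granting it only on the object pattern (`.../*`) does not actually allow listing. Below is a minimal sketch of a variant that splits bucket-level and object-level permissions. It assumes the same `module.fluentbit_s3_bucket` used elsewhere in this file; the `fluent_bit_split` name and the `sid` values are hypothetical and not part of the commit.

```hcl
# Hypothetical variant of data.aws_iam_policy_document.fluent_bit (not part of this commit).
data "aws_iam_policy_document" "fluent_bit_split" {
  # Bucket-level action: the resource must be the bucket ARN itself.
  statement {
    sid       = "ListLogBucket"
    effect    = "Allow"
    resources = [module.fluentbit_s3_bucket.s3_bucket_arn]
    actions   = ["s3:ListBucket"]
  }

  # Object-level actions: the resource is the set of objects inside the bucket.
  statement {
    sid       = "ReadWriteLogObjects"
    effect    = "Allow"
    resources = ["${module.fluentbit_s3_bucket.s3_bucket_arn}/*"]

    actions = [
      "s3:PutObject",
      "s3:PutObjectAcl",
      "s3:GetObject",
      "s3:GetObjectAcl",
      "s3:DeleteObject",
      "s3:DeleteObjectVersion"
    ]
  }
}
```

If FluentBit only ever writes log objects, the list and delete permissions could probably be dropped altogether; that depends on how the add-on is configured and is not something this diff settles.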
16 changes: 8 additions & 8 deletions schedulers/terraform/self-managed-airflow/airflow-core.tf
@@ -8,15 +8,15 @@ module "db" {

identifier = local.airflow_name

# All available versions: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html#PostgreSQL.Concepts
engine = "postgres"
engine_version = "14.10"
family = "postgres14"
major_engine_version = "14"
instance_class = "db.m6i.xlarge"
engine_version = "14"
family = "postgres14" # DB parameter group
major_engine_version = "14" # DB option group
instance_class = "db.t4g.large"

storage_type = "io1"
allocated_storage = 100
iops = 3000
allocated_storage = 20
max_allocated_storage = 100

db_name = local.airflow_name
username = local.airflow_name
@@ -33,7 +33,7 @@ module "db" {
enabled_cloudwatch_logs_exports = ["postgresql", "upgrade"]
create_cloudwatch_log_group = true

backup_retention_period = 5
backup_retention_period = 1
skip_final_snapshot = true
deletion_protection = false
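
The storage lines in the `module "db"` hunk above contrast two ways of sizing the database volume: a fixed `io1` volume with provisioned IOPS, and a small initial allocation that is allowed to grow up to `max_allocated_storage`. Below is a minimal sketch of the second form, written against a plain `aws_db_instance` rather than the `terraform-aws-modules/rds/aws` module. The resource name `example`, the identifier, and `var.db_password` are illustrative assumptions, not taken from the commit.

```hcl
# Hypothetical standalone example; names and the password variable are assumptions.
variable "db_password" {
  type      = string
  sensitive = true
}

resource "aws_db_instance" "example" {
  identifier     = "airflow-example"
  engine         = "postgres"
  engine_version = "14" # major version only; the minor version is chosen by RDS
  instance_class = "db.t4g.large"

  # Start at 20 GiB; setting max_allocated_storage above allocated_storage
  # enables RDS storage autoscaling up to that ceiling.
  allocated_storage     = 20
  max_allocated_storage = 100

  db_name  = "airflow"
  username = "airflow"
  password = var.db_password

  # Settings suited to a disposable demo environment, as in the hunk above.
  backup_retention_period = 1
  skip_final_snapshot     = true
  deletion_protection     = false
}
```

With autoscaling, the instance starts small and only grows when free space runs low, which tends to suit example deployments better than pre-provisioning 100 GiB of `io1` storage.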
