Commit 96cd130

Modified pre-commit needs
1 parent 73c8451 commit 96cd130

11 files changed (+28, -29 lines)

streaming/spark-streaming/examples/consumer/app.py

Lines changed: 2 additions & 2 deletions
@@ -43,7 +43,7 @@ def consume_and_write():
 'write.format.default'='parquet' -- Explicitly specifying Parquet format
 )
 """)
-
+
 # Read from Kafka
 df = spark.readStream \
 .format("kafka") \
@@ -78,4 +78,4 @@ def consume_and_write():
 query.awaitTermination() # Wait for the stream to finish
 
 if __name__ == "__main__":
-consume_and_write()
+consume_and_write()
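
The removed and added lines in this file carry identical text, which suggests the change is whitespace-only (trailing spaces stripped, a final newline added). That is the kind of cleanup typically applied by whitespace hooks run through pre-commit; which hooks this repository actually configures is not shown in the diff. A minimal Python sketch of such a normalization, where `normalize_whitespace` is a hypothetical helper name and not part of the commit:

```python
def normalize_whitespace(text: str) -> str:
    """Strip trailing whitespace per line and end the text with one newline."""
    # rstrip() with no arguments removes trailing spaces, tabs and carriage returns.
    lines = [line.rstrip() for line in text.splitlines()]
    # Drop blank lines left at the very end of the file.
    while lines and lines[-1] == "":
        lines.pop()
    if not lines:
        return ""
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A last line with trailing spaces and no final newline.
    print(repr(normalize_whitespace("if True:\n    pass   ")))
```

Running a function like this over every changed file would produce hunks of the same shape as the ones in this commit: lines that look unchanged when rendered but differ in invisible characters.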

streaming/spark-streaming/examples/consumer/manifests/00_rbac_permissions.yaml

Lines changed: 1 addition & 1 deletion
@@ -39,4 +39,4 @@ metadata:
 name: consumer-sa
 namespace: spark-operator
 annotations:
-eks.amazonaws.com/role-arn: "__MY_CONSUMER_ROLE_ARN__" # replace with your consumer role ARN: consumer_iam_role_arn
+eks.amazonaws.com/role-arn: "__MY_CONSUMER_ROLE_ARN__" # replace with your consumer role ARN: consumer_iam_role_arn
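
The comment on this line asks the reader to replace `__MY_CONSUMER_ROLE_ARN__` with the value of the `consumer_iam_role_arn` output. One way to do that without hand-editing is a small text substitution. The sketch below is only an illustration: `fill_placeholders` is a hypothetical helper, and the ARN is a made-up example value, not one taken from this repository.

```python
from typing import Mapping


def fill_placeholders(manifest: str, values: Mapping[str, str]) -> str:
    """Return the manifest text with each placeholder token replaced by its value."""
    for placeholder, value in values.items():
        manifest = manifest.replace(placeholder, value)
    return manifest


# Hypothetical value; in practice it would be read from the Terraform
# output named consumer_iam_role_arn.
EXAMPLE_VALUES = {
    "__MY_CONSUMER_ROLE_ARN__": "arn:aws:iam::111122223333:role/example-consumer-role",
}

if __name__ == "__main__":
    line = 'eks.amazonaws.com/role-arn: "__MY_CONSUMER_ROLE_ARN__"'
    print(fill_placeholders(line, EXAMPLE_VALUES))
```

Plain string replacement is used here instead of a YAML parser so that comments and formatting in the manifest are left untouched.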

streaming/spark-streaming/examples/consumer/manifests/01_spark_application.yaml

Lines changed: 3 additions & 4 deletions
@@ -13,17 +13,17 @@ spec:
 sparkVersion: "3.3.2"
 deps:
 jars:
-- "local:///app/jars/commons-logging-1.1.3.jar"
+- "local:///app/jars/commons-logging-1.1.3.jar"
 - "local:///app/jars/commons-pool2-2.11.1.jar"
 - "local:///app/jars/hadoop-client-api-3.3.2.jar"
-- "local:///app/jars/hadoop-client-runtime-3.3.2.jar"
+- "local:///app/jars/hadoop-client-runtime-3.3.2.jar"
 - "local:///app/jars/jsr305-3.0.0.jar"
 - "local:///app/jars/kafka-clients-2.8.1.jar"
 - "local:///app/jars/lz4-java-1.7.1.jar"
 - "local:///app/jars/scala-library-2.12.15.jar"
 - "local:///app/jars/slf4j-api-1.7.30.jar"
 - "local:///app/jars/snappy-java-1.1.8.1.jar"
-- "local:///app/jars/spark-sql-kafka-0-10_2.12-3.3.2.jar"
+- "local:///app/jars/spark-sql-kafka-0-10_2.12-3.3.2.jar"
 - "local:///app/jars/spark-tags_2.12-3.3.2.jar"
 - "local:///app/jars/spark-token-provider-kafka-0-10_2.12-3.3.2.jar"
 - "local:///app/jars/iceberg-spark-runtime-3.3_2.12-1.0.0.jar"
@@ -85,4 +85,3 @@ spec:
 - name: KAFKA_ADDRESS
 value: "__MY_KAFKA_BROKERS_ADRESS__" # Replace with your Kafka brokers address: bootstrap_brokers
 # value: "b-1.kafkademospark.mkjcj4.c12.kafka.us-west-2.amazonaws.com:9092,b-2.kafkademospark.mkjcj4.c12.kafka.us-west-2.amazonaws.com:9092"
-

streaming/spark-streaming/examples/producer/00_deployment.yaml

Lines changed: 3 additions & 3 deletions
@@ -23,11 +23,11 @@ spec:
 containers:
 - name: producer
 image: public.ecr.aws/data-on-eks/producer-kafka:1
-command: ["python", "app.py"]
+command: ["python", "app.py"]
 env:
 - name: AWS_REGION
-value: "__MY_AWS_REGION__" # Replace with your AWS region
-- name: BOOTSTRAP_BROKERS
+value: "__MY_AWS_REGION__" # Replace with your AWS region
+- name: BOOTSTRAP_BROKERS
 value: "__MY_KAFKA_BROKERS__" # Replace with your bootstrap brokers: bootstrap_brokers
 # value: "b-1.kafkademospark.mkjcj4.c12.kafka.us-west-2.amazonaws.com:9092,b-2.kafkademospark.mkjcj4.c12.kafka.us-west-2.amazonaws.com:9092"
 resources:
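
The manifests in this commit rely on `__MY_..._` style tokens that must be replaced before deployment. A quick check for tokens that were missed can catch a half-edited manifest. The following Python sketch assumes every placeholder follows the `__MY_<NAME>__` shape seen in the lines above; `find_unreplaced` and the regular expression are illustrative, not part of the repository.

```python
import re
from typing import List, Tuple

# Assumed placeholder shape: "__MY_" + upper-case name + "__".
PLACEHOLDER = re.compile(r"__MY_[A-Z0-9_]+?__")


def find_unreplaced(text: str) -> List[Tuple[int, str]]:
    """Return (line number, token) pairs for every placeholder still present."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            hits.append((number, match.group(0)))
    return hits


if __name__ == "__main__":
    sample = (
        'value: "__MY_AWS_REGION__"\n'
        "- name: BOOTSTRAP_BROKERS\n"
        'value: "__MY_KAFKA_BROKERS__"\n'
    )
    for number, token in find_unreplaced(sample):
        print(f"line {number}: {token} has not been replaced")
```

An empty result means no token of the assumed shape remains; it does not validate that the substituted values are correct.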

streaming/spark-streaming/examples/producer/app.py

Lines changed: 1 addition & 1 deletion
@@ -90,4 +90,4 @@ def produce_data(producer, topic_name):
 produce_data(producer, topic_name)
 finally:
 producer.flush()
-producer.close()
+producer.close()
Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
 boto3
-kafka-python
+kafka-python

streaming/spark-streaming/terraform/README.md

Lines changed: 13 additions & 11 deletions
@@ -1,5 +1,5 @@
 # Spark on K8s Operator with EKS
-Checkout the [documentation website](https://awslabs.github.io/data-on-eks/docs/blueprints/data-analytics/spark-operator-yunikorn) to deploy this pattern and run sample tests.
+Checkout the [documentation website](https://awslabs.github.io/data-on-eks/docs/blueprints/streaming-platforms/spark-streaming) to deploy this pattern and run sample tests.
 
 <!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
 ## Requirements
@@ -19,20 +19,20 @@ Checkout the [documentation website](https://awslabs.github.io/data-on-eks/docs/
 |------|---------|
 | <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.72 |
 | <a name="provider_aws.ecr"></a> [aws.ecr](#provider\_aws.ecr) | >= 3.72 |
-| <a name="provider_kubernetes"></a> [kubernetes](#provider\_kubernetes) | >= 2.10 |
 | <a name="provider_random"></a> [random](#provider\_random) | 3.3.2 |
 
 ## Modules
 
 | Name | Source | Version |
 |------|--------|---------|
 | <a name="module_amp_ingest_irsa"></a> [amp\_ingest\_irsa](#module\_amp\_ingest\_irsa) | aws-ia/eks-blueprints-addon/aws | ~> 1.0 |
+| <a name="module_consumer_iam_role"></a> [consumer\_iam\_role](#module\_consumer\_iam\_role) | terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks | n/a |
 | <a name="module_ebs_csi_driver_irsa"></a> [ebs\_csi\_driver\_irsa](#module\_ebs\_csi\_driver\_irsa) | terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks | ~> 5.34 |
 | <a name="module_eks"></a> [eks](#module\_eks) | terraform-aws-modules/eks/aws | ~> 19.15 |
 | <a name="module_eks_blueprints_addons"></a> [eks\_blueprints\_addons](#module\_eks\_blueprints\_addons) | aws-ia/eks-blueprints-addons/aws | ~> 1.2 |
 | <a name="module_eks_data_addons"></a> [eks\_data\_addons](#module\_eks\_data\_addons) | aws-ia/eks-data-addons/aws | ~> 1.30 |
+| <a name="module_producer_iam_role"></a> [producer\_iam\_role](#module\_producer\_iam\_role) | terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks | n/a |
 | <a name="module_s3_bucket"></a> [s3\_bucket](#module\_s3\_bucket) | terraform-aws-modules/s3-bucket/aws | ~> 3.0 |
-| <a name="module_spark_team_a_irsa"></a> [spark\_team\_a\_irsa](#module\_spark\_team\_a\_irsa) | aws-ia/eks-blueprints-addon/aws | ~> 1.0 |
 | <a name="module_vpc"></a> [vpc](#module\_vpc) | terraform-aws-modules/vpc/aws | ~> 5.0 |
 | <a name="module_vpc_endpoints"></a> [vpc\_endpoints](#module\_vpc\_endpoints) | terraform-aws-modules/vpc/aws//modules/vpc-endpoints | ~> 5.0 |
 | <a name="module_vpc_endpoints_sg"></a> [vpc\_endpoints\_sg](#module\_vpc\_endpoints\_sg) | terraform-aws-modules/security-group/aws | ~> 5.0 |
@@ -41,23 +41,21 @@ Checkout the [documentation website](https://awslabs.github.io/data-on-eks/docs/
 
 | Name | Type |
 |------|------|
+| [aws_iam_policy.consumer_s3_kafka](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
 | [aws_iam_policy.grafana](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
-| [aws_iam_policy.spark](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
+| [aws_iam_policy.producer_s3_kafka](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
+| [aws_msk_cluster.kafka_test_demo](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/msk_cluster) | resource |
 | [aws_prometheus_workspace.amp](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/prometheus_workspace) | resource |
+| [aws_s3_bucket.iceberg_data](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket) | resource |
 | [aws_s3_object.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_object) | resource |
 | [aws_secretsmanager_secret.grafana](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/secretsmanager_secret) | resource |
 | [aws_secretsmanager_secret_version.grafana](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/secretsmanager_secret_version) | resource |
-| [kubernetes_cluster_role.spark_role](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/cluster_role) | resource |
-| [kubernetes_cluster_role_binding.spark_role_binding](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/cluster_role_binding) | resource |
-| [kubernetes_namespace_v1.spark_team_a](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/namespace_v1) | resource |
-| [kubernetes_secret_v1.spark_team_a](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/secret_v1) | resource |
-| [kubernetes_service_account_v1.spark_team_a](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/service_account_v1) | resource |
+| [aws_security_group.msk_security_group](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/security_group) | resource |
 | [random_password.grafana](https://registry.terraform.io/providers/hashicorp/random/3.3.2/docs/resources/password) | resource |
 | [aws_availability_zones.available](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/availability_zones) | data source |
 | [aws_caller_identity.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/caller_identity) | data source |
 | [aws_ecrpublic_authorization_token.token](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/ecrpublic_authorization_token) | data source |
 | [aws_iam_policy_document.grafana](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
-| [aws_iam_policy_document.spark_operator](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
 | [aws_partition.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/partition) | data source |
 | [aws_region.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/region) | data source |
 | [aws_secretsmanager_secret_version.admin_password_version](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/secretsmanager_secret_version) | data source |
@@ -70,7 +68,7 @@ Checkout the [documentation website](https://awslabs.github.io/data-on-eks/docs/
 | <a name="input_eks_data_plane_subnet_secondary_cidr"></a> [eks\_data\_plane\_subnet\_secondary\_cidr](#input\_eks\_data\_plane\_subnet\_secondary\_cidr) | Secondary CIDR blocks. 32766 IPs per Subnet per Subnet/AZ for EKS Node and Pods | `list(string)` | <pre>[<br> "100.64.0.0/17",<br> "100.64.128.0/17"<br>]</pre> | no |
 | <a name="input_enable_amazon_prometheus"></a> [enable\_amazon\_prometheus](#input\_enable\_amazon\_prometheus) | Enable AWS Managed Prometheus service | `bool` | `true` | no |
 | <a name="input_enable_vpc_endpoints"></a> [enable\_vpc\_endpoints](#input\_enable\_vpc\_endpoints) | Enable VPC Endpoints | `bool` | `false` | no |
-| <a name="input_enable_yunikorn"></a> [enable\_yunikorn](#input\_enable\_yunikorn) | Enable Apache YuniKorn Scheduler | `bool` | `true` | no |
+| <a name="input_enable_yunikorn"></a> [enable\_yunikorn](#input\_enable\_yunikorn) | Enable Apache YuniKorn Scheduler | `bool` | `false` | no |
 | <a name="input_name"></a> [name](#input\_name) | Name of the VPC and EKS Cluster | `string` | `"spark-operator-doeks"` | no |
 | <a name="input_private_subnets"></a> [private\_subnets](#input\_private\_subnets) | Private Subnets CIDRs. 254 IPs per Subnet/AZ for Private NAT + NLB + Airflow + EC2 Jumphost etc. | `list(string)` | <pre>[<br> "10.1.1.0/24",<br> "10.1.2.0/24"<br>]</pre> | no |
 | <a name="input_public_subnets"></a> [public\_subnets](#input\_public\_subnets) | Public Subnets CIDRs. 62 IPs per Subnet/AZ | `list(string)` | <pre>[<br> "10.1.0.0/26",<br> "10.1.0.64/26"<br>]</pre> | no |
@@ -82,10 +80,14 @@ Checkout the [documentation website](https://awslabs.github.io/data-on-eks/docs/
 
 | Name | Description |
 |------|-------------|
+| <a name="output_bootstrap_brokers"></a> [bootstrap\_brokers](#output\_bootstrap\_brokers) | Bootstrap brokers for the MSK cluster |
 | <a name="output_cluster_arn"></a> [cluster\_arn](#output\_cluster\_arn) | The Amazon Resource Name (ARN) of the cluster |
 | <a name="output_cluster_name"></a> [cluster\_name](#output\_cluster\_name) | The Amazon Resource Name (ARN) of the cluster |
 | <a name="output_configure_kubectl"></a> [configure\_kubectl](#output\_configure\_kubectl) | Configure kubectl: make sure you're logged in with the correct AWS profile and run the following command to update your kubeconfig |
+| <a name="output_consumer_iam_role_arn"></a> [consumer\_iam\_role\_arn](#output\_consumer\_iam\_role\_arn) | IAM role ARN for the consumer |
 | <a name="output_grafana_secret_name"></a> [grafana\_secret\_name](#output\_grafana\_secret\_name) | Grafana password secret name |
+| <a name="output_producer_iam_role_arn"></a> [producer\_iam\_role\_arn](#output\_producer\_iam\_role\_arn) | IAM role ARN for the producer |
+| <a name="output_s3_bucket_id_iceberg_bucket"></a> [s3\_bucket\_id\_iceberg\_bucket](#output\_s3\_bucket\_id\_iceberg\_bucket) | Spark History server logs S3 bucket ID |
 | <a name="output_s3_bucket_id_spark_history_server"></a> [s3\_bucket\_id\_spark\_history\_server](#output\_s3\_bucket\_id\_spark\_history\_server) | Spark History server logs S3 bucket ID |
 | <a name="output_s3_bucket_region_spark_history_server"></a> [s3\_bucket\_region\_spark\_history\_server](#output\_s3\_bucket\_region\_spark\_history\_server) | Spark History server logs S3 bucket ID |
 | <a name="output_subnet_ids_starting_with_100"></a> [subnet\_ids\_starting\_with\_100](#output\_subnet\_ids\_starting\_with\_100) | Secondary CIDR Private Subnet IDs for EKS Data Plane |

streaming/spark-streaming/terraform/apps.tf

Lines changed: 1 addition & 1 deletion
@@ -88,4 +88,4 @@ module "consumer_iam_role" {
 namespace_service_accounts = ["spark-operator:consumer-sa"]
 }
 }
-}
+}

streaming/spark-streaming/terraform/msk.tf

Lines changed: 1 addition & 3 deletions
@@ -76,12 +76,10 @@ resource "aws_msk_cluster" "kafka_test_demo" {
 
 unauthenticated = true
 }
-#Lyfecycle to ignore
+#Lyfecycle to ignore
 lifecycle {
 ignore_changes = [
 client_authentication[0].tls
 ]
 }
 }
-
-

streaming/spark-streaming/terraform/outputs.tf

Lines changed: 1 addition & 1 deletion
@@ -67,4 +67,4 @@ output "producer_iam_role_arn" {
 output "consumer_iam_role_arn" {
 description = "IAM role ARN for the consumer"
 value = module.consumer_iam_role.iam_role_arn
-}
+}
