cloud-platform-terraform-s3-bucket

This Terraform module will create an Amazon S3 bucket for use on the Cloud Platform.

Usage

module "s3" {
  source = "github.com/ministryofjustice/cloud-platform-terraform-s3-bucket?ref=version" # use the latest release

  # S3 configuration
  versioning = true

  # Tags
  business_unit          = var.business_unit
  application            = var.application
  is_production          = var.is_production
  team_name              = var.team_name
  namespace              = var.namespace
  environment_name       = var.environment
  infrastructure_support = var.infrastructure_support
}
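
If your application needs the bucket details, a common pattern is to write the module outputs into a Kubernetes secret in your namespace so pods can read them. A minimal sketch (the secret name is illustrative):

resource "kubernetes_secret" "s3" {
  metadata {
    name      = "s3-bucket-output" # illustrative name
    namespace = var.namespace
  }
  data = {
    bucket_arn  = module.s3.bucket_arn
    bucket_name = module.s3.bucket_name
  }
}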

Migrate from existing buckets

You can use a combination of the Cloud Platform IRSA module and Service pod module to access your source bucket using the AWS CLI.

IRSA and Service Pod example configuration

In the cloud-platform-environments repository, within the namespace that contains your destination S3 bucket configuration, add the following Terraform, substituting values as necessary:

module "cross_irsa" {
  source                 = "github.com/ministryofjustice/cloud-platform-terraform-irsa?ref=[latest-release-here]"
  business_unit          = var.business_unit
  application            = var.application
  eks_cluster_name       = var.eks_cluster_name
  namespace              = var.namespace
  service_account_name   = "${var.namespace}-cross-service"
  is_production          = var.is_production
  team_name              = var.team_name
  environment_name       = var.environment
  infrastructure_support = var.infrastructure_support
  role_policy_arns       = { s3 = aws_iam_policy.s3_migrate_policy.arn }
}

data "aws_iam_policy_document" "s3_migrate_policy" {
  # List & location for the source & destination S3 buckets.
  statement {
    actions = [ 
      "s3:ListBucket",
      "s3:GetBucketLocation"
    ]
    resources = [ 
      module.s3_bucket.bucket_arn,
      "arn:aws:s3:::[source-bucket-name]"
    ]
  }
  # Permissions on source S3 bucket contents. 
  statement {
    actions = [ 
      "s3:GetObject",
      "s3:GetObjectVersion",
      "s3:GetObjectTagging"
    ]
    resources = [ "arn:aws:s3:::[source-bucket-name]/*" ]   # take note of trailing /* here
  }
  # Permissions on destination S3 bucket contents. 
  statement {
    actions = [
      "s3:PutObject",
      "s3:PutObjectTagging",
      "s3:GetObject",
      "s3:DeleteObject"
    ]
    resources = [ "${module.s3_bucket.bucket_arn}/*" ]
  }
}

resource "aws_iam_policy" "s3_migrate_policy" {
  name   = "s3_migrate_policy"
  policy = data.aws_iam_policy_document.s3_migrate_policy.json

  tags = {
    business-unit          = var.business_unit
    application            = var.application
    is-production          = var.is_production
    environment-name       = var.environment
    owner                  = var.team_name
    infrastructure-support = var.infrastructure_support
  }
}

# Store the IRSA role ARN in a Kubernetes secret so it can be retrieved and referenced in the source bucket policy
resource "kubernetes_secret" "cross_irsa" {
  metadata {
    name      = "cross-irsa-output"
    namespace = var.namespace
  }
  data = {
    role           = module.cross_irsa.role_name
    rolearn        = module.cross_irsa.role_arn
    serviceaccount = module.cross_irsa.service_account.name
  }
}

# set up the service pod
module "cross_service_pod" {
  source = "github.com/ministryofjustice/cloud-platform-terraform-service-pod?ref=[latest-release-here]"
  namespace            = var.namespace
  service_account_name = module.cross_irsa.service_account.name
}

Source bucket policy

The source bucket must explicitly permit your IRSA role to read from it.

First, retrieve the IRSA role ARN using the cloud-platform CLI and jq:

cloud-platform decode-secret -s cross-irsa-output | jq -r '.data.rolearn'

You should see output similar to the following:

arn:aws:iam::754256621582:role/cloud-platform-irsa-randomstring1234

Example policy for the source bucket (using the ARN retrieved above):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowSourceBucketAccess",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetObject"
            ],
            "Principal": {
                "AWS": "arn:aws:iam::754256621582:role/cloud-platform-irsa-randomstring1234"
            },
            "Resource": [
                "arn:aws:s3:::source-bucket",
                "arn:aws:s3:::source-bucket/*"
            ]
        }
    ]
}

Note that the bucket is listed twice; this is deliberate, not a typo. The first ARN grants access to the bucket itself (for s3:ListBucket), and the second to the objects within it (for s3:GetObject).

Synchronization

Once configured, you can exec into your service pod and run the following commands. The sync will add new objects, update existing ones, and delete objects in the destination that are not present in the source.
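
If you don't know your service pod's name, you can list the pods in your namespace first (the namespace placeholder and grep pattern below are illustrative):

kubectl get pods -n <your-namespace> | grep service-pod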

kubectl exec --stdin --tty cloud-platform-7e1f25a0c851c02c-service-pod-abc123 -- /bin/sh

aws s3 sync --delete \
  s3://source_bucket_name \
  s3://destination_bucket_name \
  --source-region source_region \
  --region destination_region
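
If you want to preview the changes first, the AWS CLI supports a --dryrun flag, which prints the operations that would be performed without making them:

aws s3 sync --delete --dryrun \
  s3://source_bucket_name \
  s3://destination_bucket_name \
  --source-region source_region \
  --region destination_region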

Decompressing Files Stored in S3

If you have files stored in S3 that are compressed (e.g. .zip, .gz, .bz2, .7z), you don't need to download and re-upload them in order to decompress them: you can decompress them on the Cloud Platform Kubernetes cluster with a Job.

The following example is a Job connected to a 50Gi persistent volume (so that temporary storage does not fill up a cluster node), using bunzip2 to decompress a .bz2 file and re-upload the result to S3.

To adapt this for your own needs, substitute the namespace, the IRSA service account, the bucket/filename, and the compression tool; you should then be able to decompress a file of any size without downloading it to your machine.

---
apiVersion: batch/v1
kind: Job
metadata:
  name: s3-decompression
  namespace: default
spec:
  backoffLimit: 0
  template:
    spec:
      serviceAccountName: irsa-service-account-name 
      restartPolicy: Never
      containers:
        - name: tools
          image: ministryofjustice/cloud-platform-tools:2.9.0
          command:
            - /bin/bash
            - -c
            - |
              cd /unpack
              aws s3 cp s3://${S3_BUCKET}/<filename>.bz2 - \
                | bunzip2 \
                | aws s3 cp - s3://${S3_BUCKET}/<filename>
          env:
            - name: S3_BUCKET
              value: <s3-bucket-name>
          resources: {}
          volumeMounts:
            - name: unpack
              mountPath: "/unpack"
          securityContext:
            runAsNonRoot: true
            runAsUser: 1000
            runAsGroup: 1000
      volumes:
        - name: unpack
          persistentVolumeClaim:
            claimName: unpack-small

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: unpack-small
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: "gp2-expand"
  resources:
    requests:
      storage: 50Gi
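
Once you have substituted your values, apply the manifests and follow the Job's output. A minimal sketch, assuming you saved both manifests as decompress.yaml:

kubectl apply -f decompress.yaml -n <your-namespace>
kubectl logs -f job/s3-decompression -n <your-namespace>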

For further guidance on using IRSA, for example accessing AWS buckets in different accounts, see the following links:

Use IAM Roles for service accounts to access resources in a different AWS account

Accessing AWS APIs and resources from your namespace

[Cloud Platform service pod for AWS CLI access](https://user-guide.cloud-platform.service.justice.gov.uk/documentation/other-topics/cloud-platform-service-pod.html)

See the examples/ folder for more information.

Requirements

| Name | Version |
|------|---------|
| terraform | >= 1.2.5 |
| aws | >= 4.0.0 |
| random | >= 3.0.0 |

Providers

| Name | Version |
|------|---------|
| aws | >= 4.0.0 |
| random | >= 3.0.0 |

Modules

No modules.

Resources

| Name | Type |
|------|------|
| aws_iam_policy.irsa | resource |
| aws_s3_bucket.bucket | resource |
| aws_s3_bucket_public_access_block.block_public_access | resource |
| random_id.id | resource |
| aws_iam_policy_document.irsa | data source |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| acl | The bucket ACL to set | string | "private" | no |
| application | Application name | string | n/a | yes |
| bucket_name | Set the name of the S3 bucket. If left blank, a name will be automatically generated (recommended) | string | "" | no |
| bucket_policy | The S3 bucket policy to set. If empty, no policy will be set | string | "" | no |
| business_unit | Area of the MOJ responsible for the service | string | n/a | yes |
| cors_rule | CORS rule | any | [] | no |
| enable_allow_block_pub_access | Whether to block public access to the bucket | bool | true | no |
| environment_name | Environment name | string | n/a | yes |
| infrastructure_support | The team responsible for managing the infrastructure. Should be of the form <team-name> (<team-email>) | string | n/a | yes |
| is_production | Whether this is used for production or not | string | n/a | yes |
| lifecycle_rule | Lifecycle rule | any | [] | no |
| log_path | Set the path of the logs | string | "" | no |
| log_target_bucket | Set the target bucket for logs | string | "" | no |
| logging_enabled | Set the logging for the bucket | bool | false | no |
| namespace | Namespace name | string | n/a | yes |
| team_name | Team name | string | n/a | yes |
| versioning | Enable object versioning for the bucket | bool | false | no |

Outputs

| Name | Description |
|------|-------------|
| bucket_arn | S3 bucket ARN |
| bucket_domain_name | Regional bucket domain name |
| bucket_name | S3 bucket name |
| irsa_policy_arn | IAM policy ARN for access to the S3 bucket |
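
The irsa_policy_arn output is intended to be attached to an IRSA role so that your pods can access the bucket. A minimal sketch using the Cloud Platform IRSA module, assuming the module "s3" from the Usage section above (the service account name is illustrative):

module "irsa" {
  source = "github.com/ministryofjustice/cloud-platform-terraform-irsa?ref=version" # use the latest release

  eks_cluster_name     = var.eks_cluster_name
  namespace            = var.namespace
  service_account_name = "${var.namespace}-s3" # illustrative name
  role_policy_arns     = { s3 = module.s3.irsa_policy_arn }

  # Tags
  business_unit          = var.business_unit
  application            = var.application
  is_production          = var.is_production
  team_name              = var.team_name
  environment_name       = var.environment
  infrastructure_support = var.infrastructure_support
}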

Tags

Some of the inputs for this module are tags. All infrastructure resources must be tagged to meet the MOJ Technical Guidance on Documenting owners of infrastructure.

You should use your namespace variables to populate these. See the Usage section for more information.

Reading Material