Need to have a way to update certificate authority for RDS automatically #2044

Open
gecube opened this issue Mar 20, 2024 · 3 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. service/rds Indicates issues or PRs that are related to rds-controller.

gecube commented Mar 20, 2024

Good day!

My RDS instances are running on old CA bundles. I know this because the Amazon Console shows the following message:

image

As a DevOps engineer, I want a way to re-roll the CA bundles for my instances semi-automatically: for example, a separate field holding the current bundle version that I could change, or a well-documented process that DOES NOT involve manual actions in the Amazon Console.

gecube commented Mar 21, 2024

Some investigation.

First of all, if I omit the caCertificateIdentifier field from the spec of the DBInstance, I believe the operator populates it automatically.

The source manifest:

apiVersion: rds.services.k8s.aws/v1alpha1
kind: DBInstance
metadata:
  name: octagon
  namespace: infra-production
spec:
  copyTagsToSnapshot: true
  enableCloudwatchLogsExports:
    - audit
    - error
    - general
    - slowquery
  performanceInsightsEnabled: false
  deletionProtection: true
  enableIAMDatabaseAuthentication: true
  allocatedStorage: 20
  maxAllocatedStorage: 40
  dbInstanceClass: db.r5.large
  dbInstanceIdentifier: octagon
  engine: mysql
  engineVersion: "5.7"
  masterUsername: "root"
  masterUserPassword:
    namespace: infra-production
    name: "dragoncoin-password"
    key: password
  dbSubnetGroupRef:
    from:
      name: rds-subnet
  publiclyAccessible: false
  vpcSecurityGroupRefs:
    - from:
        name: limit-rds-to-subnet
  monitoringInterval: 5
  monitoringRoleARN: "arn:aws:iam::966321756598:role/rds-enhanced-monitoring-role"

The resulting object in the k8s API:

apiVersion: rds.services.k8s.aws/v1alpha1
kind: DBInstance
metadata:
  annotations:
    rds.services.k8s.aws/last-applied-secret-reference: infra-production/dragoncoin-password.password
  name: dbserver-8
  generation: 24
  namespace: infra-production
  finalizers:
    - finalizers.rds.services.k8s.aws/DBInstance
  labels:
    kustomize.toolkit.fluxcd.io/name: infra-management
    kustomize.toolkit.fluxcd.io/namespace: flux-system
spec:
  engine: mysql
  preferredMaintenanceWindow: 'sat:23:25-sat:23:55'
  caCertificateIdentifier: rds-ca-2019
  enableIAMDatabaseAuthentication: true
  dbInstanceClass: db.t4g.micro
  storageThroughput: 0
  deletionProtection: true
  masterUserPassword:
    key: password
    name: dragoncoin-password
    namespace: infra-production
  licenseModel: general-public-license
  storageEncrypted: false
  autoMinorVersionUpgrade: true
  publiclyAccessible: false
  monitoringInterval: 5
  copyTagsToSnapshot: true
  dbSubnetGroupRef:
    from:
      name: rds-subnet
  multiAZ: false
  enableCloudwatchLogsExports:
    - audit
    - error
    - general
    - slowquery
  preferredBackupWindow: '03:28-03:58'
  allocatedStorage: 20
  storageType: gp2
  vpcSecurityGroupRefs:
    - from:
        name: limit-rds-to-subnet
  engineVersion: '8.0'
  performanceInsightsEnabled: false
  maxAllocatedStorage: 40
  masterUsername: root
  dbInstanceIdentifier: dbserver-8
  backupRetentionPeriod: 1
  monitoringRoleARN: 'arn:aws:iam::966321756598:role/rds-enhanced-monitoring-role'

I omitted the irrelevant fields. We can see that caCertificateIdentifier: rds-ca-2019 was populated somehow.

Now I want to change the CA of the RDS instance. Let's try:

caCertificateIdentifier: rds-ca-2024

I get the following in the status:

    - lastTransitionTime: '2024-03-21T06:53:19Z'
      status: 'True'
      type: ACK.ReferencesResolved
    - message: "CertificateNotFound: Certificate not found: rds-ca-2024\n\tstatus code: 404, request id: 969eb888-5eef-4f98-84c6-caca7812263a"
      status: 'True'
      type: ACK.Recoverable
    - lastTransitionTime: '2024-03-21T06:53:19Z'
      message: Unable to determine if desired resource state matches latest observed state
      reason: "CertificateNotFound: Certificate not found: rds-ca-2024\n\tstatus code: 404, request id: 969eb888-5eef-4f98-84c6-caca7812263a"
      status: Unknown
      type: ACK.ResourceSynced
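
For anyone scripting around this, here is a minimal sketch of how the conditions above could be scanned for problems. It assumes the status conditions have already been loaded into a list of dicts (for example from the object's JSON); the helper name `blocking_conditions` is hypothetical, and the reading of the condition types (ACK.Recoverable being 'True' means an error, ACK.ResourceSynced not being 'True' means the resource is out of sync) is inferred only from the output shown above.

```python
def blocking_conditions(conditions):
    """Return (type, text) pairs for conditions that signal a problem.

    Assumption: 'ACK.Recoverable' with status 'True' carries an error, and
    'ACK.ResourceSynced' with any status other than 'True' means the
    resource is not (known to be) in sync.
    """
    problems = []
    for cond in conditions:
        ctype = cond.get("type", "")
        status = cond.get("status", "")
        # Prefer the human-readable message, fall back to the reason.
        text = cond.get("message") or cond.get("reason") or ""
        if ctype == "ACK.Recoverable" and status == "True":
            problems.append((ctype, text))
        elif ctype == "ACK.ResourceSynced" and status != "True":
            problems.append((ctype, text))
    return problems


# The three conditions from the status above, abbreviated.
conditions = [
    {"type": "ACK.ReferencesResolved", "status": "True"},
    {"type": "ACK.Recoverable", "status": "True",
     "message": "CertificateNotFound: Certificate not found: rds-ca-2024"},
    {"type": "ACK.ResourceSynced", "status": "Unknown",
     "message": "Unable to determine if desired resource state matches latest observed state"},
]

for ctype, text in blocking_conditions(conditions):
    print(f"{ctype}: {text}")
```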

OK, my mistake: I forgot that the CA name has to come from a specific list. I found it in the Amazon docs here:
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.SSL.html

So we have the following options:

  • rds-ca-2019
  • rds-ca-rsa2048-g1
  • rds-ca-rsa4096-g1
  • rds-ca-ecc384-g1
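
The list above could be turned into a simple pre-apply check, for example in CI before the manifest reaches the cluster. This is only a sketch: the function name `check_ca_identifier` is hypothetical, the manifest is assumed to be already parsed into a dict, and the set of identifiers is just the one quoted in this comment, so it may be out of date for your region or by the time you read this.

```python
from typing import List

# Identifiers quoted in this comment; AWS may add or retire CAs over time.
KNOWN_CA_IDENTIFIERS = frozenset({
    "rds-ca-2019",
    "rds-ca-rsa2048-g1",
    "rds-ca-rsa4096-g1",
    "rds-ca-ecc384-g1",
})


def check_ca_identifier(manifest: dict) -> List[str]:
    """Return a list of problems found in a DBInstance manifest (empty if none)."""
    ca = manifest.get("spec", {}).get("caCertificateIdentifier")
    if ca is None:
        # Per the observation above, the field then gets populated for you.
        return ["spec.caCertificateIdentifier is not set explicitly"]
    if ca not in KNOWN_CA_IDENTIFIERS:
        expected = ", ".join(sorted(KNOWN_CA_IDENTIFIERS))
        return [f"unknown CA identifier {ca!r}; expected one of: {expected}"]
    return []


print(check_ca_identifier({"spec": {"caCertificateIdentifier": "rds-ca-2024"}}))
```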

I changed the value to a valid one (rds-ca-rsa4096-g1) and observed that as soon as the controller picked up the change, the database was re-rolled, probably with downtime.

So the conclusions:

  1. (probably) We should make the caCertificateIdentifier field mandatory when creating a DBInstance.
  2. (definitely) We need a better default for caCertificateIdentifier: not rds-ca-2019 (because it does not have automatic certificate rotation), but something like rds-ca-rsa4096-g1.
  3. (good to have) A validation webhook that rejects incorrect values in this field (like rds-ca-2024).
  4. (good to have) Some protection against an immediate change. Use case: I want to re-roll the CA during the next maintenance window, but the controller changes the CA as soon as the caCertificateIdentifier field is changed in the manifests.
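
To illustrate point 4, here is a rough sketch of the check such a guard would need: is the current time inside the instance's preferredMaintenanceWindow? It assumes the window string has the form shown in the manifest above ('sat:23:25-sat:23:55') and is expressed in UTC. The function names are hypothetical, and this is not how the controller works today; it only shows the decision a pipeline or a future controller option could make before applying a CA change.

```python
from datetime import datetime, timezone

# Index matches datetime.weekday(): Monday is 0, Sunday is 6.
_DAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def _minute_of_week(day: str, hhmm: str) -> int:
    """Convert e.g. ('sat', '23:25') into minutes since Monday 00:00."""
    hours, minutes = hhmm.split(":")
    return _DAYS.index(day.lower()) * 24 * 60 + int(hours) * 60 + int(minutes)


def in_maintenance_window(window: str, now: datetime) -> bool:
    """Return True if timezone-aware `now` falls inside `window` (UTC assumed)."""
    start_text, end_text = window.split("-")
    start = _minute_of_week(*start_text.split(":", 1))
    end = _minute_of_week(*end_text.split(":", 1))
    now_utc = now.astimezone(timezone.utc)
    current = now_utc.weekday() * 24 * 60 + now_utc.hour * 60 + now_utc.minute
    if start <= end:
        return start <= current < end
    # The window wraps past Sunday midnight, e.g. 'sun:23:30-mon:00:30'.
    return current >= start or current < end


window = "sat:23:25-sat:23:55"
if in_maintenance_window(window, datetime.now(timezone.utc)):
    print("inside the maintenance window: safe to apply the CA change")
else:
    print("outside the maintenance window: hold the CA change back")
```

A GitOps pipeline could use a check like this to delay merging the manifest change until the window opens, which avoids the immediate re-roll described above.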

@a-hilaly a-hilaly added service/rds Indicates issues or PRs that are related to rds-controller. kind/feature Categorizes issue or PR as related to a new feature. labels Mar 22, 2024

ack-bot commented Sep 18, 2024

Issues go stale after 180d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 60d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/aws-controllers-k8s/community.
/lifecycle stale

@ack-prow ack-prow bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 18, 2024

gecube commented Sep 19, 2024

/remove-lifecycle stale

@ack-prow ack-prow bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 19, 2024