Vector Does Not Properly Reload Updated AWS Credentials #18591

@hillmandj

Description

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

We have a process in which the AWS secrets (files) are updated by a sidecar. The Vector documentation states:

If your AWS credentials expire, Vector will automatically search for up-to-date credentials in the places (and order) described above.

We have tried various configuration changes, such as setting auth.credentials_file to point to the file path that gets updated, as indicated here. Regardless, our aws_s3 sink still returned errors saying that the token had expired.

As a separate experiment, we went into a pod whose credential files had recently been refreshed and manually sent a HUP signal to see whether Vector would pick up the new credentials. It did not. The only way we could get Vector to pick them up was to send a TERM signal. For now, a separate container/process sends a TERM signal to Vector whenever it detects a change to the credentials file, which seems wrong.
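For readers who want to picture the workaround described above, here is a minimal sketch of a watcher that polls the credentials file and signals a process when the content changes. This is not our actual sidecar code: the function names, the polling interval, and the use of a content hash (rather than mtime or inotify) are all assumptions made for illustration.

```python
import hashlib
import os
import signal
import time
from pathlib import Path
from typing import Optional


def file_digest(path: Path) -> Optional[str]:
    """Return a SHA-256 hex digest of the file, or None if it cannot be read."""
    try:
        return hashlib.sha256(path.read_bytes()).hexdigest()
    except OSError:
        return None


def has_changed(previous: Optional[str], current: Optional[str]) -> bool:
    """A change counts only when the file is readable and its digest differs."""
    return current is not None and current != previous


def watch_and_signal(path: Path, pid: int, interval_secs: float = 5.0,
                     sig: signal.Signals = signal.SIGTERM) -> None:
    """Poll `path` forever and send `sig` to `pid` whenever its content changes.

    `pid` is assumed to be the Vector process; how it is discovered
    (shared process namespace, pid file, ...) is left out of this sketch.
    """
    last = file_digest(path)
    while True:
        time.sleep(interval_secs)
        current = file_digest(path)
        if has_changed(last, current):
            os.kill(pid, sig)
            last = current
```

Hashing the content avoids reacting to rewrites that leave the credentials unchanged. Sending SIGTERM assumes a supervisor (for example, the container runtime) restarts the process afterwards.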

In a separate open issue, it is mentioned that Vector should be able to pick this up via SIGHUP and that AWS's credential_process configuration should help, but we have not seen this work. Everything with respect to the AWS secrets is pretty standard. The contents of the secret directory look like this:

/vector# ls -la /secrets/aws/
total 28
drwxr-xr-x 2 1000 root 180 Sep 11 15:24 .
drwxrwxrwt 3 root root 200 Sep 11 15:24 ..
-rw-r--r-- 1 1000 root  72 Sep 11 15:24 config
-rw-r--r-- 1 1000 root 546 Sep 11 15:24 credentials
-rw-r--r-- 1 1000 root 581 Sep 11 15:24 credentials.json
-rw-r--r-- 1 1000 root  20 Sep 11 15:24 IAM_AWS_ACCESS_KEY_ID
-rw-r--r-- 1 1000 root  40 Sep 11 15:24 IAM_AWS_SECRET_ACCESS_KEY
-rw-r--r-- 1 1000 root 416 Sep 11 15:24 IAM_AWS_SESSION_TOKEN
-rw-r--r-- 1 1000 root  78 Sep 11 15:24 metadata

The credentials file itself looks like this:

/vector# cat /secrets/aws/credentials
[default]
aws_access_key_id=[redacted]
aws_secret_access_key=[redacted]
aws_session_token=[redacted]

And the configuration file looks like this:

/vector# cat /secrets/aws/config
[profile default]
credential_process = cat /secrets/aws/credentials.json

The credentials.json file, which is used by credential_process, follows the AWS specification, so cat should yield the output described in the AWS documentation:

{
  "Version": 1,
  "AccessKeyId": "an AWS access key",
  "SecretAccessKey": "your AWS secret access key",
  "SessionToken": "the AWS session token for temporary credentials", 
  "Expiration": "ISO8601 timestamp when the credentials expire"
}  
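To rule out a malformed file, the output can be checked against the shape shown above. The following is only a sketch: the function names are hypothetical, and it assumes that Version must be 1, that AccessKeyId and SecretAccessKey are required, and that Expiration, when present, is an ISO 8601 timestamp.

```python
import json
from datetime import datetime, timezone
from typing import List, Optional


def parse_expiration(value: str) -> datetime:
    """Parse a timestamp such as 2023-08-17T03:04:06Z into an aware datetime."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Treat a timestamp without an offset as UTC (an assumption).
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def check_credentials(raw: str, now: Optional[datetime] = None) -> List[str]:
    """Return a list of problems found in credential_process output (empty if none)."""
    now = now or datetime.now(timezone.utc)
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(doc, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    if doc.get("Version") != 1:
        problems.append("Version must be 1")
    for key in ("AccessKeyId", "SecretAccessKey"):
        if not doc.get(key):
            problems.append(f"missing {key}")

    expiration = doc.get("Expiration")
    if expiration is not None:
        if not isinstance(expiration, str):
            problems.append("Expiration is not a string")
        else:
            try:
                if parse_expiration(expiration) <= now:
                    problems.append("credentials have expired")
            except ValueError:
                problems.append("Expiration is not an ISO 8601 timestamp")
    return problems
```

Running `check_credentials(Path("/secrets/aws/credentials.json").read_text())` inside the pod (with `from pathlib import Path`) would show whether the file itself is the problem or whether the process is holding on to older credentials.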

One thing to note is that this all began after we upgraded to Vector 0.31.0 (we are now on a later version, but the problem persists). Is there something we're missing here? This all seems standard, and Vector should be able to handle changes to credential files automatically. Thanks!

Configuration

sinks:
  security_logs:
    type: aws_s3
    inputs:
      - security_output_cleanup
    bucket: [redacted]
    key_prefix: [redacted]
    compression: gzip
    region: [redacted]
    encoding:
      codec: json
    healthcheck:
      enabled: false
    batch:
      max_bytes: 20971520
      max_events: 500
      timeout_secs: 300
    buffer:
      type: disk
      max_size: 5000000000 #5GB

Version

0.32.1

Debug Output

No response

Example Data

No response

Additional Context

The contents of the AWS credentials.json file (note the expiration date):

/vector# cat /secrets/aws/credentials.json
{"AccessKeyId":"[REDACTED]","SecretAccessKey":"[REDACTED]","SessionToken":"[REDACTED]","Expiration":"2023-08-17T03:04:06Z","Version":1}

Logs from a manual SIGHUP on the same pod (note that the timestamps are earlier than the expiration date above):

{"timestamp":"2023-08-16T14:23:47.304714Z","level":"INFO","message":"Signal received.","signal":"SIGHUP","target":"vector::signal"}
{"timestamp":"2023-08-16T14:23:47.381175Z","level":"INFO","message":"Datadog API key provided. Integration with Datadog Observability Pipelines is enabled.","target":"vector::config::enterprise"}
{"timestamp":"2023-08-16T14:23:47.381227Z","level":"INFO","message":"Reloading running topology with new configuration.","target":"vector::topology::running"}
{"timestamp":"2023-08-16T14:23:47.405028Z","level":"INFO","message":"Attempting to report configuration to Datadog Observability Pipelines.","target":"vector::config::enterprise"}
{"timestamp":"2023-08-16T14:23:47.406248Z","level":"INFO","message":"Running healthchecks.","target":"vector::topology::running"}
{"timestamp":"2023-08-16T14:23:47.406465Z","level":"INFO","message":"New configuration loaded successfully.","target":"vector::topology::running"}
{"timestamp":"2023-08-16T14:23:47.406689Z","level":"INFO","message":"Starting journalctl.","target":"vector::sources::journald","span":{"component_id":"systemd","component_kind":"source","component_name":"systemd","component_type":"journald","name":"source"},"spans":[{"component_id":"systemd","component_kind":"source","component_name":"systemd","component_type":"journald","name":"source"}]}
{"timestamp":"2023-08-16T14:23:47.445069Z","level":"INFO","message":"Vector has reloaded.","path":"[File(\"config.yaml\", Some(Yaml))]","target":"vector"}
{"timestamp":"2023-08-16T14:23:47.482756Z","level":"INFO","message":"Vector config 097fd97c001853214b2a44269051bdf7ef482e905c45ec678d3fb781d12cd064 successfully reported to Datadog Observability Pipelines.","target":"vector::config::enterprise"}
{"timestamp":"2023-08-16T14:27:45.023107Z","level":"ERROR","message":"Non-retriable error; dropping the request.","error":"service error","internal_log_rate_limit":true,"target":"vector::sinks::util::retries","span":{"request_id":110,"name":"request"},"spans":[{"component_id":"security_logs","component_kind":"sink","component_name":"security_logs","component_type":"aws_s3","name":"sink"},{"request_id":110,"name":"request"}]}
{"timestamp":"2023-08-16T14:27:45.023214Z","level":"ERROR","message":"Service call failed. No retries or retries exhausted.","error":"Some(ServiceError(ServiceError { source: PutObjectError { kind: Unhandled(Unhandled { source: Error { code: Some(\"ExpiredToken\"), message: Some(\"The provided token has expired.\"), request_id: Some(\"[redacted]\"), extras: {\"s3_extended_request_id\": \"[redacted]\"} } }), meta: Error { code: Some(\"ExpiredToken\"), message: Some(\"The provided token has expired.\"), request_id: Some(\"[redacted]\"), extras: {\"s3_extended_request_id\": \"[redacted]"} } }, raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amz-request-id\": \"[redacted]\", \"x-amz-id-2\": \"[redacted]\", \"content-type\": \"application/xml\", \"transfer-encoding\": \"chunked\", \"date\": \"Wed, 16 Aug 2023 14:27:44 GMT\", \"server\": \"AmazonS3\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<?xml version=\\\"1.0\\\" encoding=\\\"UTF-8\\\"?>\\n<Error><Code>ExpiredToken</Code><Message>The provided token has expired.</Message><Token-0>[redacted]</Token-0><RequestId>[redacted]</RequestId><HostId>[redacted]</HostId></Error>\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. }) } }))","request_id":110,"error_type":"request_failed","stage":"sending","internal_log_rate_limit":true,"target":"vector_common::internal_event::service","span":{"request_id":110,"name":"request"},"spans":[{"component_id":"security_logs","component_kind":"sink","component_name":"security_logs","component_type":"aws_s3","name":"sink"},{"request_id":110,"name":"request"}]}
{"timestamp":"2023-08-16T14:27:45.023287Z","level":"ERROR","message":"Events dropped","intentional":false,"count":1,"reason":"Service call failed. No retries or retries exhausted.","internal_log_rate_limit":true,"target":"vector_common::internal_event::component_events_dropped","span":{"request_id":110,"name":"request"},"spans":[{"component_id":"security_logs","component_kind":"sink","component_name":"security_logs","component_type":"aws_s3","name":"sink"},{"request_id":110,"name":"request"}]}
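The timeline in these logs can be checked mechanically. The sketch below only uses timestamps copied from the output above; the variable names are made up for illustration.

```python
from datetime import datetime


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp ending in 'Z' into an aware datetime."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


# Timestamps copied from the credentials.json file and the log lines above.
sighup_at = parse_utc("2023-08-16T14:23:47.304714Z")         # "Signal received."
expired_error_at = parse_utc("2023-08-16T14:27:45.023107Z")  # first ExpiredToken error
file_expiration = parse_utc("2023-08-17T03:04:06Z")          # Expiration in credentials.json

# The error happened after the reload completed ...
error_after_reload = expired_error_at > sighup_at
# ... and while the credentials on disk were still valid.
file_creds_still_valid = expired_error_at < file_expiration
remaining = file_expiration - expired_error_at

print(error_after_reload, file_creds_still_valid, remaining)
```

Both comparisons hold, and the credentials on disk still had more than twelve hours of validity when S3 rejected the token. This suggests the sink was still using credentials loaded before the refresh, even after the SIGHUP reload.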

References

#12585

Metadata

    Labels

    type: bug (A code related bug.)
