Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

S3 GetObjectRequest flakily has empty RequestID #5141

Closed
christodenny opened this issue Jan 12, 2024 · 2 comments
Closed

S3 GetObjectRequest flakily has empty RequestID #5141

christodenny opened this issue Jan 12, 2024 · 2 comments
Assignees
Labels
bug This issue is a bug. closed-for-staleness p2 This is a standard priority issue response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days.

Comments

@christodenny
Copy link

christodenny commented Jan 12, 2024

Describe the bug

  • req.Send() flakily fails with RequestError: send request failed
  • sub error varies (we've seen read tcp 10.205.2.84:50600->52.216.106.252:443: read: connection reset by peer and dial tcp 52.92.136.58:443: i/o timeout
  • We already have an open support ticket with AWS for these errors, however they're asking for request IDs to correlate the failures with logs on their side
  • We do log the RequestID, but in these failures the RequestID field is empty

Expected Behavior

RequestID field is populated so we can log and attach to AWS support ticket

Current Behavior

During these failures the RequestID field is empty

Reproduction Steps

The following flakily repros. When i run it locally one off the req succeeds, but in prod across millions of requests it occasionally doesn't have RequestID populated:

var netTransport = &http.Transport{
	Dial: (&net.Dialer{
		Timeout: time.Duration(60) * time.Second,
	}).Dial,
	TLSHandshakeTimeout: time.Duration(60) * time.Second,
}
var httpClient = http.Client{
	Transport: netTransport,
}
params := &s3.GetObjectInput{
	Bucket: aws.String(bucket),
	Key:    aws.String(key),
}
req, resp := s3.New(session.Must(session.NewSession(&aws.Config{HTTPClient: &httpClient}))).GetObjectRequest(params)
err := req.Send()
if err != nil {
	log.Fatal("AWS request failed, requestID: ", req.RequestID, ", error: ", err.Error())
}

Possible Solution

I noticed locally editing this logic that if I log req.RequestID before calling req.Send() the RequestID is always empty.

Is this ID only populated after successfully contacting the S3 server? These failures to populate the RequestID are correlated with connection reset by peer and i/o timeout. I wonder if those errors imply something failed before we either generate a request ID or get one back from S3.

Additional Information/Context

No response

SDK version used

v1.43.39

Environment details (Version of Go (go version)? OS name and version, etc.)

go version go1.21.4 darwin/amd64

@christodenny christodenny added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jan 12, 2024
@RanVaknin
Copy link
Contributor

RanVaknin commented Jan 24, 2024

Hi @christodenny,

A couple of points.

  1. RequestId is a field present on the response, not request. This is kind of confusing, because the req.RequestID gets actuall deserialized from the response itself.

  2. We already have an open support ticket with AWS for these errors, however they're asking for request IDs to correlate the failures with logs on their side

The error you are seeing is a networking error indicating the server is closing the connection prematurely. This means that the server's repsonse is not going to make it to the client, and therefore the request object will not contain the requestID object.

Here is a visual aid:

package main

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
	"log"
)

func main() {
	sess, err := session.NewSession(&aws.Config{
		Region:   aws.String("us-east-1"),
		LogLevel: aws.LogLevel(aws.LogDebugWithHTTPBody),
	})
	if err != nil {
		panic(err)
	}

	params := &s3.GetObjectInput{
		Bucket: aws.String("fake-bucket"),
		Key:    aws.String("fake-key"),
	}

	req, _ := s3.New(sess).GetObjectRequest(params)
	err = req.Send()
	if err != nil {
		log.Fatal("AWS request failed, requestID: ", req.RequestID, ", error: ", err.Error())
	}
}

Output log:

---[ REQUEST POST-SIGN ]-----------------------------
GET /fake-key HTTP/1.1
Host: fake-bucket.s3.amazonaws.com
User-Agent: aws-sdk-go/1.49.4 (go1.19.1; darwin; arm64)
Authorization: AWS4-HMAC-SHA256 Credential=REDACTED/20240124/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=REDACTED
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20240124T205914Z
Accept-Encoding: gzip

---[ RESPONSE ]--------------------------------------
HTTP/1.1 403 Forbidden
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Wed, 24 Jan 2024 20:59:14 GMT
Server: AmazonS3
X-Amz-Id-2: tIyM/bpHMDysEvVR9HiCFqbTGGZxCzvNAvNCtzRcvhtbvqWW3LAiix715LCe3exn6+G/oS4eM+A=
X-Amz-Request-Id: 6MWJEHGZYD2CRTZW   <-- notice the requestID is returned in the response


-----------------------------------------------------
2024/01/24 12:59:14 <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>6MWJEHGZYD2CRTZW</RequestId><HostId>tIyM/bpHMDysEvVR9HiCFqbTGGZxCzvNAvNCtzRcvhtbvqWW3LAiix715LCe3exn6+G/oS4eM+A=</HostId></Error>

2024/01/24 12:59:14 AWS request failed, requestID: 6MWJEHGZYD2CRTZW, error: AccessDenied: Access Denied
        status code: 403, request id: 6MWJEHGZYD2CRTZW, host id: tIyM/bpHMDysEvVR9HiCFqbTGGZxCzvNAvNCtzRcvhtbvqWW3LAiix715LCe3exn6+G/oS4eM+A=

Because in your case the response is not a valid server response, instead it is a networking stack error, that information will not be present, and the SDK will not be able to deserialize that information from the response.

I'm not sure why the service is resetting the connection, but the service team might be able to use the user-agent header to identify and investigate your request.

Let me know how it goes.
Thanks,
Ran~

@RanVaknin RanVaknin reopened this Jan 24, 2024
@aws aws deleted a comment from github-actions bot Jan 24, 2024
@RanVaknin RanVaknin self-assigned this Jan 24, 2024
@RanVaknin RanVaknin added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Jan 24, 2024
Copy link

This issue has not received a response in 1 week. If you want to keep this issue open, please just leave a comment below and auto-close will be canceled.

@github-actions github-actions bot added closing-soon This issue will automatically close in 4 days unless further comments are made. closed-for-staleness and removed closing-soon This issue will automatically close in 4 days unless further comments are made. labels Jan 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug. closed-for-staleness p2 This is a standard priority issue response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days.
Projects
None yet
Development

No branches or pull requests

2 participants