Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Request Handling on Requests with Trailing Headers #8753

Draft
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

shirady
Copy link
Contributor

@shirady shirady commented Feb 3, 2025

Explain the changes

  1. Accept more types of content sha256 headers (STREAMING-UNSIGNED-PAYLOAD-TRAILER, STREAMING-AWS4-HMAC-SHA256-PAYLOAD-TRAILER), as without those headers on clients that add the checksum headers with trailing we would fail.
  2. Add a condition of if (header_items[1]) to make sure we won't run the split function on undefined value.
  3. Change the state of the machine and add more states to support the trailing headers.
  4. Replace buffer slice function with subarray as the function slice was deprecated (see reference).

Issues:

  1. There was an update with some of the AWS clients (for example in AWS SDK Go V2 as you can see in PR Add AWS SDK Go V2 Packages and a Test Script Using It  noobaa-operator#1521 in the comments I attached we were failing on PutObject and UploadPart), it is also reported in boto3 version in DFBUGS-1513.

List of GAPs:

  1. Currently, we move the state machine to STATE_WAIT_NL_DATA just to avoid the parsing from failing, but we need to improve it and ensure that it is the trailer that we expect to see according to the request header (else we should return this.error_state();).
  2. We have an implementation GAP about the checksum calculation as was reported in #S3 checksum support for validating object integrity #8487.

Testing Instructions:

  1. As mentioned in PR disable UDP for n2n connections #1521, for example: go run script/client_script.go -bucket fourteen.bucket -key test-key -mpu test-mpu-key -endpoint "https://localhost:12443" -disable-deletion and expect to see all tests passing:
Running a couple of tests using AWS SDK Go V2...
--------------------------------------------------
Running on configured endpoint https://localhost:12443

creating bucket fourteen.bucket
succeeded in create bucket fourteen.bucket.

put object test-key in bucket fourteen.bucket
succeeded in put object test-key bucket fourteen.bucket.

create multi part upload test-mpu-key in bucket fourteen.bucket
succeeded in create multipart upload test-mpu-key bucket fourteen.bucket.

upload part test-mpu-key in bucket fourteen.bucket
succeeded in upload part 1 test-mpu-key in bucket fourteen.bucket.

complete multi part test-mpu-key in bucket fourteen.bucket
succeeded in upload part test-mpu-key in bucket fourteen.bucket.
--------------------------------------------------
disable-deletion flag was set, will not operate delete commands
--------------------------------------------------

Total Tests: 5
Passing Tests:  5 [createBucket putObject createMultipartUpload uploadPart completeMultiPartUpload]
Failing Tests:  0 []
Skipped Tests:  0 []
  • Doc added/updated
  • Tests added

@shirady
Copy link
Contributor Author

shirady commented Feb 3, 2025

Attached printing from the logs as an example:

In the headers of the request we can see 'x-amz-trailer': 'x-amz-checksum-crc32'.

Feb-3 14:35:44.552 [Endpoint/596]   [LOG] core.endpoint.s3.s3_rest:: SDSD print headers (req.headers) { host: 'localhost:12443', 'user-agent': 'aws-sdk-go-v2/1.34.0 ua/2.1 os/macos lang/go#1.23.2 md/GOOS#darwin md/GOARCH#arm64 api/s3#1.74.0 m/E,Z', 'content-length': '58', 'accept-encoding': 'identity', 'amz-sdk-invocation-id': 'ba0504c2-9712-4960-920a-7d45d93b1139', 'amz-sdk-request': 'attempt=1; max=3', authorization: 'AWS4-HMAC-SHA256 Credential=75GCqPFLPiMNYLkp54WY/20250203/us-east-1/s3/aws4_request, SignedHeaders=accept-encoding;amz-sdk-invocation-id;amz-sdk-request;content-encoding;content-length;content-type;host;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length;x-amz-trailer, Signature=c94d61a052a669f7d84c5af8f3d2d8fdbbd32a5927720a04f9dbb10a683bdc73', 'content-encoding': 'aws-chunked', 'content-type': 'application/octet-stream', 'x-amz-content-sha256': 'STREAMING-UNSIGNED-PAYLOAD-TRAILER', 'x-amz-date': '20250203T143545Z', 'x-amz-decoded-content-length': '16', 'x-amz-trailer': 'x-amz-checksum-crc32' }

We added some printings in the function parse(buf) to see how we parse the Chunked HTTP request and how it moves to different state machines.

Feb-3 14:35:44.553 [Endpoint/596]   [LOG] CONSOLE:: SDSD JSON.stringify(buf.toString()) "10\r\n"
Feb-3 14:35:44.553 [Endpoint/596]   [LOG] CONSOLE:: SDSD header_items [ '10' ]
Feb-3 14:35:44.553 [Endpoint/596]   [LOG] CONSOLE:: SDSD this.state STATE_SEND_DATA
Feb-3 14:35:44.554 [Endpoint/596]   [LOG] CONSOLE:: SDSD JSON.stringify(buf.toString()) "body for example"
Feb-3 14:35:44.554 [Endpoint/596]   [LOG] CONSOLE:: SDSD this.state STATE_WAIT_CR_DATA
Feb-3 14:35:44.554 [Endpoint/596]   [LOG] CONSOLE:: SDSD JSON.stringify(buf.toString()) "\r\n0\r\n"
Feb-3 14:35:44.554 [Endpoint/596]   [LOG] CONSOLE:: SDSD header_items [ '0' ]
Feb-3 14:35:44.554 [Endpoint/596]   [LOG] CONSOLE:: SDSD this.state STATE_SEND_DATA
Feb-3 14:35:44.554 [Endpoint/596]   [LOG] CONSOLE:: SDSD JSON.stringify(buf.toString()) "x-amz-checksum-crc32:uOMGCw==\r\n"
Feb-3 14:35:44.554 [Endpoint/596]   [LOG] CONSOLE:: SDSD in STATE_CONTENT_END
Feb-3 14:35:44.554 [Endpoint/596]   [LOG] CONSOLE:: SDSD this.state STATE_CONTENT_END
Feb-3 14:35:44.554 [Endpoint/596]   [LOG] CONSOLE:: SDSD JSON.stringify(buf.toString()) "\r\n"
Feb-3 14:35:44.554 [Endpoint/596]   [LOG] CONSOLE:: SDSD in STATE_CONTENT_END
Feb-3 14:35:44.554 [Endpoint/596]   [LOG] CONSOLE:: SDSD this.state STATE_CONTENT_END

The client is using PutObject with body "body for example", it is 16 characters (decimal) in hexadecimal base it is 10.
Analyzing the HTTP Chunked Request.

  • 10\r\n (chunk header -> first chunk is 16 bytes)
  • body for example (content -> the chunk itself)
  • \r\n (end of chunk)
  • 0\r\n (end of request body)
  • x-amz-checksum-crc32:uOMGCw==\r\n (trailing header) -> we don't want to fail the request on this chunk (currently we don't validate it).

@shirady shirady requested review from guymguym and jackyalbo February 3, 2025 15:26
@shirady shirady self-assigned this Feb 3, 2025
@shirady shirady force-pushed the checksum-issue-checks branch from 057ec16 to d7251c3 Compare February 4, 2025 11:41
@pull-request-size pull-request-size bot added size/M and removed size/S labels Feb 4, 2025
…AD-TRAILER, STREAMING-AWS4-HMAC-SHA256-PAYLOAD-TRAILER)

Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
@shirady shirady force-pushed the checksum-issue-checks branch 2 times, most recently from a12235c to 385f83c Compare February 4, 2025 19:51
@shirady shirady requested a review from guymguym February 4, 2025 19:55
this.chunk_size -= content.length;
index += content.length - 1;
if (content.length) this.push(content);
if (!this.chunk_size) this.state = STATE_WAIT_CR_DATA;
if (!this.chunk_size) {
this.state = this.last_chunk ? STATE_READ_TRAILER_DATA : STATE_WAIT_CR_DATA;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
this.state = this.last_chunk ? STATE_READ_TRAILER_DATA : STATE_WAIT_CR_DATA;
this.state = this.last_chunk ? STATE_CHECK_TRAILER : STATE_WAIT_CR_DATA;

this.state = STATE_WAIT_NL_TRAILER;
break;
}
this.chunk_trailer_str = buf.toString('utf8', start_trailer, index);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since it might overlap a buffer, we need to append the string

Suggested change
this.chunk_trailer_str = buf.toString('utf8', start_trailer, index);
this.chunk_trailer_str += buf.toString('utf8', start_trailer, index);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also this needs to be outside the loop...

}
} else if (this.state === STATE_WAIT_NL_TRAILER) {
if (buf[index] !== NL_CODE) return this.error_state();
this.trailers.push(this.chunk_trailer_str);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also need to set chunk_trailer_str to '' here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would write like this

Suggested change
this.trailers.push(this.chunk_trailer_str);
if (this.chunk_trailer_str) {
this.trailers.push(this.chunk_trailer_str);
this.chunk_trailer_str = '';
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants