
Possible memory leak when uploading files using S3Client #4916

Closed
e200 opened this issue Jul 18, 2023 · 13 comments
Labels
guidance Question that needs advice or information.

Comments

@e200

e200 commented Jul 18, 2023

Describe the bug

This memory leak occurs when uploading content using the S3Client.

Expected Behavior

We expect the upload process not to allocate this much memory.

Current Behavior

If you upload 300 files of 2 MB each to S3 (both to a local S3-compatible endpoint and to actual AWS), the client allocates about 1490.07 MB in total and the memory is never freed.

The issue is within github.com/aws/aws-sdk-go/service/s3/s3manager.(*maxSlicePool).newSlice, as you can see in the following pprof output:

      flat  flat%   sum%        cum   cum%
 1490.07MB 70.29% 70.29%  1490.07MB 70.29%  github.com/aws/aws-sdk-go/service/s3/s3manager.(*maxSlicePool).newSlice
  601.01MB 28.35% 98.64%   601.01MB 28.35%  github.com/mazen160/go-random.Bytes
   12.38MB  0.58% 99.22%    15.90MB  0.75%  io.copyBuffer
    0.50MB 0.024% 99.24%  2102.08MB 99.16%  main.main.func1
         0     0% 99.24%    12.38MB  0.58%  bufio.(*Writer).ReadFrom
         0     0% 99.24%  1490.07MB 70.29%  github.com/aws/aws-sdk-go/service/s3/s3manager.(*maxSlicePool).Get
         0     0% 99.24%  1490.07MB 70.29%  github.com/aws/aws-sdk-go/service/s3/s3manager.(*uploader).nextReader
         0     0% 99.24%  1500.57MB 70.78%  github.com/aws/aws-sdk-go/service/s3/s3manager.(*uploader).upload
         0     0% 99.24%  1500.57MB 70.78%  github.com/aws/aws-sdk-go/service/s3/s3manager.Uploader.Upload (inline)
         0     0% 99.24%  1500.57MB 70.78%  github.com/aws/aws-sdk-go/service/s3/s3manager.Uploader.UploadWithContext

The same problem does not happen with the same codebase if your client is replaced with the minio client.

Reproduction Steps

Follow the instructions in the readme.md

https://github.com/e200/s3client_memory_leak
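
For readers who don't want to clone the repo, here is a rough, simplified sketch of the pattern the report describes: 300 sequential uploads of a 2 MiB payload through s3manager.Uploader. This is not the code from the linked repo (which serves uploads from an HTTP handler and uses go-random for the payload); the bucket and key names are placeholders.

package main

import (
	"bytes"
	"crypto/rand"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
	sess := session.Must(session.NewSession())
	uploader := s3manager.NewUploader(sess)

	// 2 MiB of random data, reused for every upload.
	payload := make([]byte, 2<<20)
	if _, err := rand.Read(payload); err != nil {
		log.Fatal(err)
	}

	for i := 0; i < 300; i++ {
		_, err := uploader.Upload(&s3manager.UploadInput{
			Bucket: aws.String("my-test-bucket"), // placeholder bucket
			Key:    aws.String(fmt.Sprintf("object-%d", i)),
			Body:   bytes.NewReader(payload),
		})
		if err != nil {
			log.Fatal(err)
		}
	}
}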

Possible Solution

Take a look at the slice pool. The issue happens for both synchronous and asynchronous uploads: memory is allocated for each upload and never freed.

Additional Information/Context

We were having this issue in production, and our pods were repeatedly running out of resources because of the memory allocated by the client. We use the client to cache some compiled content, and the problem only occurs on upload; downloads look normal. The allocated memory is never freed, or if it is, it takes too long to become available again.

The same code works as expected if the AWS SDK S3 client is replaced with the minio package, without any other changes (simply swapping the package and using its equivalent PutObject method).

SDK version used

v1.44.302

Environment details (Version of Go (go version)? OS name and version, etc.)

macOS Ventura 13.2.1, Go 1.20.6

@e200 e200 added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jul 18, 2023
@RanVaknin RanVaknin self-assigned this Jul 20, 2023
@RanVaknin RanVaknin added needs-review This issue or pull request needs review from a core team member. p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Jul 20, 2023
@RanVaknin
Contributor

Hi @e200,

Thanks for the in-depth repro steps. I have gotten similar results by running your code.
I'll add this to our backlog.

Thanks,
Ran~

@RanVaknin RanVaknin added queued This issues is on the AWS team's backlog and removed needs-review This issue or pull request needs review from a core team member. labels Jul 20, 2023
@e200
Author

e200 commented Aug 2, 2023

This bug is making it impossible to use S3. Our services are being restarted every single time we upload files due to the containers running out of resources.

@awaken

awaken commented Aug 2, 2023

@e200 you are absolutely right! We are in the same position. I think this should be fixed with the highest priority, otherwise the current stable version of the AWS SDK for Go is effectively unusable for accessing S3. Is that possible?

@e200
Author

e200 commented Aug 2, 2023

@awaken why the irony? I didn't mean to say that the AWS SDK is completely useless, only that the current S3 client is causing issues. It may not affect others, but it is certainly affecting us.

@awaken

awaken commented Aug 2, 2023

@e200 there is a misunderstanding here. I had no intention of making a joke; I was serious. As I said, I think the SDK is currently unusable for accessing S3. For any other use case, I agree with you, it should work. My intention was to draw attention to getting this issue resolved.

@RanVaknin
Contributor

Hi @e200 ,

Running the same profiler against v2 of the SDK didn't produce the same results. Until the team has bandwidth to prioritize investigating this, can you give v2 a try and see if it helps in the meantime?

Thanks,
Ran~
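
For reference, a minimal sketch (not from this thread) of what the equivalent upload looks like with v2 of the SDK (github.com/aws/aws-sdk-go-v2), using the feature/s3/manager package; the bucket, key, and payload below are placeholders.

package main

import (
	"bytes"
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/feature/s3/manager"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

func main() {
	ctx := context.Background()

	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}

	client := s3.NewFromConfig(cfg)
	uploader := manager.NewUploader(client)

	payload := bytes.Repeat([]byte("x"), 2<<20) // 2 MiB dummy payload

	_, err = uploader.Upload(ctx, &s3.PutObjectInput{
		Bucket: aws.String("my-test-bucket"), // placeholder bucket
		Key:    aws.String("object-0"),       // placeholder key
		Body:   bytes.NewReader(payload),
	})
	if err != nil {
		log.Fatal(err)
	}
}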

@e200
Author

e200 commented Aug 2, 2023

I wasn't aware of the new version. Will try... Thank you for the feedback.

@nikolaydimitrov

+1

@bagavp

bagavp commented Aug 30, 2023

@e200 Did v2 work?

@e200
Author

e200 commented Aug 31, 2023

@bagavp I didn't try it in the end.

I solved the issue on our side by making it an O(1) operation instead.

@baozaolaoba-top

@e200 I tried the v2 version; there is still a memory leak. Adding a defer works. Hope it helps you.

	body, err := os.ReadFile("test.txt")
	if err != nil {
		return nil, err
	}

	reader := bytes.NewReader(body)
	// Drop the reference to the file contents once the upload returns so the
	// buffer can be garbage collected.
	defer reader.Reset([]byte("")) // <-- here

	putObjectInput := &s3.PutObjectInput{
		Body:   reader,
		Bucket: aws.String(bucket),
		Key:    aws.String(filename),
	}

@lucix-aws
Contributor

lucix-aws commented Dec 26, 2023

The original analysis here is inaccurate.

The pprof allocs profile shows the amount of memory allocated over the lifetime of the application, not the current memory usage.

See the profile descriptions here: https://pkg.go.dev/runtime/pprof#Profile.

The heap profile gives us the information we're looking for. Running this after the 300th upload in your example:

$ curl http://localhost:8000/debug/pprof/heap > heap.mem
$ go tool pprof heap.mem 
Type: inuse_space
Time: Dec 26, 2023 at 1:52pm (EST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top10
Showing nodes accounting for 4151.64kB, 100% of 4151.64kB total
Showing top 10 nodes out of 25
      flat  flat%   sum%        cum   cum%
 2086.21kB 50.25% 50.25%  2086.21kB 50.25%  github.com/mazen160/go-random.Bytes
 1537.27kB 37.03% 87.28%  1537.27kB 37.03%  github.com/aws/aws-sdk-go/aws/endpoints.init
  528.17kB 12.72%   100%   528.17kB 12.72%  regexp.(*bitState).reset
         0     0%   100%   528.17kB 12.72%  github.com/aws/aws-sdk-go/aws/request.(*HandlerList).Run
         0     0%   100%   528.17kB 12.72%  github.com/aws/aws-sdk-go/aws/request.(*Request).Build
         0     0%   100%   528.17kB 12.72%  github.com/aws/aws-sdk-go/aws/request.(*Request).Send
         0     0%   100%   528.17kB 12.72%  github.com/aws/aws-sdk-go/aws/request.(*Request).Sign
         0     0%   100%   528.17kB 12.72%  github.com/aws/aws-sdk-go/service/s3.dnsCompatibleBucketName
         0     0%   100%   528.17kB 12.72%  github.com/aws/aws-sdk-go/service/s3.endpointHandler
         0     0%   100%   528.17kB 12.72%  github.com/aws/aws-sdk-go/service/s3.hostCompatibleBucketName

As we can see above, the only thing we're holding onto is the most recent 2MiB buffer used as the upload input.

Aside: the SDK allocates ~1500MiB over the lifetime of your upload loop because of how the underlying byte slice pool works. The pool grows and shrinks within the life of individual Uploads, so since you're making 300 calls in sequence, a new byte slice is allocated and eventually freed per call. If you increase the payload size or the number of uploads, the cumulative allocation attributed to newSlice will increase accordingly.
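
A standalone illustration of that distinction (plain Go, nothing SDK-specific): cumulative allocation grows with every iteration, while the live heap stays bounded as long as old buffers become unreachable.

package main

import (
	"fmt"
	"runtime"
)

func main() {
	var last []byte
	for i := 0; i < 300; i++ {
		buf := make([]byte, 2<<20) // 2 MiB, roughly one upload's worth
		buf[0] = byte(i)
		last = buf // only the most recent buffer stays reachable
	}

	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)

	// TotalAlloc keeps growing, like the allocs/alloc_space profile,
	// while HeapInuse reflects what is actually live, like inuse_space.
	fmt.Printf("TotalAlloc=%d MiB, HeapInuse=%d MiB\n", m.TotalAlloc>>20, m.HeapInuse>>20)
	runtime.KeepAlive(last)
}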


⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

@lucix-aws lucix-aws added guidance Question that needs advice or information. and removed bug This issue is a bug. p2 This is a standard priority issue queued This issues is on the AWS team's backlog labels Dec 26, 2023