storage: panic: slice bounds out of range in gRPCWriter.uploadBuffer #12227

@winterjung

Client

Storage gRPC Client

Description

While uploading a file using the cloud.google.com/go/storage SDK, a runtime error: slice bounds out of range panic occurred inside the gRPCWriter.uploadBuffer function. Because the panic originates in a goroutine created internally by the SDK, it is difficult to recover from at the application level.

Error Log

panic: runtime error: slice bounds out of range [:-199229440]

goroutine 29284526 [running]:
cloud.google.com/go/storage.(*gRPCWriter).uploadBuffer(0xc1e9b00240, 0x856f5a, 0xc000000, 0x1)
	/go/pkg/mod/cloud.google.com/go/[email protected]/grpc_client.go:2123 +0xbcd
cloud.google.com/go/storage.(*grpcStorageClient).OpenWriter.func1()
	/go/pkg/mod/cloud.google.com/go/[email protected]/grpc_client.go:1223 +0x130
created by cloud.google.com/go/storage.(*grpcStorageClient).OpenWriter in goroutine 150
	/go/pkg/mod/cloud.google.com/go/[email protected]/grpc_client.go:1185 +0x42e

Steps to Reproduce

The exact steps to reproduce are difficult to pinpoint. The error has not occurred in the last six months, even though our workload averages roughly 450 MB/s of uploads each day. The issue might be related to large file uploads or unstable network conditions.

Potential Problem Area and Hypothesis

According to the error log, the panic occurred at line 2123 in the cloud.google.com/go/[email protected]/grpc_client.go file:

// ...
		// Prepare chunk section for upload.
		data := toWrite[sent : sent+bytesToSendInCurrReq] // grpc_client.go:2123
// ...

It appears that the bounds sent : sent+bytesToSendInCurrReq used to slice toWrite were negative or otherwise out of range ([:-199229440]), presumably because the bytesToSendInCurrReq or sent variables held abnormal values.
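For reference, a negative high bound in a Go slice expression panics with exactly this message shape; a tiny standalone snippet with a hypothetical buffer reproduces it:

package main

// Standalone reproduction of the message shape only: slicing with a negative
// high bound panics with "slice bounds out of range [:-199229440]".
// The buffer size is arbitrary; only the bound value mirrors the report.
func main() {
	b := make([]byte, 8)
	high := -199229440
	_ = b[:high] // panic: runtime error: slice bounds out of range [:-199229440]
}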

Matching the arguments in the stack-trace frame against the uploadBuffer signature:

func (w *gRPCWriter) uploadBuffer(recvd int, start int64, doneReading bool) (*storagepb.Object, int64, error) {
goroutine 29284526 [running]:
cloud.google.com/go/storage.(*gRPCWriter).uploadBuffer(0xc1e9b00240, 0x856f5a, 0xc000000, 0x1)
	/go/pkg/mod/cloud.google.com/go/[email protected]/grpc_client.go:2123 +0xbcd

It seems that recvd was 0x856f5a = 8,744,794, start was 0xc000000 = 201,326,592 (192 MiB), and doneReading was true (0x1); the first value, 0xc1e9b00240, is the receiver pointer.

Relevant code:

// ...
sendBytes: // label this loop so that we can use a continue statement from a nested block
	for {
		bytesNotYetSent := recvd - sent
		remainingDataFitsInSingleReq := bytesNotYetSent <= maxPerMessageWriteSize

		if remainingDataFitsInSingleReq && doneReading {
			lastWriteOfEntireObject = true
		}

		// Send the maximum amount of bytes we can, unless we don't have that many.
		bytesToSendInCurrReq := maxPerMessageWriteSize
		if remainingDataFitsInSingleReq {
			bytesToSendInCurrReq = bytesNotYetSent
		}

		// Prepare chunk section for upload.
		data := toWrite[sent : sent+bytesToSendInCurrReq] // panic occurred here
// ...

Hypothesis:

  1. The recvd (received bytes) or sent (sent bytes) values might have been miscalculated for some reason, causing bytesNotYetSent to become negative. Consequently, bytesToSendInCurrReq could also become negative, leading to a panic when slicing toWrite.
  2. The sent value is calculated as writeOffset - start, and writeOffset is updated within the determineOffset function via queryProgress. During this process, writeOffset might be incorrectly set to a value greater than start + recvd. That would cause sent to exceed recvd, eventually making bytesNotYetSent and bytesToSendInCurrReq negative (see the sketch after this list).
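To make the second hypothesis concrete, below is a minimal, self-contained sketch with hypothetical numbers: the 2 MiB per-message limit and the inflated writeOffset are assumptions for illustration, not values taken from the client, and only start and recvd come from the stack trace. It shows how a writeOffset greater than start + recvd drives the slice bounds out of range:

package main

import "fmt"

// Hypothetical sketch of the arithmetic quoted above, not the client's code.
func main() {
	const maxPerMessageWriteSize = 2 * 1024 * 1024 // assumed per-message limit, for illustration only

	var (
		start       int64 = 0xc000000 // 201,326,592 (192 MiB), from the stack trace
		recvd             = 0x856f5a  // 8,744,794 bytes buffered, from the stack trace
		writeOffset       = start + int64(recvd) + 1024 // hypothetical: offset reported beyond start+recvd
	)

	sent := int(writeOffset - start) // 8,745,818: already larger than recvd
	bytesNotYetSent := recvd - sent  // -1024
	bytesToSendInCurrReq := maxPerMessageWriteSize
	if bytesNotYetSent <= maxPerMessageWriteSize {
		bytesToSendInCurrReq = bytesNotYetSent // -1024
	}

	toWrite := make([]byte, recvd)
	fmt.Println(sent, bytesNotYetSent, bytesToSendInCurrReq)
	// Panics here: the bounds are inverted once sent exceeds recvd.
	_ = toWrite[sent : sent+bytesToSendInCurrReq]
}

The bound in the reported panic ([:-199229440]) has a different magnitude, but the failure mode is the same: slice bounds derived from sent and bytesToSendInCurrReq that no longer fit the toWrite buffer.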

Regarding Panic Recovery

As seen in lines 1183-1185 of grpc_client.go, the SDK creates its own goroutine for the write operation:

// ...
	// This function reads the data sent to the pipe and sends sets of messages
	// on the gRPC client-stream as the buffer is filled.
	go func() { // grpc_client.go:1185
		defer close(params.donec)
// ...

Because this goroutine is created inside the SDK, the package caller cannot wrap the call in a recover block to handle such panics: recover only catches a panic raised in the same goroutine. Is there any recommended way to recover from this type of panic when it originates from a goroutine managed internally by the SDK?
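For what it's worth, here is a minimal, self-contained illustration (it does not use the storage client; the function name and panic message are made up) of why a deferred recover in the calling goroutine cannot catch a panic raised in a goroutine spawned by the callee:

package main

import "fmt"

// libraryCall stands in for an SDK call that spawns its own goroutine.
func libraryCall() {
	done := make(chan struct{})
	go func() {
		defer close(done)
		panic("panic inside the library-owned goroutine")
	}()
	<-done
}

func main() {
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("recovered:", r) // never reached: recover only sees panics in its own goroutine
		}
	}()
	libraryCall() // the unrecovered panic in the child goroutine still terminates the whole process
}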

Environment Information

  • Docker (multi-stage) on AWS EKS
    • build image: golang:1.23-bookworm
    • runtime image: gcr.io/distroless/base-debian12
  • Go version: 1.23
  • go.mod
    • cloud.google.com/go/storage v1.48.0 (per the stack trace paths)
    • google.golang.org/api v0.210.0
    • google.golang.org/grpc v1.67.1


Labels

  • api: storage (Issues related to the Cloud Storage API)
  • priority: p2 (Moderately-important priority. Fix may not be included in next release.)
  • type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
