Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent leaking file descriptor during snapshotting and provide better logging of errors #19093

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 30 additions & 9 deletions client/v3/snapshot/v3_snapshot.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
import (
"context"
"crypto/sha256"
"errors"
"fmt"
"io"
"os"
Expand Down Expand Up @@ -54,38 +55,58 @@
if err != nil {
return "", err
}
defer cli.Close()
defer func() {
err = cli.Close()
if err != nil {
lg.Error("Failed to close client", zap.Error(err))

Check warning on line 61 in client/v3/snapshot/v3_snapshot.go

View check run for this annotation

Codecov / codecov/patch

client/v3/snapshot/v3_snapshot.go#L61

Added line #L61 was not covered by tests
}
}()

partpath := dbPath + ".part"
defer os.RemoveAll(partpath)
defer func() {
err = os.RemoveAll(partpath)
if err != nil {
lg.Error("Failed to cleanup .part file", zap.Error(err))

Check warning on line 69 in client/v3/snapshot/v3_snapshot.go

View check run for this annotation

Codecov / codecov/patch

client/v3/snapshot/v3_snapshot.go#L69

Added line #L69 was not covered by tests
}
}()

var f *os.File
f, err = os.OpenFile(partpath, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, fileutil.PrivateFileMode)
f, err := os.OpenFile(partpath, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, fileutil.PrivateFileMode)
if err != nil {
return "", fmt.Errorf("could not open %s (%w)", partpath, err)
}
defer func() {
err = f.Close()
if err != nil && !errors.Is(err, os.ErrClosed) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works. But I think a better and more explicit way is to use sync.Once to ensure f.Close is only executed once. So that it's well aware that we need to close the file before renaming, but also need to ensure it's closed in case early return due to any error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked https://pkg.go.dev/os#File.Close and the api can guarantee it can be executed only once. sync.Once is not neccesary here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big deal, please feel free to merge this PR and backport to 3.5 and 3.4.

lg.Error("Could not close file descriptor", zap.Error(err))

Check warning on line 80 in client/v3/snapshot/v3_snapshot.go

View check run for this annotation

Codecov / codecov/patch

client/v3/snapshot/v3_snapshot.go#L80

Added line #L80 was not covered by tests
}
}()
lg.Info("created temporary db file", zap.String("path", partpath))

start := time.Now()
resp, err := cli.SnapshotWithVersion(ctx)
if err != nil {
return "", err
}
defer resp.Snapshot.Close()
defer func() {
err = resp.Snapshot.Close()
if err != nil {
lg.Error("Could not close snapshot stream", zap.Error(err))

Check warning on line 93 in client/v3/snapshot/v3_snapshot.go

View check run for this annotation

Codecov / codecov/patch

client/v3/snapshot/v3_snapshot.go#L93

Added line #L93 was not covered by tests
}
}()
lg.Info("fetching snapshot", zap.String("endpoint", cfg.Endpoints[0]))
var size int64
size, err = io.Copy(f, resp.Snapshot)
if err != nil {
return resp.Version, err
return resp.Version, fmt.Errorf("could not write snapshot: %w", err)

Check warning on line 100 in client/v3/snapshot/v3_snapshot.go

View check run for this annotation

Codecov / codecov/patch

client/v3/snapshot/v3_snapshot.go#L100

Added line #L100 was not covered by tests
}
if !hasChecksum(size) {
return resp.Version, fmt.Errorf("sha256 checksum not found [bytes: %d]", size)
}
if err = fileutil.Fsync(f); err != nil {
return resp.Version, err
return resp.Version, fmt.Errorf("could not fsync snapshot: %w", err)

Check warning on line 106 in client/v3/snapshot/v3_snapshot.go

View check run for this annotation

Codecov / codecov/patch

client/v3/snapshot/v3_snapshot.go#L106

Added line #L106 was not covered by tests
}
if err = f.Close(); err != nil {
return resp.Version, err
return resp.Version, fmt.Errorf("could not close file descriptor: %w", err)

Check warning on line 109 in client/v3/snapshot/v3_snapshot.go

View check run for this annotation

Codecov / codecov/patch

client/v3/snapshot/v3_snapshot.go#L109

Added line #L109 was not covered by tests
}
lg.Info("fetched snapshot",
zap.String("endpoint", cfg.Endpoints[0]),
Expand All @@ -98,5 +119,5 @@
return resp.Version, fmt.Errorf("could not rename %s to %s (%w)", partpath, dbPath, err)
}
lg.Info("saved", zap.String("path", dbPath))
return resp.Version, nil
return resp.Version, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just curious about using err here. nil is more accurate

}
Loading