Implement zlib compression #1487
Conversation
…kwards compatibility
* rows: implement driver.RowsColumnTypeScanType (implementation for time.Time not yet complete!)
* rows: implement driver.RowsColumnTypeNullable
* rows: move fields related code to fields.go
* fields: use NullTime for nullable datetime fields
* fields: make fieldType its own type
* rows: implement driver.RowsColumnTypeDatabaseTypeName
* fields: fix copyright year
* rows: compile time interface implementation checks
* rows: move tests to versioned driver test files
* rows: cache parseTime in resultSet instead of mysqlConn
* fields: fix string and time types
* rows: implement ColumnTypeLength
* rows: implement ColumnTypePrecisionScale
* rows: fix ColumnTypeNullable
* rows: ColumnTypes tests part1
* rows: use keyed composite literals in ColumnTypes tests
* rows: ColumnTypes tests part2
* rows: always use NullTime as ScanType for datetime
* rows: avoid errors through rounding of time values
* rows: remove parseTime cache
* fields: remove unused scanTypes
* rows: fix ColumnTypePrecisionScale implementation
* fields: sort types alphabetically
* rows: remove ColumnTypeLength implementation for now
* README: document ColumnType Support
AWS Aurora returns error 1290 after failing over, requiring the connection to be closed and reopened before writes can be performed.
Most forks aren't registered with goveralls, so this command in travis.yml previously failed and broke the build. Now it doesn't.
* Drop support for Go 1.6 and lower
* Remove cloneTLSConfig for legacy Go versions
…#623)
* Added support for custom string types.
* Add author name
* Added license header
* Added a newline to force a commit.
* Remove newline.
* Also add conversions for additional types in ConvertValue (ref golang/go@d7c0de9)
* Fixed broken import for appengine/cloudsql: the import path in appengine.go has changed from appengine/cloudsql to google.golang.org/appengine/cloudsql.
* Added my name to the AUTHORS
* Differentiate between BINARY and CHAR

  When looking up the database type name, we now check the character set for the following field types: CHAR, VARCHAR, BLOB, TINYBLOB, MEDIUMBLOB, LONGBLOB.

  If the character set is 63 (the binary pseudo character set), we return the binary names, which are (respectively): BINARY, VARBINARY, BLOB, TINYBLOB, MEDIUMBLOB, LONGBLOB.

  If any other character set is in use, we return the text names, which are (again, respectively): CHAR, VARCHAR, TEXT, TINYTEXT, MEDIUMTEXT, LONGTEXT.

  To facilitate this, mysqlField has been extended to include a uint8 field for the character set, which is read from the appropriate packet. Column type tests have been updated to ensure coverage of binary and text types.

* Increase test coverage for column types
* Fix prepared statements

  When there are many args and maxAllowedPacket is not enough, writeExecutePacket() attempted to use STMT_LONG_DATA even for 0-byte strings. But writeCommandLongData() doesn't support 0-byte data, so a malformed packet was sent. This commit loosens the threshold for using STMT_LONG_DATA.

* Change minimum size of LONG_DATA to 64 bytes
* Add test which reproduces issue 730
* TestPreparedManyCols tests only the numParams = 65535 case
* s/as possible//
Are you planning to make this work for both zlib and zstd? Maybe instead of …
In the next release, no. This is the API in my head:

```go
import _ "github.com/go-sql-driver/mysql/zstd" // enable zstd compression
```

I will use https://github.com/klauspost/compress as the zstd library.
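For concreteness, here is a minimal sketch of how that blank-import idea could work: the subpackage's init function registers a zstd compressor with the parent driver. RegisterCompressor is a hypothetical hook, not an existing API of go-sql-driver/mysql; klauspost/compress is the library named above.

```go
// Package zstd enables zstd compression for go-sql-driver/mysql via a blank import.
package zstd

import (
	"github.com/go-sql-driver/mysql"
	kzstd "github.com/klauspost/compress/zstd"
)

func init() {
	// One shared encoder; EncodeAll is documented as safe for concurrent use.
	enc, err := kzstd.NewWriter(nil)
	if err != nil {
		panic(err)
	}
	// RegisterCompressor is a hypothetical registration hook on the driver.
	mysql.RegisterCompressor("zstd", func(dst, src []byte) []byte {
		return enc.EncodeAll(src, dst)
	})
}
```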
@methane it is OK to not include that now. But I think the options and/or API should be designed with multiple compression algorithms in mind. Using an import to enable zstd would make it a compile-time option; maybe some applications need this as a runtime option.
Why do you think a runtime option is needed? I don't like thousands of options. Once I understand that an option to disable zstd even when it's available is needed by more than 1% of users, I will add it.
@methane I agree that reducing options is in general better for users and may also remove complexity for developers. However, there are some cases where the user may want to choose the compression algorithm. To make things worse, there is also a compression level option in the protocol. As an example, this is what …
I'm not saying we should do the same, especially as … Situations where a runtime option would be needed/useful:
None of your use cases changes my mind.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (16)
compress.go (1)

`56-56`: Fix typo in comment: 'chuecksum' should be 'checksum'

Apply this diff to correct the typo:

```diff
- err = zr.Close() // zr.Close() may return chuecksum error.
+ err = zr.Close() // zr.Close() may return checksum error.
```

packets.go (2)
Line range hint `432-438`: Ensure sequence synchronization by calling `mc.syncSequence()` after writing command packets

In `writeCommandPacket`, `mc.syncSequence()` is not called after `mc.writePacket(data)`. This may lead to sequence number mismatches when compression is enabled.

Apply this diff to synchronize sequences:

```diff
 func (mc *mysqlConn) writeCommandPacket(command byte) error {
 	// Reset Packet Sequence
 	mc.resetSequence()

 	data, err := mc.buf.takeSmallBuffer(4 + 1)
 	if err != nil {
 		return err
 	}

 	// Add command byte
 	data[4] = command

 	// Send CMD packet
-	return mc.writePacket(data)
+	err = mc.writePacket(data)
+	mc.syncSequence()
+	return err
 }
```
Line range hint `470-480`: Ensure sequence synchronization by calling `mc.syncSequence()` after writing command packets

Similarly, in `writeCommandPacketUint32`, consider calling `mc.syncSequence()` after writing the packet to maintain proper sequence numbers when compression is enabled.

Apply this diff to synchronize sequences:

```diff
 func (mc *mysqlConn) writeCommandPacketUint32(command byte, arg uint32) error {
 	// Reset Packet Sequence
 	mc.resetSequence()

 	data, err := mc.buf.takeSmallBuffer(4 + 1 + 4)
 	if err != nil {
 		return err
 	}

 	// Add command byte
 	data[4] = command

 	// Add arg [32 bit]
 	data[5] = byte(arg)
 	data[6] = byte(arg >> 8)
 	data[7] = byte(arg >> 16)
 	data[8] = byte(arg >> 24)

 	// Send CMD packet
-	return mc.writePacket(data)
+	err = mc.writePacket(data)
+	mc.syncSequence()
+	return err
 }
```

compress_test.go (2)
`37-51`: Simplify error handling logic

The error handling can be made more explicit and clearer. Consider this improvement:

```diff
 func uncompressHelper(t *testing.T, mc *mysqlConn, compressedPacket []byte) []byte {
 	conn := new(mockConn)
 	conn.data = compressedPacket
 	mc.netConn = conn

 	uncompressedPacket, err := mc.readPacket()
-	if err != nil {
-		if err != io.EOF {
-			t.Fatalf("non-nil/non-EOF error when reading contents: %s", err.Error())
-		}
+	if err != nil && err != io.EOF {
+		t.Fatalf("unexpected error when reading contents: %v", err)
 	}
 	return uncompressedPacket
 }
```
`59-119`: Consider adding edge cases to test suite

The test suite is comprehensive but could benefit from additional edge cases:
- Empty packet (0 bytes)
- Maximum packet size (16MB)
- Packets with special characters or null bytes
- Packets with repeated patterns that test compression efficiency
Example additional test cases:
```diff
 tests := []struct {
 	uncompressed []byte
 	desc         string
 }{
+	{uncompressed: []byte{},
+		desc: "empty packet"},
+	{uncompressed: make([]byte, 1<<24-1),
+		desc: "max packet size"},
+	{uncompressed: bytes.Repeat([]byte{0}, 1000),
+		desc: "null bytes"},
+	{uncompressed: []byte("Hello\x00World\n\r\t"),
+		desc: "special characters"},
 	// ... existing test cases ...
 }
```

const.go (1)
`14-14`: Improve documentation for debug constant

The debug constant's purpose and impact should be better documented. Consider adding:
- What debugging information is exposed when enabled
- Security implications of enabling it in production
- Where the debug output is directed
```diff
-	debug = false // for debugging. Set true only in development.
+	debug = false // Controls detailed logging of compression operations.
+	              // WARNING: Setting this to true in production may impact
+	              // performance and expose sensitive information in logs.
+	              // Use only during development.
```

buffer.go (2)
`18-21`: Enhance readerFunc documentation

The documentation for readerFunc could be more comprehensive.

```diff
-// readerFunc is a function that compatible with io.Reader.
-// We use this function type instead of io.Reader because we want to
-// just pass mc.readWithTimeout.
+// readerFunc is a function type that implements the reading behavior of io.Reader.
+// It is used instead of io.Reader interface to allow direct usage of
+// connection-specific read methods (like readWithTimeout) without wrapper types.
+// This improves performance by avoiding interface allocations and indirection.
```
Line range hint `46-85`: Consider adding error documentation for fill method

The fill method should document possible error types it can return.

```diff
-func (b *buffer) fill(need int, r readerFunc) error {
+// fill reads into the buffer until at least need bytes are available.
+// It returns io.ErrUnexpectedEOF if EOF is encountered before need bytes are read,
+// io.EOF if exactly need bytes are read before EOF, or any other error encountered
+// during reading.
+func (b *buffer) fill(need int, r readerFunc) error {
```

infile.go (1)
Line range hint `1-24`: Consider adding compression-related documentation

Since this file handles data transmission, it would be helpful to add documentation about how compression affects the LOCAL INFILE protocol. A sketch of such a note follows.
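As a sketch, the note could live as a doc comment near the file's send path; the wording below is illustrative, not taken from the driver (handleInFileRequest is the file's existing entry point):

```go
// handleInFileRequest streams the contents of a LOCAL INFILE request to the
// server in chunks. Note on compression: the chunks are written through the
// same writePacket path as all other packets, so when compress=true has been
// negotiated they are framed and zlib-compressed transparently; callers must
// not pre-compress the file contents.
```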
connection_test.go (1)
`22-22`: Consider adding compression-specific test cases

While the buffer initialization changes are correct, we should add test cases specifically for compression scenarios (see the sketch after the list below).
Consider adding tests for:
- Compressed packet handling
- Sequence synchronization with compression enabled
- Edge cases with large compressed packets
Also applies to: 42-42, 69-69, 86-86, 102-102, 164-164, 181-181
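A hedged sketch of the first suggested case, reusing the compressHelper/uncompressHelper helpers shown earlier in this review; the connection setup helper is hypothetical:

```go
func TestCompressedPacketRoundTrip(t *testing.T) {
	// newTestMysqlConn is a hypothetical helper that returns a *mysqlConn
	// with compression enabled and a mock net.Conn attached.
	mc := newTestMysqlConn(t)

	payload := bytes.Repeat([]byte{0x42}, 4096)

	// Compress via the write path, then feed the result back through the
	// read path and check that the payload survives the round trip.
	compressed := compressHelper(t, mc, payload)
	got := uncompressHelper(t, mc, compressed)
	if !bytes.Equal(got, payload) {
		t.Fatalf("round trip mismatch: got %d bytes, want %d", len(got), len(payload))
	}
}
```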
packets_test.go (1)
Line range hint `276-306`: Consider consolidating buffer resets

The test has multiple buffer reset operations that could be consolidated into a helper function to improve maintainability.

```diff
+func resetTestBuffer(mc *mysqlConn) {
+	mc.buf = newBuffer()
+	mc.sequence = 0
+}
+
 // reset
 conn.closed = false
 conn.reads = 0
-mc.sequence = 0
-mc.buf = newBuffer()
+resetTestBuffer(mc)
```

benchmark_test.go (1)
`70-72`: Consider adding compression level benchmarks

While the basic compression benchmark is good, consider adding benchmarks with different compression levels to help users optimize their configuration.

```diff
+func BenchmarkQueryCompressionLevels(b *testing.B) {
+	levels := []int{1, 3, 6, 9} // Different compression levels
+	for _, level := range levels {
+		b.Run(fmt.Sprintf("level_%d", level), func(b *testing.B) {
+			benchmarkQueryHelper(b, true) // TODO: Add compression level parameter
+		})
+	}
+}
```

dsn.go (1)
`542-546`: Consider improving the error message for invalid compression values.

The error handling is correct, but the message could be more helpful by explicitly stating the valid values.

```diff
-	return errors.New("invalid bool value: " + value)
+	return errors.New("invalid value for compress: " + value + " (valid values: true, false, 1, 0)")
```

utils.go (1)
`493-503`: Consider enhancing the documentation for 24-bit integer functions.

While the implementation is correct, the documentation could be more detailed to explain the byte order and usage context.

```diff
-// 24bit integer: used for packet headers.
+// putUint24 writes a 24-bit integer in little-endian byte order.
+// It is used for MySQL packet headers where the first 3 bytes represent
+// the payload length.
 func putUint24(data []byte, n int) {
 	data[2] = byte(n >> 16)
 	data[1] = byte(n >> 8)
 	data[0] = byte(n)
 }

+// getUint24 reads a 24-bit integer in little-endian byte order.
+// It is used for MySQL packet headers where the first 3 bytes represent
+// the payload length.
 func getUint24(data []byte) int {
 	return int(data[2])<<16 | int(data[1])<<8 | int(data[0])
 }
```
README.md (2)

`41-41`: Fix markdown list indentation.

The feature list item should maintain consistent indentation with other items.

```diff
-  * Supports zlib compression.
+* Supports zlib compression.
```

🧰 Tools
🪛 Markdownlint (0.37.0)

41-41: Expected: 0; Actual: 2
Unordered list indentation (MD007, ul-indent)
`273-279`: Add language identifier to the code block.

The code block should specify the language for proper syntax highlighting.

````diff
-```
+```properties
 Type:           bool
 Valid Values:   true, false
 Default:        false
````

🧰 Tools
🪛 Markdownlint (0.37.0)

273-273: null
Fenced code blocks should have a language specified (MD040, fenced-code-language)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)

- .github/workflows/test.yml (1 hunks)
- AUTHORS (2 hunks)
- README.md (2 hunks)
- benchmark_test.go (7 hunks)
- buffer.go (5 hunks)
- compress.go (1 hunks)
- compress_test.go (1 hunks)
- connection.go (3 hunks)
- connection_test.go (7 hunks)
- connector.go (2 hunks)
- const.go (1 hunks)
- driver_test.go (3 hunks)
- dsn.go (4 hunks)
- infile.go (1 hunks)
- packets.go (13 hunks)
- packets_test.go (8 hunks)
- utils.go (1 hunks)
🧰 Additional context used
🪛 Markdownlint (0.37.0)
README.md
41-41: Expected: 0; Actual: 2
Unordered list indentation (MD007, ul-indent)
273-273: null
Fenced code blocks should have a language specified (MD040, fenced-code-language)
🪛 golangci-lint (1.62.2)

compress_test.go

20-20: Error return value of rand.Read is not checked (errcheck)

compress.go

171-171: Error return value is not checked (errcheck)

driver_test.go

182-182: Error return value of db.Exec is not checked (errcheck)
190-190: Error return value of db2.Exec is not checked (errcheck)
198-198: Error return value of db3.Exec is not checked (errcheck)
🔇 Additional comments (18)
connection.go (2)
`31-41`: LGTM: Addition of compression-related fields in the mysqlConn struct

The added fields compIO, compressSequence, and compress correctly integrate compression handling into the connection struct.
`67-85`: Proper implementation of timeout handling in readWithTimeout and writeWithTimeout

The functions correctly set read and write deadlines based on the configuration, enhancing robustness.
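For readers outside the diff, a minimal sketch of what such deadline handling typically looks like, assuming the driver's exported Config.ReadTimeout/WriteTimeout fields and a plain net.Conn (the actual methods live in connection.go and may differ in detail):

```go
// Inside the driver package; mc.cfg is the parsed *Config.
func (mc *mysqlConn) readWithTimeout(b []byte) (int, error) {
	// Arm the read deadline, if one is configured, before each read.
	if to := mc.cfg.ReadTimeout; to > 0 {
		if err := mc.netConn.SetReadDeadline(time.Now().Add(to)); err != nil {
			return 0, err
		}
	}
	return mc.netConn.Read(b)
}

func (mc *mysqlConn) writeWithTimeout(b []byte) (int, error) {
	// Symmetric handling for writes.
	if to := mc.cfg.WriteTimeout; to > 0 {
		if err := mc.netConn.SetWriteDeadline(time.Now().Add(to)); err != nil {
			return 0, err
		}
	}
	return mc.netConn.Write(b)
}
```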
.github/workflows/test.yml (1)
`86-86`: LGTM: Increasing max_allowed_packet to 48MB

The increase is appropriate to support larger packet sizes required by the zlib compression implementation.
compress_test.go (2)
`18-22`: Handle error returned by rand.Read

The function does not check the error returned by rand.Read. Ignoring this error can lead to unexpected behavior if the random number generator fails.
Apply this fix:
```diff
 func makeRandByteSlice(size int) []byte {
 	randBytes := make([]byte, size)
-	rand.Read(randBytes)
+	if _, err := rand.Read(randBytes); err != nil {
+		panic(fmt.Sprintf("failed to generate random bytes: %v", err))
+	}
 	return randBytes
 }
```
🧰 Tools
🪛 golangci-lint (1.62.2)
20-20: Error return value of rand.Read is not checked (errcheck)
`24-35`: Add sequence number synchronization

The function should call mc.syncSequenceNr() after writing the packet to maintain proper sequence numbering, as mentioned in previous reviews.
Apply this fix:
```diff
 func compressHelper(t *testing.T, mc *mysqlConn, uncompressedPacket []byte) []byte {
 	conn := new(mockConn)
 	mc.netConn = conn

 	err := mc.writePacket(append(make([]byte, 4), uncompressedPacket...))
 	if err != nil {
 		t.Fatal(err)
 	}
+	mc.syncSequenceNr()

 	return conn.written
 }
```
Likely invalid or redundant comment.
infile.go (1)
`175-175`: Verify sequence synchronization with compression enabled

The addition of syncSequence() is essential for maintaining proper packet sequencing when compression is enabled. However, we should verify its behavior in edge cases.
✅ Verification successful
Based on the search results, I can now generate the final response:
Sequence synchronization is correctly implemented
The syncSequence() call at line 175 is correctly placed and follows the established pattern in the codebase. Key observations:

- It's called after writing the final packet but before reading the response, consistent with other implementations
- The pattern writePacket() followed by syncSequence() is used in multiple places for maintaining proper packet ordering
- The compression-enabled path properly handles sequence synchronization through the writePackets abstraction
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```sh
#!/bin/bash
# Description: Check for potential race conditions or edge cases in sequence handling
# Look for other places where sequence synchronization might be needed

# Search for packet write operations that might need sequence sync
rg -A 3 "writePacket|readPacket"

# Look for compression-related sequence handling
ast-grep --pattern 'syncSequence()'
```
Length of output: 5728
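In code form, the recurring pattern those observations describe looks like this (names as used throughout this review; treat it as illustrative):

```go
// Last packet of a command: write, then re-align the compressed and
// plain-protocol sequence counters before reading the server's response.
err = mc.writePacket(data)
mc.syncSequence()
// ... read the response packet(s) ...
```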
AUTHORS (1)
`24-24`: LGTM! Author entries are properly formatted and ordered
The new author entries follow the file's guidelines and are correctly placed in alphabetical order.
Also applies to: 66-66
connector.go (2)
`169-172`: LGTM! Proper compression initialization
The compression initialization correctly checks both the client configuration and server capabilities before enabling compression. This follows the MySQL protocol specification for compression negotiation.
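A hedged sketch of that negotiation (clientCompress mirrors the protocol's CLIENT_COMPRESS capability flag and exists in the driver's const.go; the constructor name is an assumption):

```go
// Enable the compressed protocol only when the DSN asked for it AND the
// server's capability flags advertise CLIENT_COMPRESS.
if mc.cfg.compress && mc.flags&clientCompress != 0 {
	mc.compress = true
	mc.compIO = newCompIO(mc) // assumed constructor for the compressed I/O wrapper
}
```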
`130-130`: Verify buffer initialization without netConn
The buffer is now initialized without the network connection. Ensure that error handling for buffer operations is properly implemented throughout the codebase.
✅ Verification successful
Let me gather more context about the buffer implementation and usage to make a more informed decision.
Let me check how the buffer is used with the network connection and error handling patterns.
From the gathered information, I can now provide the final response:
Buffer initialization change is safe and well-implemented
The change to initialize the buffer without netConn is actually an improvement in design. The buffer struct is a simple wrapper around byte slices with proper initialization in newBuffer(). Error handling is comprehensive throughout the codebase, as evidenced by the numerous error checks found. The buffer operations are properly decoupled from the network connection, with errors consistently propagated up the call stack (a structural sketch follows the script log below).
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```sh
#!/bin/bash
# Search for error handling patterns around buffer operations
rg -A 3 "mc\.buf\." | grep -E "if.*err|return.*err"
```
Length of output: 536
Script:
```sh
#!/bin/bash
# Check buffer implementation and its error handling
ast-grep --pattern 'func newBuffer() $_' -A 10

# Find buffer read/write operations
rg -A 3 "mc\.buf\.(Read|Write|Flush)"
```
Length of output: 711
Script:
```sh
#!/bin/bash
# Check buffer usage with network connection
rg -A 5 "type buffer struct"
rg -A 5 "func \(b \*buffer\) (Read|Write|fill)"

# Check how errors are propagated from buffer operations
ast-grep --pattern 'if err := $_.buf.$_; err != nil {
  $$$
}'
```
Length of output: 687
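For orientation, a minimal structural sketch consistent with what the scripts above searched for (`func newBuffer()` and `type buffer struct`); the exact field names are assumptions, see buffer.go for the real layout:

```go
// buffer is a plain byte-slice wrapper with no reference to the network
// connection; reads are funneled through the readerFunc handed to fill.
type buffer struct {
	buf       []byte // unread portion of the current read buffer
	cachedBuf []byte // spare backing array, reused between fills
}

// newBuffer allocates the initial backing array; no net.Conn is needed.
func newBuffer() buffer {
	return buffer{cachedBuf: make([]byte, defaultBufSize)}
}
```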
packets_test.go (1)
Line range hint `101-117`: LGTM! Consistent buffer initialization in tests
The buffer initialization has been updated consistently with the new pattern across test cases while maintaining the original test logic.
benchmark_test.go (1)
`49-54`: LGTM! Clean compression parameter handling
The compression parameter is cleanly integrated into the DSN string construction while maintaining backward compatibility.
dsn.go (3)
`76-79`: LGTM! Well-structured addition of the compress field.
The placement in the unexported fields section and documentation are appropriate.
`130-136`: LGTM! Clean implementation of the EnableCompression option.
The function follows the established functional options pattern consistently.
`310-312`: LGTM! Consistent DSN formatting for compression.
The implementation follows the established pattern for boolean parameters in the DSN string.
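An illustrative round trip for the new key (ParseDSN and FormatDSN are existing driver APIs; host and credentials are placeholders):

```go
cfg, err := mysql.ParseDSN("user:pw@tcp(127.0.0.1:3306)/db?compress=true")
if err != nil {
	log.Fatal(err)
}
// The compress=true flag survives the parse/format round trip.
fmt.Println(cfg.FormatDSN())
```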
driver_test.go (4)
`150-154`: Improved error message format for connection failures.
The error message now includes the DSN in the error output, which helps with debugging connection issues.
`167-174`: New test case setup for compression feature.

Added a test connection with compression enabled via the compress=true DSN parameter. This aligns with the PR objective of implementing zlib compression.
`195-201`: Test coverage for compression feature.
Good addition of test coverage for the new compression feature. The test ensures that the same test cases work with compression enabled.
🧰 Tools
🪛 golangci-lint (1.62.2)
198-198: Error return value of db3.Exec is not checked (errcheck)
`981-983`: Improved error handling in datetime tests.
The changes add proper error handling for row scanning operations in datetime tests.
Also applies to: 988-990
@shogo82148 Do you have time to review this within this year?
LGTM
Description
Implemented the SQL compression protocol. This new feature is enabled by either of:

- `compress=true` in the DSN
- `cfg.Apply(Compress(True))`
A new PR to revive, rebase, and complete #649.
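A hedged usage sketch of the two activation paths above. Go spells booleans lowercase, so the description's Compress(True) corresponds to Compress(true) in code; the option name follows the PR text (the review also mentions EnableCompression), so check the final API before copying:

```go
package main

import (
	"database/sql"
	"log"

	"github.com/go-sql-driver/mysql"
)

func main() {
	// Path 1: DSN parameter (importing the package registers the "mysql" driver).
	db, err := sql.Open("mysql", "user:pw@tcp(127.0.0.1:3306)/db?compress=true")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Path 2: functional option on a Config, per the PR description.
	cfg := mysql.NewConfig()
	cfg.User, cfg.Net, cfg.Addr, cfg.DBName = "user", "tcp", "127.0.0.1:3306", "db"
	if err := cfg.Apply(mysql.Compress(true)); err != nil {
		log.Fatal(err)
	}
	connector, err := mysql.NewConnector(cfg)
	if err != nil {
		log.Fatal(err)
	}
	db2 := sql.OpenDB(connector)
	defer db2.Close()
}
```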
Checklist
Summary by CodeRabbit
- `compress` parameter for compression modes.