Skip to content

Conversation

@tanji
Copy link
Collaborator

@tanji tanji commented Jan 3, 2026

Summary

  • standardize env prefixes and precedence, plus env-based cluster discovery and secret propagation
  • improve runtime defaults with non-root working dir and key path fallback, and guard Cloud18 config read
  • refine mail init behavior and add implementation docs/tests

Testing

  • go test -vet=off ./server

tanji and others added 30 commits December 26, 2025 09:13
…dules

Split the massive utils/dbhelper/dbhelper.go into 8 focused files organized
by functional responsibility:

- types.go (906 lines): All struct definitions and methods
- replication.go (1159 lines): Replication control operations
- status.go (430 lines): Server status and variable queries
- performance.go (363 lines): Performance Schema and query analysis
- binlog.go (314 lines): Binary log operations
- transaction.go (129 lines): Transaction control and locking
- connection.go (97 lines): Connection management
- dbhelper.go (31 lines): Package documentation

Benefits:
- Improved code organization and maintainability
- Better testability with focused modules
- Easier navigation and code discovery
- Clear separation of concerns
- Reduced cognitive load for developers

Original file preserved as reference (not committed).
All 162 exported functions categorized and extracted.
Zero compilation errors, full backward compatibility maintained.
Add DatabaseVendor interface with MySQL, MariaDB, and PostgreSQL implementations
to eliminate vendor-specific conditionals throughout the codebase.

- vendor.go (497 lines): interface and three vendor implementations
- VENDOR_USAGE.md: migration guide and usage examples

Encapsulates vendor differences in query syntax, replication commands, and
terminology. Enables easier testing, cleaner code, and simplified addition
of new database vendors.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Add comprehensive safety section to VENDOR_USAGE.md explaining that the vendor
layer is completely additive and does not modify existing code. Include safe
migration patterns and risk assessment guidance.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
…eterized queries

Phase 3 of dbhelper refactoring implements comprehensive SQL injection prevention:

Security Infrastructure (security.go):
- Add ValidateIdentifier, ValidateGTIDMode, ValidateBinlogFormat, ValidateChannel
- Add ValidateFilename, ValidateNumeric, ValidateBoolean validators
- Implement QuoteIdentifier with MySQL/PostgreSQL vendor support
- Create SafeQueryBuilder for complex parameterized query construction
- Add EscapeSingleQuotes for cases where parameterization is impossible

Parameterized Query Refactoring (25+ functions):
- performance.go: GetSampleQueryFromPFS, SetLongQueryTime, KillThread, KillQuery
- binlog.go: SetBinlogFormat, PurgeBinlogTo, SetMaxBinlogTotalSize
- transaction.go: SetRelayLogSpaceLimit, SetMaxConnections
- replication.go: SetGTIDSlavePos, SetMySQLGtidMode, SetEnforceGTIDConsistency
- mysql.go: HaveErrantTransactions now uses parameterized GTID values

Schema Management (schema.go):
- Extract table/schema functions from original monolithic file
- Add GetTables, SetEventStatus, AnalyzeTable, SetUserPassword
- Implement proper columnRow/indexRow structs for information_schema queries
- Maintain TODO markers for remaining SQL injection risks in legacy functions

Benchmark Enhancements (bench.go):
- Add ChecksumTable with identifier validation
- Implement InjectLongTrx, WriteConcurrent2, BenchCleanup
- Security-hardened benchmark infrastructure

Documentation (SECURITY_AUDIT.md):
- Comprehensive security audit documenting all changes
- Risk assessment: Critical to Low
- Testing recommendations and future enhancements
- Compatibility notes confirming no breaking changes

Risk Mitigation Strategy:
- DDL statements that cannot be parameterized use strict validation + escaping
- USE statements validated with ValidateIdentifier + quoted
- SET STATEMENT syntax numerically validated before concatenation
- Dynamic vendor-specific queries use validated sources only

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
…ze PostgreSQL GetPrivileges

- Add ValidateIdentifier checks for old_user_name, user_host, new_user_name in DuplicateUserPassword
- Replace string concatenation with QuoteMySQLIdentifier in SHOW GRANTS query
- Parameterize PostgreSQL GetPrivileges queries using $1 placeholder (both IPv6 and IPv4 paths)
- MySQL GetPrivileges already correctly uses ? placeholders

Addresses security review findings for actual exploitable vulnerabilities.
Both functions now properly validate/parameterize user-provided identifiers.

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
…L queries

- SetMaxConnections: convert connections string → int64
- SetRelayLogSpaceLimit: convert size string → int64
- SetLongQueryTime: convert querytime string → float64
- KillThread: convert id string → int64
- KillQuery: convert id string → int64

MariaDB/MySQL expect proper types for these variables. While ValidateNumeric
ensures the string format is correct, the database driver may not auto-convert
strings to integers/floats for SET GLOBAL statements. This change explicitly
converts validated strings to proper types using strconv before passing to
parameterized queries.

Fixes type mismatch errors when setting numeric database variables.

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
…v parsing

Remove ValidateNumeric() calls in SetLongQueryTime, KillThread, KillQuery, and SetMaxConnections since strconv.ParseFloat/ParseInt already validate numeric input and return errors for invalid values
Removes duplicate conditional block that was appending IGNORE_DOMAIN_IDS parameter twice to the change master command
Replace string concatenation with parameterized queries in GetBinlogEventPseudoGTID,
GetBinlogPosAfterSkipNumberOfEvents, and GetNumberOfEventsAfterPos. Add input validation
for binlog filenames and positions to prevent injection attacks. Also fix typo in
PostgreSQL error message.
Add concise file-level documentation comments to all 14 files in utils/dbhelper/
explaining the purpose and responsibilities of each file. Documentation includes:

- arbitration.go: split-brain arbitration and heartbeat management
- bench.go: database connection benchmarking utilities
- binlog.go: binary log operations and configuration
- connection.go: database connection helpers and address resolution
- map.go: thread-safe concurrent map wrappers
- mysql.go: MySQL-specific utilities
- performance.go: Performance Schema queries and analysis
- replication.go: replication control (GTID, slave/master management)
- schema.go: table/schema/user/event management
- security.go: SQL injection prevention utilities
- status.go: server status queries and monitoring
- transaction.go: transaction control and InnoDB settings
- types.go: data structure definitions
- vendor.go: database vendor abstraction layer

Improves godoc readability and helps developers quickly understand each file's purpose.
Add extensive test coverage for the dbhelper package focusing on:

**Security & Validation Tests** (security_test.go):
- 11 test functions with 115+ test cases
- SQL injection prevention for identifiers, filenames, formats
- Input validation for GTID modes, channels, numeric values
- Safe query builder and identifier quoting tests

**Binary Log Tests** (binlog_test.go):
- Binlog filename validation and safety checks
- Binary log format validation
- Purge and flush operations
- Metadata map management

**Vendor Abstraction Tests** (vendor_test.go):
- MySQL, MariaDB, PostgreSQL vendor detection
- GTID support across versions
- Replication terminology (Master/Slave vs Source/Replica)
- Query building for different database types

**Server Status Tests** (status_test.go):
- Variable source detection (performance_schema vs information_schema)
- Status variable retrieval
- Process list queries
- InnoDB status parsing

**Connection Tests** (connection_test.go):
- TCP and Unix socket address generation
- DNS resolution and validation
- Database connection helpers

**Replication Tests** (replication_test.go):
- Slave/replica command generation
- GTID mode validation
- Replication parameter validation
- CHANGE MASTER command generation

**Transaction Tests** (transaction_test.go):
- Table locking commands
- Read-only mode settings
- Global variable configuration
- Sync settings validation

**Test Infrastructure**:
- Shared test helpers (testhelpers.go)
- Graceful test skipping when DB unavailable
- Comprehensive testing documentation (README_TESTING.md)

All tests pass with focus on validation logic and command generation.
Integration tests skip gracefully when database connection unavailable.
Add MySQL driver import to testhelpers.go to enable database
connections in integration tests. This allows tests to connect
to MySQL/MariaDB when TEST_DB_DSN is configured.
required mariadb-binlog utility is in client
@claude
Copy link

claude bot commented Jan 3, 2026

PR Review: Environment Config Handling and Runtime Defaults

Strengths

Environment Variable Standardization: Clear, consistent prefix scheme (REPLICATION_MANAGER_DEFAULT_*, REPLICATION_MANAGER_<CLUSTER>_*) improves usability.

Cluster Discovery from Env: Dynamic cluster detection from environment variables eliminates the need for explicit cluster configuration, useful for containerized deployments.

Non-root Friendly Defaults: Fallback to $HOME/.local/replication-manager/data for working directory is practical for non-root deployments.

Security Improvements:

  • Key path fallback mechanism avoids hardcoded paths (/home/repman)
  • SQL injection prevention in dbhelper (parametrized queries, identifier validation)
  • Nil guards added to ReadCloud18Config

Configuration Precedence: Well-documented precedence: config file < env vars < CLI flags, properly enforced.

Test Coverage: 118+ new test cases covering env overrides, precedence, and cluster discovery.

Documentation: Implementation docs clearly explain changes.

Issues & Concerns

⚠️ Duplicate Code in Config Processing (HIGH)

  • Lines 1545-1548 in server.go: DeprecatedKeys assignment is duplicated:
if clRead != nil {
    repman.DeprecatedKeys[clusterName] = repman.GetUsedAliasKeys(clRead, true)
}
clOrch := conf.ProvOrchestrator
if clRead != nil {
    repman.DeprecatedKeys[clusterName] = repman.GetUsedAliasKeys(clRead, true) // DUPLICATE
}

Remove the first occurrence.

⚠️ Cluster Discovery Efficiency (MEDIUM)

  • discoverClustersFromEnv() rebuilds suffix list on every call and iterates all environment variables O(n). Consider caching or using env var prefix filtering if performance matters with many env vars.

⚠️ Working Directory Creation Missing Error Handling (MEDIUM)

  • When conf.WorkingDir is set based on home directory fallback, the code doesn't verify the directory is actually created. Should add:
if err := os.MkdirAll(conf.WorkingDir, 0755); err != nil {
    // Handle error
}

⚠️ Mailer Guard Logic (MINOR)

  • isInitMailEnabled() and mailerConfigMissing() methods referenced but not shown in diff. Verify these properly check all required SMTP fields (host, port, user, password).

⚠️ Docker Changes (MINOR)

  • Removed cluster config file copies and mariadb-server package install in Docker builds. This is a breaking change—ensure it's intentional and documented. The new docker-compose.yml example is good but doesn't mirror the removed configs.

⚠️ Type Conversion Risk (MINOR)

  • dbhelper changes use ValidateIdentifier() and QuoteMySQLIdentifier() for SQL injection prevention. These are good, but ensure they handle all edge cases (backticks in identifiers, reserved keywords).

Test Coverage Gaps

⚠️ Missing Tests:

  • Non-root working directory fallback creation
  • Nil pointer handling in ReadCloud18Config (only guards added, no test shown)
  • Mailer initialization guard behavior with missing config
  • Docker compose multi-node cluster topology validation

Recommendations

  1. Fix duplicate code in cluster config processing immediately
  2. Add directory creation validation for non-root working dir fallback
  3. Document Docker breaking changes or restore removed configs
  4. Add tests for working directory creation and mailer guard conditions
  5. Consider caching env discovery results if cluster count is large

Overall Assessment

Quality: Good | Risk: Low-Medium

The PR improves operational flexibility and security significantly. Main concerns are code duplication and missing validation in the working directory fallback logic. These are straightforward fixes that don't require architectural changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants