Releases: Altinity/clickhouse-operator
release-0.23.5
Changed
- 'chi' label is now added to 'clickhouse_operator_*' metrics
- CHI annotations are now added as labels to 'chi_*' metrics
- clickhouse-operator Helm charts are moved from 'deploy/helm' to 'deploy/helm/clickhouse-operator'
Fixed
- Fixed a bug where nodes were not correctly excluded from remote_servers on replicated clusters with 4 or more shards, which could result in failed distributed queries and unnecessary reconciliation delays.
Full Changelog: release-0.23.4...release-0.23.5
release-0.23.4
Changed
- Allow adding pod labels in Helm chart by @bruno-s-coelho in #1369. Closes #1356
- The default service type has been changed from LoadBalancer to ClusterIP
- Cluster restart rules were fixed for some structured settings like 'logger/*'. Changing those settings no longer causes a pod restart.
- Operator does not fail if the ClickHouseKeeperInstallation resource type is missing in k8s
- Added 'clickhouse_operator_chi_reconciles_aborted' operator metric
NOTE: There was a lot of internal refactoring related to Keeper code, but no functional changes yet. The Keeper configuration functionality will be improved in the next major release.
Full Changelog: release-0.23.3...release-0.23.4
release-0.23.3
Changed
- Enabled parameterized databases propagation to shards and replicas in ClickHouse 22.12+. That includes support for Replicated and MySQL database engines. Closes #1076
- Removed object storage disks from DiskTotal/Free metrics, since those values are not meaningful for object storage
- Added ability to import packages with operator APIs by @dmvolod in #1229
- Introduced number of unchanged hosts in CHI status
Fixed
- Fixed an issue where stopping a cluster could take a long time. Closes #1346
- Fixed a bug where an inconsistent cluster definition could result in a crash. Closes #1319
- Fixed a bug where hosts-completed could be incorrectly reported in status when a reconcile was restarted midway
Full Changelog: release-0.23.2...release-0.23.3
release-0.23.2
What's Changed
- Fixed generation of environment variables for secrets, which could be incorrect in some cases. Closes #1344
- Golang is upgraded to 1.20. Closes CVEs in dependent libraries.
Full Changelog: release-0.23.1...release-0.23.2
release-0.23.1
Fixed
- Fixed generation of users that could be broken in some cases. Closes #1324 and #1332
- Fixed metrics-exporter that might fail to export metrics in some cases. Closes #1336
- Fixed Keeper examples
- Include installation of ClickhouseKeeperInstallations CRD in Helm chart readme by @echozio in #1330
Improved
- Updated grants example by @lesandie in #1333
- Upgrade ClickHouse version to 23.8-lts by @orginux in #1338
Full Changelog: release-0.23.0...release-0.23.1
release-0.23.0
Added
- Kubernetes secrets are now supported with the standard syntax for user passwords, configuration settings, and configuration files, for example:
  users:
    user1/password:
      valueFrom:
        secretKeyRef:
          name: clickhouse_secret
          key: pwduser1
  settings:
    s3/my_bucket/access_key:
      valueFrom:
        secretKeyRef:
          name: s3-credentials
          key: AWS_ACCESS_KEY_ID
  files:
    server.key:
      valueFrom:
        secretKeyRef:
          name: clickhouse-certs
          key: server.key
See updated Security Hardening Guide for more detail.
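Before applying a manifest that uses this syntax, it can help to list which Secrets it expects to exist. The following is a minimal sketch, not part of the operator: `secret_refs` is a hypothetical helper, and the dict simply mirrors the example above as it would look after parsing the YAML.

```python
def secret_refs(section):
    """Return (setting_path, secret_name, secret_key) for every secretKeyRef."""
    refs = []
    for path, value in section.items():
        if not isinstance(value, dict):
            continue  # plain values carry no secret reference
        ref = (value.get("valueFrom") or {}).get("secretKeyRef")
        if ref:
            refs.append((path, ref["name"], ref["key"]))
    return refs


# Same structure as the release-note example, written as a Python dict.
configuration = {
    "users": {
        "user1/password": {
            "valueFrom": {"secretKeyRef": {"name": "clickhouse_secret", "key": "pwduser1"}}
        }
    },
    "settings": {
        "s3/my_bucket/access_key": {
            "valueFrom": {"secretKeyRef": {"name": "s3-credentials", "key": "AWS_ACCESS_KEY_ID"}}
        }
    },
    "files": {
        "server.key": {
            "valueFrom": {"secretKeyRef": {"name": "clickhouse-certs", "key": "server.key"}}
        }
    },
}

# Collect references from all three sections and print them.
all_refs = [ref for section in configuration.values() for ref in secret_refs(section)]
for path, name, key in all_refs:
    print(f"{path}: secret={name} key={key}")
```

The printed list can then be compared by hand against the Secrets present in the target namespace.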
kind: ClickHouseKeeperInstallation
See examples here: https://github.com/Altinity/clickhouse-operator/tree/0.23.0/docs/chk-examples
The implementation is not final; the following still needs to be done:
- dynamic reconfiguration, which is required to support adding and removing Keeper replicas
- integration with ClickHouseInstallation, so that Keeper can be referenced by a reference instead of by a service
- CHI labels are now added to exported Prometheus metrics
Changed
- Services are now re-created if ServiceType is changed, in order to work around a Kubernetes issue. Closes #1302
- Operator now waits for ClickHouse service endpoints to respond when checking that a node is up.
- CHI templates are now automatically reloaded by the operator. Previously, templates were only reloaded during startup. A CHI update still needs to be triggered in order to apply the changes.
- Operator will now crash if the operator configuration is broken or cannot be parsed. That prevents a fallback to the defaults in case of errors.
Fixed
- Fixed schema propagation on new replicas for ClickHouse 23.11 and above
- Fixed data recovery when PVC is deleted by a user. Closes #1310
Improved
- Improve helm, update values.yaml to properly generate helm/README.md by @Slach in #1278
- Improve clickhouse-keeper manifests by @Slach in #1234
- chore: remove refs to deprecated io/ioutil by @testwill in #1273
- Update URL for accepted logging levels by @madrisan in #1270
- Add a chi example for sync users by @ccsxs in #1304
- Bump zookeeper operator version to 0.2.15 by @GrahamCampbell in #1303
- Optional values.rbac to deploy rbac resources by @Salec in #1316
- update helm chart generator to treat config.yaml as yaml in values by @echozio in #1317
Full Changelog: release-0.22.2...release-0.23.0
release-0.22.2
What's Changed
- Fixed a bug when operator did not restart ClickHouse pods if 'files' section was changed without 'config.d' destination, e.g. files/settings.xml.
- Fix ServiceMonitor endpoints #1276 by @MiguelNdeCarvalho, and #1290 by @muicoder. Closes #1287
- Disabled prefer_localhost_replica in default profile
Full Changelog: release-0.22.1...release-0.22.2
release-0.22.1
Added
- New 'Aborted' status for CHI is set when reconcile is aborted by an operator
Changed
- Allow shard weight to be zero (#1192 by maxistua)
- Removed excessive logging for pod update events
- Removed 30s delay after creating a service
- Allow empty values for CRD status and some other fields in order to facilitate migration from old operator versions that were upgraded without upgrading CRD first. Fixes #842, #890 and similar issues.
Full Changelog: release-0.22.0...release-0.22.1
release-0.22.0
Added
- Support volume re-provisioning. If volume is broken and PVC detects it as lost, operator re-provisions the volume
- When new CHI is created, all hosts are created in parallel
- Allow turning off waiting for running queries to complete. This can be done either in the operator configuration or in the CHI itself:
  In operator configuration:
    spec:
      reconcile:
        host:
          wait:
            queries: "false"
  In CHI:
    spec:
      reconciling:
        policy: nowait
- When changes are applied to clusters with a lot of shards, the change is first probed on the first shard only. If successful, it is then applied to up to 50% of shards concurrently. This can be configured in operator configuration:
  reconcile:
    # Reconcile runtime settings
    runtime:
      # Max number of concurrent CHI reconciles in progress
      reconcileCHIsThreadsNumber: 10
      # The operator reconciles shards concurrently in each CHI with the following limitations:
      #   1. Number of shards being reconciled (and thus having hosts down) in each CHI concurrently
      #      can not be greater than 'reconcileShardsThreadsNumber'.
      #   2. Percentage of shards being reconciled (and thus having hosts down) in each CHI concurrently
      #      can not be greater than 'reconcileShardsMaxConcurrencyPercent'.
      #   3. The first shard is always reconciled alone. Concurrency starts from the second shard and onward.
      # Thus limiting number of shards being reconciled (and thus having hosts down) in each CHI by both number and percentage
      # Max number of concurrent shard reconciles within one CHI in progress
      reconcileShardsThreadsNumber: 5
      # Max percentage of concurrent shard reconciles within one CHI in progress
      reconcileShardsMaxConcurrencyPercent: 50
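To see how the two limits in the comments above interact, here is a rough sketch in Python. It is an illustration only, not the operator's code: the function names are hypothetical, the percentage is assumed to round down with a floor of one shard, and fixed batches stand in for whatever scheduling the operator actually uses.

```python
def max_concurrent_shards(total_shards, threads_number=5, max_concurrency_percent=50):
    """Upper bound on shards reconciled at once, after the first shard.

    Assumption: the percentage limit rounds down and never drops below 1;
    the operator's real rounding may differ.
    """
    if total_shards <= 0:
        return 0
    by_percent = total_shards * max_concurrency_percent // 100
    # Both limits apply, so the stricter one wins.
    return max(1, min(threads_number, by_percent))


def reconcile_batches(total_shards, threads_number=5, max_concurrency_percent=50):
    """Group shard indexes into batches: shard 0 alone, then limited batches."""
    if total_shards <= 0:
        return []
    limit = max_concurrent_shards(total_shards, threads_number, max_concurrency_percent)
    batches = [[0]]  # rule 3: the first shard is always reconciled alone
    rest = list(range(1, total_shards))
    for start in range(0, len(rest), limit):
        batches.append(rest[start:start + limit])
    return batches


if __name__ == "__main__":
    for shards in (1, 4, 20):
        print(shards, max_concurrent_shards(shards), reconcile_batches(shards))
```

With the defaults shown above, a 20-shard cluster is capped by the thread count (5 shards at a time), while a 4-shard cluster is capped by the percentage (2 shards at a time).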
- Operator-related metrics are exposed to Prometheus now:
clickhouse_operator_chi_reconciles_started
clickhouse_operator_chi_reconciles_completed
clickhouse_operator_chi_reconciles_timings
clickhouse_operator_host_reconciles_started
clickhouse_operator_host_reconciles_completed
clickhouse_operator_host_reconciles_restarts
clickhouse_operator_host_reconciles_errors
clickhouse_operator_host_reconciles_timings
clickhouse_operator_pod_add_events
clickhouse_operator_pod_update_events
clickhouse_operator_pod_delete_events
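One way to check which of these metrics a given scrape actually exposes is to filter the Prometheus text output by the 'clickhouse_operator_' prefix. The sketch below is a hedged illustration: the helper name is hypothetical, and the sample text, its values, and the 'namespace' label are invented for the example rather than taken from real operator output.

```python
OPERATOR_PREFIX = "clickhouse_operator_"


def operator_metric_names(exposition_text):
    """Return the sorted, de-duplicated operator metric names found in a scrape."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first '{' (labels) or at whitespace (value).
        head = line.split("{", 1)[0].split()
        if head and head[0].startswith(OPERATOR_PREFIX):
            names.add(head[0])
    return sorted(names)


# Invented sample in Prometheus text exposition format.
SAMPLE = """\
# TYPE clickhouse_operator_chi_reconciles_started counter
clickhouse_operator_chi_reconciles_started 3
clickhouse_operator_pod_update_events{namespace="demo"} 12
go_goroutines 42
"""

if __name__ == "__main__":
    for name in operator_metric_names(SAMPLE):
        print(name)
```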
Changed
- fix typo in operator_installation_details.md by @seeekr in #1219
- Set operator release date for createdAt CSV field by @dmvolod in #1223
- Fix type for exclude and include fields in 70-chop-config.yaml example by @dmvolod in #1222
- change dashboard refresh rate 1m and add min_duration_ms, max_duration_ms dashboard variables, rename query_type to query_kind by @Slach in #1235
- add securityContext to helm chart by @farodin91 in #1255
- metrics-exporter collects all hosts and queries in parallel
Fixed
- Fixed a bug when operator could break multiple nodes if incorrect configuration has been deployed several times in a row
- Fixed a bug when schema could not be created on new nodes, if nodes took too long to start
- Fixed a bug when services were not reconciled in rare cases
New Contributors
- @seeekr made their first contribution in #1219
- @dmvolod made their first contribution in #1223
- @farodin91 made their first contribution in #1255
Full Changelog: release-0.21.3...release-0.22.0
release-0.21.3
Added
- Added selectors to CHITemplates. Now it is possible to define a template that is applied to certain CHI. Example here: https://github.com/Altinity/clickhouse-operator/blob/0.21.3/docs/chi-examples/50-CHIT-04-auto-template-volume-with-selector.yaml
- Added '.status.useTemplates' to reflect all templates used in CHI manually or auto
Changed
- CHITemplates are now re-loaded automatically without a need to restart operator. Changes in CHITemplates are not applied automatically to affected CHI.
Fixed
- Fix nil pointer deref in metrics exporter (#1187) by @zcross in #1188
- Migrate piechart plugin on Grafana Dashboard by @MiguelNdeCarvalho in #1190
- Fixed a permission error that sometimes occurred when deleting a Pod
New Contributors
- @MiguelNdeCarvalho made their first contribution in #1190
Full Changelog: release-0.21.2...release-0.21.3