To restore backups, kafka-backup requires an API that is
currently (June 2019) under review and that is also required by
Mirror Maker 2. At the time of writing it is planned for Kafka 2.4
(the current version is 2.2). With an older Kafka Connect the backup
still works, but the consumer offsets are not synced. This may be
acceptable in some cases; otherwise you need to compile Kafka Connect
2.4 yourself or use our provided Docker image.
Kafka Connect ships with only a small number of connectors. All
other connectors are added by putting the connector jar file on the
CLASSPATH, where the JVM can find the classes.
See Build and Run.
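If you run Kafka Connect outside of Docker, a minimal sketch of a standalone setup (the jar location and file names here are examples, not fixed paths):

```sh
# Make the kafka-backup classes visible to the Connect worker JVM
export CLASSPATH="$CLASSPATH:/opt/kafka-backup/kafka-backup.jar"

# Start a standalone worker with the sink configuration described below
./bin/connect-standalone.sh config/connect-standalone.properties \
    path/to/connect-backup-sink.properties
```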
Get the kafka-backup jar (see Build and Run) and
add it to your Kafka Connect Dockerfile. For example, if you use the
Confluent Docker images:
FROM confluentinc/cp-kafka-connect
COPY ./build/libs/kafka-backup.jar /etc/kafka-connect/jars
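Build the image as usual; the tag name below is just an example:

```sh
docker build -t my-kafka-connect-backup .
```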
Run the Connect image as documented (e.g. here).
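A hedged sketch of running the image built above; the exact set of required `CONNECT_*` environment variables depends on the image version (see the Confluent documentation), and all topic names, host names, and the image tag here are examples:

```sh
docker run -d --name kafka-connect-backup \
  -p 8083:8083 \
  -v /my/backup/dir:/my/backup/dir \
  -e CONNECT_BOOTSTRAP_SERVERS=my.kafka.cluster:9092 \
  -e CONNECT_GROUP_ID=connect-backup \
  -e CONNECT_CONFIG_STORAGE_TOPIC=connect-backup-configs \
  -e CONNECT_OFFSET_STORAGE_TOPIC=connect-backup-offsets \
  -e CONNECT_STATUS_STORAGE_TOPIC=connect-backup-status \
  -e CONNECT_REST_ADVERTISED_HOST_NAME=my.connect.server \
  my-kafka-connect-backup
```

Mounting `/my/backup/dir` into the container makes sure the backup files are written to the host and survive container restarts.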
Configure a Backup Sink Connector (e.g. create a file `connect-backup-sink.properties`):
name=backup-sink
connector.class=de.azapps.kafkabackup.sink.BackupSinkConnector
tasks.max=1
topics.regex=.*
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
header.converter=org.apache.kafka.connect.converters.ByteArrayConverter
target.dir=/my/backup/dir
# 1GiB
max.segment.size.bytes=1073741824
cluster.bootstrap.servers=localhost:9092
Name | Required? | Recommended Value | Comment |
---|---|---|---|
`name` | ✓ | `backup-sink` | A unique name identifying this connector job |
`connector.class` | ✓ | `de.azapps.kafkabackup.sink.BackupSinkConnector` | Must be this class to use kafka-backup |
`tasks.max` | ✓ | `1` | Must be `1`. Currently there is no support for multi-task backups |
`topics` | - | | Explicit, comma-separated list of topics to back up |
`topics.regex` | - | `.*` | Regex of topics to back up |
`key.converter` | ✓ | `org.apache.kafka.connect.converters.ByteArrayConverter` | Must be this class to interpret the data as bytes |
`value.converter` | ✓ | `org.apache.kafka.connect.converters.ByteArrayConverter` | Must be this class to interpret the data as bytes |
`header.converter` | ✓ | `org.apache.kafka.connect.converters.ByteArrayConverter` | Must be this class to interpret the data as bytes |
`target.dir` | ✓ | `/my/backup/dir` | Where to store the backup |
`max.segment.size.bytes` | ✓ | `1073741824` (1 GiB) | Maximum size of a backup file. When it is reached, a new file is created. No data is overwritten |
`cluster.bootstrap.servers` | ✓ | `my.kafka.cluster:9092` | `bootstrap.servers` property used to connect to the cluster to back up |
`cluster.*` | - | | Other consumer configuration options required to connect to the cluster (e.g. SSL settings) |
Start the Sink Connector.
Using curl (note that the Kafka Connect REST API expects the connector configuration as JSON, not as a Java properties file; a matching JSON payload is sketched below):
curl -X POST -H "Content-Type: application/json" \
    --data "@path/to/connect-backup-sink.json" \
    http://my.connect.server:8083/connectors
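A sketch of what `connect-backup-sink.json` could contain, mirroring the properties file above (the file name is an example; the format is the standard Connect REST payload):

```json
{
  "name": "backup-sink",
  "config": {
    "connector.class": "de.azapps.kafkabackup.sink.BackupSinkConnector",
    "tasks.max": "1",
    "topics.regex": ".*",
    "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "header.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "target.dir": "/my/backup/dir",
    "max.segment.size.bytes": "1073741824",
    "cluster.bootstrap.servers": "localhost:9092"
  }
}
```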
Using the Confluent CLI (which accepts the properties file directly):
confluent load backup-sink -d path/to/connect-backup-sink.properties
- Watch the Kafka Connect logs (e.g. `confluent log connect`).
- Watch the consumer lag of the sink connector. The consumer group is
  probably named `connect-backup-sink`. Use, for example,
  `kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group connect-backup-sink`
  to monitor it.
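The Connect REST API also reports the connector state; a quick check (host name as in the example above):

```sh
curl http://my.connect.server:8083/connectors/backup-sink/status
```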
Configure a Backup Source Connector (e.g. create a file `connect-backup-source.properties`):
name=backup-source
connector.class=de.azapps.kafkabackup.source.BackupSourceConnector
tasks.max=1
topics=topic1,topic2,topic3
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
header.converter=org.apache.kafka.connect.converters.ByteArrayConverter
source.dir=/my/backup/dir
batch.size=500
Name | Required? | Recommended Value | Comment |
---|---|---|---|
`name` | ✓ | `backup-source` | A unique name identifying this connector job |
`connector.class` | ✓ | `de.azapps.kafkabackup.source.BackupSourceConnector` | Must be this class to use kafka-backup |
`tasks.max` | ✓ | `1` | Must be `1`. Currently there is no support for multi-task restores |
`topics` | ✓ | `topic1,topic2,topic3` | A list of topics to restore. Currently only an explicit list of topics is supported |
`key.converter` | ✓ | `org.apache.kafka.connect.converters.ByteArrayConverter` | Must be this class to interpret the data as bytes |
`value.converter` | ✓ | `org.apache.kafka.connect.converters.ByteArrayConverter` | Must be this class to interpret the data as bytes |
`header.converter` | ✓ | `org.apache.kafka.connect.converters.ByteArrayConverter` | Must be this class to interpret the data as bytes |
`source.dir` | ✓ | `/my/backup/dir` | Location of the backup files |
`batch.size` | - | `500` | Number of messages to process in one batch |
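Start the Source Connector the same way as the sink connector, e.g. using the Confluent CLI with the file created above:

```sh
confluent load backup-source -d path/to/connect-backup-source.properties
```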
- Watch the Kafka Connect log for the message
  `All records read. Restore was successful`.
- Currently there is no other direct way to detect when the restore has finished.
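When the restore has finished, you may want to remove the source connector. A sketch using the standard Connect REST endpoint for deleting a connector (host name as in the examples above):

```sh
curl -X DELETE http://my.connect.server:8083/connectors/backup-source
```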