zicat/tributary

Welcome to Tributary

Tributary is a reliable, stateless, fault-tolerant service for efficiently collecting and moving huge amounts of records. It has a simple and flexible architecture based on streaming record flows.

Why Choose Tributary

Tributary is designed to provide reliability and isolation when multiple sinks consume one channel. For example, data from the same source can be sunk to both HDFS and Kafka to meet batch and stream computing requirements.

Persistently receiving and forwarding data to external systems while achieving mutual fault isolation is a highly challenging task.

Current mainstream solutions, such as Apache Flume, adopt a write amplification strategy to address isolation issues. Specifically, the received data is stored in multiple channels, and each sink consumes a separate channel.

Tributary adopts read amplification to address isolation issues and supports multiple sinks consuming the same channel based on GroupOffset.

Read amplification not only ensures fault isolation but also greatly reduces overhead, especially as the data volume and the number of sinks increase.
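
The idea of read amplification with per-group offsets can be illustrated with a minimal Python sketch. This is not Tributary's actual API: `SharedChannel`, `poll`, `commit`, and `drain` are hypothetical names chosen for illustration, and the in-memory list stands in for whatever storage the real channel uses. The sketch only shows that each record is stored once while every sink group advances its own offset independently.

```python
from dataclasses import dataclass, field


@dataclass
class SharedChannel:
    """Hypothetical append-only channel shared by all sink groups."""
    records: list = field(default_factory=list)
    group_offsets: dict = field(default_factory=dict)

    def append(self, record):
        # Each record is stored exactly once, however many sinks exist.
        self.records.append(record)

    def poll(self, group, max_records=10):
        # Read from the group's committed offset without advancing it.
        start = self.group_offsets.get(group, 0)
        return self.records[start:start + max_records]

    def commit(self, group, count):
        # Advance only this group's offset; other groups are unaffected.
        self.group_offsets[group] = self.group_offsets.get(group, 0) + count


def drain(channel, group, sink):
    """Deliver pending records to a sink and commit only what succeeded."""
    delivered = 0
    for record in channel.poll(group):
        try:
            sink(record)
        except ConnectionError:
            break  # a failing sink just stops advancing its own offset
        delivered += 1
    channel.commit(group, delivered)
    return delivered


def failing_hdfs_sink(record):
    # Simulates an unavailable external system (assumption for the demo).
    raise ConnectionError("HDFS unavailable")


channel = SharedChannel()
for i in range(3):
    channel.append(f"record-{i}")

kafka_out = []
drain(channel, "kafka", kafka_out.append)  # healthy sink consumes everything
drain(channel, "hdfs", failing_hdfs_sink)  # failing sink consumes nothing
```

In this sketch the failing `hdfs` group stays at offset 0 and can retry later, while the `kafka` group has already consumed all three records. Under a write amplification design the same three records would instead be copied into one channel per sink, so storage and write cost would grow with the number of sinks.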

Documentation

Tributary User Guide

Tributary Design Guide