- Munich, Germany
Stars
📙 Awesome Data Catalogs and Observability Platforms.
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance …
🎨 Diagram as Code for prototyping cloud system architectures
Apache Pinot - A realtime distributed OLAP datastore
Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
APM, Application Performance Monitoring System
A platform that makes it easy for developers to build realtime, cost-effective, operations-focused applications
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
A list of free datasets that provide streaming data
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
This is the Rust course used by the Android team at Google. It provides the material you need to quickly teach Rust.
High Performance Inter-Thread Messaging Library
An Open Standard for lineage metadata collection
A curated list of awesome distributed systems books, papers, resources and shiny things.
One of the 'BEST' markdown preview extensions for Atom editor!
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
A blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.
A cross-platform way to express data transformations, relational algebra, standardized record expressions, and plans.
🏆 Welcome to the wonderland of "AI" = f(DL, RL, DRL, ML, NLP, KG, MLOPS)
⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
A comprehensive reference for all topics related to Natural Language Processing
📖 A curated list of resources dedicated to Natural Language Processing (NLP)
Free MLOps course from DataTalks.Club
🐶 A tool to package, serve, and deploy any ML model on any platform. Archived to be resurrected one day🤞
Visually explore your JMH Benchmarks