DATA_INFRA.md

Reading

Distributed Frameworks

Warehouse

  • Redshift
  • Snowflake
  • BigQuery
  • Vertica

Data Lake

Change data capture / streaming

Data versioning

OLAP

Streaming

  • Kafka
  • Flink
  • Beam

Real-time / document search

  • Elasticsearch
  • Rockset

Notebooks / exploration

Labeling

Notebook management

Orchestration

Dataflow/transformation managers

  • StreamSets
  • NiFi

Unstructured / semi-structured data prep

Solutions

Data quality monitoring

Anomaly detection as a service

Event data

  • Segment
  • RudderStack
  • MetaRouter
  • Snowplow

A/B testing

Behavioral analytics

Feature flagging

Catalogue / metadata

BI

ETL / ingestion-as-a-service

Reverse ETL: piping data into sales ops, marketing ops, CRM, etc.

Sales analytics

Time series forecasting

Neat utils

  • Query CSVs with SQL: cq
  • jq
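
To illustrate the "query CSVs with SQL" idea from the list above, here is a minimal sketch that loads CSV text into an in-memory SQLite table and runs a query against it. It uses only the Python standard library and is not how cq itself works or is invoked; the `query_csv` helper, the `data` table name, and the sample rows are all hypothetical.

```python
import csv
import io
import sqlite3


def query_csv(csv_text, sql, table="data"):
    """Load CSV text into an in-memory SQLite table and run one SQL query.

    Hypothetical helper for illustration; the first CSV row is treated as
    the header, and every value is stored as text.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]

    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f'"{name}"' for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', body)

    result = conn.execute(sql).fetchall()
    conn.close()
    return result


# Made-up sample data, not taken from any real dataset.
SAMPLE = "name,amount\nalice,10\nbob,25\nalice,5\n"

totals = query_csv(
    SAMPLE,
    "SELECT name, SUM(CAST(amount AS INTEGER)) FROM data "
    "GROUP BY name ORDER BY name",
)
print(totals)
```

Because values are loaded as text, numeric columns need an explicit `CAST` before aggregation, as in the query above.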

Generating fake data

Privacy / compliance