One ETL tool to rule them all
Updated May 29, 2024 - Python
Simplified ETL process in Hadoop using Apache Spark, with a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
A framework for moving data into a data warehouse.
Singer (ETL) Pipedrive playground (with Redash (Data Visualization))
Source code and test material for developing ETL components for use in SD2E
Northwind OLTP ETL Package using SSIS
Extract, Transform & Load analytical workflow for INEGI data on deaths (defunciones), year 2012.
Customisable ETL utility to validate, filter and merge CSV files. Out of the box, it merges files from the Google COVID-19 repository while checking the input data for errors and inconsistencies.
Phone-Matchup: a phone prediction model that uses an ETL pipeline for data extraction.
Import data from GitLab to PostgreSQL with singer tap-gitlab
Fraud detection on mobile banking transactions
Application for managing your different data sources. Still in alpha; please be patient.
Project that uses Pandas to create multiple DataFrames from CSV files containing Disneyland reviews and chocolate reviews. The DataFrames are cleaned, then loaded into PostgreSQL to create a relational database that joins everything together.
Functions to deal with "dirty" data.