A principled library for tuning, training and evaluating tabular data synthesis on fidelity, privacy and utility.
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
Open source data anonymization and synthetic data orchestration for developers. Create high fidelity synthetic data and sync it across your environments.
Project at Harvard (HMS and MGH) for Deep Learning-powered WMH quantification. Please refer to the official website for more recent information.
Open-source version of the TDspora synthetic data generation algorithm.
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) can generate large simulated or synthetic data sets for tests, POCs, and other uses in Databricks environments, including Delta Live Tables pipelines.
Generative modeling of synthetic time series data and time series augmentations
A curated list of data for reasoning AI
Example scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask and synthesize PII in text.
Synthesizer - code for creating synthetic astrophysical spectra
Synthetic Patient Population Simulator
⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency.
Software for evaluating the quality of synthetic data compared with real data.
Simple interface to synthesize complex and highly dimensional datasets using Gretel APIs.
Benchmarking synthetic data generation methods.
Library and CLI for randomly generating medical data like you might get out of an Electronic Health Records (EHR) system.
The Gretel Python Client allows you to interact with the Gretel REST API.
PostgreSQL database anonymization tool
Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.
[CVPR 2024] Official code for EgoGen: An Egocentric Synthetic Data Generator