#Computer Science# Code for Machine Learning for Algorithmic Trading, 2nd edition.
#Large Language Models# Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
#Computer Science# The easiest tool for fine-tuning LLMs, generating synthetic data, and collaborating on datasets.
Open source data security platform for developers to monitor and detect PII, anonymize production data, and sync it across environments.
A procedural Blender pipeline for photorealistic training image generation
#Computer Science# Synthetic data generation for tabular data
Python Library for Causal and Probabilistic Modeling using Bayesian Networks
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Synthetic Patient Population Simulator
#Large Language Models# SDG is a specialized framework designed to generate high-quality structured tabular data.
#Computer Science# UnrealCV: Connecting Computer Vision to Unreal Engine
#Computer Science# Synthetic data generators for tabular and time-series data
PostgreSQL database anonymization and synthetic data generation tool
#Natural Language Processing# Synthetic data curation for post-training and structured data extraction
The Declarative Data Generator
Conditional GAN for generating synthetic tabular data.
A framework for comprehensive diagnosis and optimization of agents using simulated, realistic synthetic interactions
#Natural Language Processing# DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
#Large Language Models# A lightweight library for generating synthetic instruction-tuning datasets from your own data without GPT.