Open Source Data Security Platform for Developers to Monitor and Detect PII, Anonymize Production Data and Sync it across environments.
#Computer Science# Synthetic data generation for tabular data
Conditional GAN for generating synthetic tabular data.
#Computer Science# A library to model multivariate data using copulas.
#Computer Science# Synthetic Data SDK ✨
#Computer Science# Genalog is an open source, cross-platform Python package for generating synthetic document images with custom degradations and text alignment capabilities.
#Computer Science# Unity's privacy-preserving human-centric synthetic data generator
#Data Warehouse# [IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions
#Computer Science# Synthetic Data Generation for mixed-type, multivariate time series.
INGenious Playwright Studio
#Computer Science# (SIGCOMM '22) Practical GAN-based Synthetic IP Header Trace Generation using NetShare
Synthetic Data Engine 💎
Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs
[TMLR] GraphMaker: Can Diffusion Models Generate Large Attributed Graphs?
#Natural Language Processing# This tool automatically generates grammatically valid synthetic code-mixed data using linguistic theories such as Equivalence Constraint Theory and Matrix Language Theory.
A toolset for testing data classification engines by generating mock data in various file formats, sizes, and data profiles.
#Large Language Models# [ACL 2024 Findings] This is the code for our paper "Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models".
#Computer Science# Unity's Privacy-Preserving Novel Human Body Model Trained Solely on Synthetic Data and Corresponding Dense Anthropometric Measurements
#Computer Science# Flow Matching implemented in PyTorch
#Computer Science# Codebase for "Generating multivariate time series with COmmon Source CoordInated GAN (COSCI-GAN)"