GitHub Chinese Community
synthetic-data

stefan-jansen / machine-learning-for-trading

#Computer Science# Code for Machine Learning for Algorithmic Trading, 2nd edition.

machine learning, trading, investment, finance, data science, investment-strategies, artificial intelligence, trading-strategies, deep learning, synthetic-data
Jupyter Notebook 15.36 k
1 year ago
modelscope / data-juicer

#Large Language Model# Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

data analysis, data science, large-language-models, large language model, data visualization, instruction-tuning, pre-training, multi-modal, synthetic-data, data, data-pipeline, data-processing, foundation-models
Python 4.9 k
4 hours ago
lk-geimfari / mimesis

Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.

mimesis, fake, data, Generator, fixtures, dummy, schema, Testing, Python, mock, synthetic-data, datascience, dataframe, pandas, polars, pytest-plugin, factory
Python 4.6 k
14 days ago
Kiln-AI / Kiln

#Computer Science# The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.

artificial intelligence, chain-of-thought, collaboration, dataset-generation, fine-tuning, machine learning, macOS, ollama, openai, prompt, prompt-engineering, Python, rlhf, synthetic-data, Windows, evals, evaluation
Python 4.01 k
11 hours ago
nucleuscloud / neosync

Open Source Data Security Platform for Developers to Monitor and Detect PII, Anonymize Production Data and Sync it across environments.

Docker, Go, synthetic-data, benthos, etl, orchestration, Testing, TypeScript, Open Source, React, self-hosted, Kubernetes, Next, test-data-generator, fine-tuning, synthetic-data-generation, faker, MySQL, PostgreSQL
Go 3.9 k
13 hours ago
DLR-RM / BlenderProc

A procedural Blender pipeline for photorealistic training image generation

blender-pipeline, segmentation, depth-images, camera-positions, suncg-scene, camera-sampling, blender-installation, synthetic, blender, rendering, pose-estimation, synthetic-data, Python, 3D, computer-graphics, 3d-reconstruction
Python 3.17 k
3 days ago
sdv-dev / SDV

#Computer Science# Synthetic data generation for tabular data

synthetic-data, machine learning, relational-datasets, multi-table, time-series, synthetic-data-generation, sdv, data-generation, Generative Adversarial Network, gans, deep learning, generative-ai, generative-model, generativeai
Python 3.09 k
13 hours ago
pgmpy / pgmpy

Python Library for Causal and Probabilistic Modeling using Bayesian Networks

Python, bayesian-networks, causal-inference, causal-discovery, causal-identification, causal-models, probabilistic-inference, mixed-data, synthetic-data, causal-validation, Simulation
Python 3.01 k
2 days ago
argilla-io / distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

artificial intelligence, huggingface, large language model, openai, Python, rlhf, synthetic-data, synthetic-dataset-generation
Python 2.82 k
3 days ago
synthetichealth / synthea

Synthetic Patient Population Simulator

synthea, synthetic-data, synthetic-population, fhir, health-data, Simulation
Java 2.64 k
2 days ago
hitsz-ids / synthetic-data-generator

#Large Language Model# SDG is a specialized framework designed to generate high-quality structured tabular data.

deep learning, Generative Adversarial Network, generative-ai, machine learning, privacy, synthetic-data, tabular-data, agent, data-generator, large language model
Python 2.37 k
10 days ago
unrealcv / unrealcv

#Computer Science# UnrealCV: Connecting Computer Vision to Unreal Engine

virtual-worlds, machine vision, Unreal Engine, embodied-ai, machine learning, Simulation, synthetic-data
C++ 2.03 k
2 months ago
ydataai / ydata-synthetic

#Computer Science# Synthetic data generators for tabular and time-series data

Generative Adversarial Network, deep learning, synthetic-data, tensorflow2, machine learning, training-data, Python, timeseries, gans, PyTorch, time-series
Jupyter Notebook 1.56 k
8 days ago
GreenmaskIO / greenmask

PostgreSQL database anonymization and synthetic data generation tool

Go, obfuscation, PostgreSQL, security, staging, anonymization, synthetic-data, artificial intelligence, pg, security-automation, Testing, qa, dba, faker, database, automation, transformers, data-masking
Go 1.47 k
10 days ago
bespokelabsai / curator

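#Natural Language Processing# Synthetic data curation for post-training and structured data extraction

synthetic-data, agents, large language model, prompt, Python, synthetic-dataset-generation, deep learning, fine-tuning, instruction-tuning, machine learning, natural language processing
Python 1.46 k
21 days ago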
shuttle-hq / synth

The Declarative Data Generator

Rust, test-data-generator, synthetic-data, JSON, data-generation, PostgreSQL, Hacktoberfest
Rust 1.44 k
10 months ago
sdv-dev / CTGAN

Conditional GAN for generating synthetic tabular data.

synthetic-data, Generative Adversarial Network, tabular-data, data-generation, synthetic-data-generation
Python 1.43 k
3 days ago
plurai-ai / intellagent

A framework for comprehensive diagnosis and optimization of agents using simulated, realistic synthetic interactions

agent, evaluation, llmops, simulator, synthetic-data
Python 1.1 k
1 month ago
datadreamer-dev / DataDreamer

#Natural Language Processing# DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤

deep learning, machine learning, natural language processing, nlp-library, Python, PyTorch, transformers, alignment, fine-tuning, gpt, instruction-tuning, large language model, llmops, openai, synthetic-data, synthetic-dataset-generation
Python 1.04 k
6 months ago
BatsResearch / bonito

#Large Language Model# A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.

large language model, synthetic-data, synthetic-dataset-generation, zero-shot-learning, domain-adaptation, gpt, task-adaptation
Python 783
16 days ago