GitHub Chinese Community

#data-quality

GokuMohandas / Made-With-ML

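#Natural Language Processing# Learn how to design, develop, deploy, and iterate on production-grade machine learning applications

machine learning, deep learning, PyTorch, natural language processing, data science, Python, mlops, data-engineering, data-quality, large language models, ray, distributed-training
Jupyter Notebook 38.91 k
10 months ago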
eugeneyan / applied-ml

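#Natural Language Processing# Curated papers, tech blogs, and other resources from large companies on data science & machine learning in production

applied-machine-learning, production, applied-data-science, machine learning, data science, reinforcement-learning, data-engineering, recsys, search, deep learning, data-quality, data-discovery, computer vision, natural language processing
28.03 k
1 year ago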
ydataai / ydata-profiling

#Computer Science# 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

pandas-profiling, pandas-dataframe, statistics, Jupyter Notebook, exploration, data science, Python, pandas, machine learning, deep learning, exploratory-data-analysis, eda, data-quality, html-report, data-exploration, data analysis, big-data-analytics, data-profiling, Hacktoberfest
Python 12.98 k
5 days ago
cleanlab / cleanlab

#Data Warehouse# The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

weak-supervision, data-cleaning, data-quality, data science, noisy-labels, data-centric-ai, out-of-distribution-detection, outlier-detection, active-learning, data-labeling, data-profiling, data-validation, labeling, data-curation, annotation, DataOps, dataquality, large language models, datasets, exploratory-data-analysis
Python 10.61 k
12 days ago
great-expectations / great_expectations

Always know what to expect from your data.

pipeline-tests, dataquality, datacleaning, datacleaner, data science, data-profiling, pipeline, pipeline-testing, cleandata, dataunittest, eda, exploratory-data-analysis, data-quality, data-engineering, mlops
Python 10.48 k
2 days ago
voxel51 / fiftyone

#Computer Science# Refine high-quality datasets and visual AI models

machine learning, artificial intelligence, deep learning, computer vision, developer-tools, data science, Python, active-learning, data-centric-ai, data-cleaning, data-curation, data-quality, image-classification, object-detection, unstructured-data, vector-search, visualization
Python 9.59 k
5 hours ago
open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team colla...

metadata, datadiscovery, data science, dataquality, data-profiling, metadata-management, dataengineering, data-catalog, data-observability, dbt, data-discovery, data-contracts, data-governance, data-lineage, data-validation, snowflake, data-quality, data-quality-checks, data-collaboration
TypeScript 6.89 k
15 hours ago
evidentlyai / evidently

#Large Language Models# Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.

data-drift, Jupyter Notebook, pandas-dataframe, machine learning, model-monitoring, html-report, mlops, data science, Hacktoberfest, data-quality, data-validation, generative-ai, large language models, llmops
Jupyter Notebook 6.28 k
2 days ago
feast-dev / feast

#Computer Science# The Open Source Feature Store for AI/ML

machine learning, features, big-data, feature-store, Python, mlops, data-engineering, data science, data-quality
Python 6.14 k
4 days ago
treeverse / lakeFS

lakeFS - Data version control for your data lake | Git for data

data-engineering, data-versioning, Go, object-storage, data-lake, aws-s3, data-quality, azure-blob-storage, google-cloud-storage, git-for-data, Apache Spark, hadoop-filesystem, datalake, data-version-control, azure-storage
Go 4.72 k
5 hours ago
GokuMohandas / mlops-course

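#Natural Language Processing# Learn how to design, develop, deploy and iterate on production-grade ML applications.

machine learning, deep learning, PyTorch, mlops, data-engineering, data-quality, data science, large language models, natural language processing, Python, ray
Jupyter Notebook 3.13 k
10 months ago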
datafold / data-diff

Compare tables within or across databases

database, MySQL, PostgreSQL, snowflake, rdbms, trino, data-engineering, data-quality, data science, data-quality-monitoring, dataengineering, dataquality, Oracle Database, SQL, dbt, Python, data
Python 2.97 k
1 year ago
whylabs / whylogs

#Computer Science# An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collect...

ai-pipelines, approximate-statistics, statistical-properties, data-quality, calculate-statistics, Python, Logging, mlops, DataOps, ml-pipelines, data-pipeline, dataset, machine learning, data science, analytics, constraints
Jupyter Notebook 2.72 k
5 months ago
sodadata / soda-core

⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io

Python, data-engineering, data-governance, data-monitoring, data-observability, data-profiling, data-quality, data-quality-checks, data-quality-monitoring, data-reliability, data-testing, data-validation, dataquality, dbt, pipeline-testing, snowflake, data-contracts
Python 2.11 k
3 days ago
featureform / featureform

#Computer Science# The Virtual Feature Store. Turn your existing data infrastructure into a feature store.

machine learning, data science, vector-database, embeddings-similarity, embeddings, Hacktoberfest, feature-store, mlops, data-quality, feature-engineering, Python
Go 1.91 k
1 month ago
feathr-ai / feathr

#Computer Science# Feathr – A scalable, unified data and AI engineering platform for enterprise

feature-engineering, feature-store, artificial intelligence, mlops, data-engineering, data-quality, machine learning, Apache Spark, Azure, data science, feature-management
Scala 1.9 k
1 year ago
re-data / re-data

re_data - fix data issues before your users & CEO would discover them 😊

data-monitoring, data analysis, data-quality, data-quality-monitoring, open-source-tooling, data-observability, dataquality, data-testing, data-quality-checks, dbt, data-reliability
HTML 1.56 k
1 year ago
opendatadiscovery / odd-platform

The first open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.

Open Source, data-platform, metadata, metadata-management, data-pipelines, data-engineering, observability, data-catalog, data-discovery, data-lineage, bigdata, alerting, lineage, data-profiling, data-exploration, data-governance, data-quality, data science, data-observability
Java 1.33 k
4 months ago
daochenzha / data-centric-AI

#Computer Science# A curated, but incomplete, list of data-centric AI resources.

artificial intelligence, data-centric-ai, machine learning, data-curation, data-centric, data-centric-machine-learning, data science, data-quality, data-engineering
1.11 k
1 year ago
cleanlab / cleanvision

#Computer Science# Automatically find issues in image datasets and practice data-centric computer vision.

computer vision, data-centric-ai, data-exploration, data-quality, data-validation, deep learning, exploratory-data-analysis, image-analysis, image-classification, image-generation, image-quality, image-segmentation, data-profiling, data science
Python 1.09 k
2 months ago