GitHub Chinese Community
#dataquality

cleanlab / cleanlab

#data warehouse# The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

weak-supervision, data-cleaning, data-quality, data science, noisy-labels, data-centric-ai, out-of-distribution-detection, outlier-detection, active-learning, data-labeling, data-profiling, data-validation, labeling, data-curation, annotation, DataOps, dataquality, large language models, datasets, exploratory-data-analysis
Python 10.61 k
12 days ago
great-expectations / great_expectations

Always know what to expect from your data.

pipeline-tests, dataquality, datacleaning, datacleaner, data science, data-profiling, pipeline, pipeline-testing, cleandata, dataunittest, eda, exploratory-data-analysis, data-quality, data-engineering, mlops
Python 10.48 k
2 days ago
open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team colla...

metadata, datadiscovery, data science, dataquality, data-profiling, metadata-management, dataengineering, data-catalog, data-observability, dbt, data-discovery, data-contracts, data-governance, data-lineage, data-validation, snowflake, data-quality, data-quality-checks, data-collaboration
TypeScript 6.9 k
6 hours ago
awslabs / deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

dataquality, Apache Spark, Unit testing, Scala
Scala 3.43 k
2 months ago
datafold / data-diff

Compare tables within or across databases

database, MySQL, PostgreSQL, snowflake, rdbms, trino, data-engineering, data-quality, data science, data-quality-monitoring, dataengineering, dataquality, Oracle Database, SQL, dbt, Python, data
Python 2.97 k
1 year ago
sodadata / soda-core

⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io

Python, data-engineering, data-governance, data-monitoring, data-observability, data-profiling, data-quality, data-quality-checks, data-quality-monitoring, data-reliability, data-testing, data-validation, dataquality, dbt, pipeline-testing, snowflake, data-contracts
Python 2.11 k
4 days ago
re-data / re-data

re_data - fix data issues before your users & CEO would discover them 😊

data-monitoring, data analysis, data-quality, data-quality-monitoring, open-source-tooling, data-observability, dataquality, data-testing, data-quality-checks, dbt, data-reliability
HTML 1.56 k
1 year ago
zinggAI / zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

fuzzymatch, fuzzy-matching, Entity resolution, dedupe, masterdata, dataengineering, data science, Apache Spark, machine learning, dataquality, analytics, datalake, master-data-management, customer-data-platform, databricks, snowflake, cdp, mdm
Java 1.04 k
3 days ago
chaos-genius / chaos_genius

#computer science# ML powered analytics engine for outlier detection and root cause analysis.

anomaly-detection, business-intelligence, analytics, machine learning, observability, monitoring-tool, data visualization, dataquality, monitoring, artificial intelligence, seasonality, outlier-detection, alert, time-series, deep learning, Python, Hacktoberfest
Python 758
9 months ago
datacleaner / DataCleaner

The premier open source Data Quality solution

data, dataquality, database, Desktop, datacleaner, mdm, etl, data analysis, data science, profiling
Java 632
18 days ago
datavane / datavines

Know your data better! Datavines is a next-gen data observability platform that supports metadata management and data quality.

dataquality, datascience, doris, Apache Spark, metadata, cleandata, data-engineering, data-profiling, data-quality, data-quality-checks, data-quality-monitoring, data science, flink
Java 624
1 month ago
IBM / lale

#computer science# Library for Semi-Automated Data Science

scikit-learn, automl, automated-machine-learning, hyperparameter-optimization, hyperparameter-tuning, hyperparameter-search, Python, artificial intelligence, pipeline-tests, pipeline-testing, dataquality, data science, machine learning, interoperability
Python 339
1 month ago
canimus / cuallee

Possibly the fastest DataFrame-agnostic quality check library in town.

bigdata, performance-metrics, pyspark, Python, Unit testing, pandas, dataquality, data-quality, data-quality-checks
Python 192
6 days ago
DataEval / dingo

#large language models# Dingo: A Comprehensive Data Quality Evaluation Tool

data-quality, data science, data-validation, gpt, large language models, Apache Spark, vlm, dataquality, datascience, openai, deepseek
JavaScript 173
5 days ago
datachecks / dcs-core

Open Source Data Quality Monitoring.

data-engineering, data-validation, DataOps, dataquality, monitoring, mlops, PostgreSQL, Python, data-governance, data-observability, database, SQL, data-quality-monitoring, elasticsearch, MySQL, etl
Python 156
4 days ago
OSMCha / osmcha-frontend

Frontend for the osmcha-django REST API

dataquality, OpenStreetMap, osmqa
JavaScript 137
10 days ago
AutoViML / pandas_dq

#computer science# Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer.

data, data science, dataquality, machine learning, pandas, Python, scikit-learn
Python 130
2 years ago
DataKitchen / data-observability-installer

Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility acr...

data, data-engineering, data-observability, data-profiling, data-quality, data science, datacleaner, datacleaning, DataOps, dataquality, sql-server, pipeline-tests, PostgreSQL, redshift, self-hosted, snowflake, data-reliability
Python 118
5 days ago
DataKitchen / dataops-testgen

DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution by data profiling, new dataset...

data, data-engineering, data-observability, data-quality, data science, data-testing, dataquality, self-hosted, DataOps, sql-server, PostgreSQL, Python, redshift, snowflake
Python 57
18 days ago
schic / DQCS

A data quality control system with embedded AI

data, etl, dataquality, database
Java 47
4 years ago