GitHub Chinese Community
©2025 GitHub Chinese Community

datacleaning

OpenRefine / OpenRefine

#Data Warehouse# OpenRefine (formerly Google Refine) is a powerful tool for cleaning and transforming data

datacleansing, data analysis, Java, Open Data, wikidata, journalism, data science, datajournalism, datacleaning, datamining, reconciliation, data-wrangling
Java 11.45k
1 day ago
great-expectations / great_expectations

Always know what to expect from your data.

pipeline-tests, dataquality, datacleaning, datacleaner, data science, data-profiling, pipeline, pipeline-testing, cleandata, dataunittest, data-unit-tests, eda, exploratory-data-analysis, exploratory-analysis, exploratorydataanalysis, data-quality, data-engineering, pipeline-debt, data-profilers, mlops
Python 10.61k
7 hours ago
sfu-db / dataprep

Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.

dataprep, data science, datapreparation, dataconnector, eda, exploratory-data-analysis, data-exploration, connector, cleaning, datacleaning, apiwrapper, apis
Python 2.19k
1 year ago
yobulkdev / yobulkdev

🔥 🔥 🔥 Open Source & AI-driven Data Onboarding Platform: free flatfile.com alternative

csv-parser, MongoDB, Open Source, streaming, data-engineering, csv-import, csv-reader, Next, Node.js, stream, React, JavaScript, embeddable, languagemodel, datacleaning
JavaScript 897
2 years ago
DataCanvasIO / HyperGBM

A full pipeline AutoML tool for tabular data

automl, gbm, xgboost, lightgbm, catboost, semi-supervised-learning, datacleaning, preprocessing, ensemble-learning, tabular-data, distributed-training, dask, gpu-acceleration, rapidsai, scikit-learn
Python 353
3 months ago
sharmaroshan / Twitter-Sentiment-Analysis

#Natural Language Processing# A natural language processing problem in which sentiment analysis is done by classifying positive tweets from negative tweets using machine learning models for classification, text mining, text a...

natural language processing, sentiment-analysis, data analysis, bag-of-words, data visualization, eda, machine learning, classification, cross-validation, evaluation-metrics, wordcloud, hashtags, datacleaning
Jupyter Notebook 249
2 years ago
DataKitchen / data-observability-installer

Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility acr...

data, data-engineering, data-observability, data-profiling, data-quality, data science, datacleaner, datacleaning, DataOps, dataquality, sql-server, pipeline-tests, PostgreSQL, redshift, self-hosted, snowflake, data-reliability
Python 124
2 days ago
imdevskp / covid_19_jhu_data_web_scrap_and_cleaning

This repository contains data and code used to get and clean data from https://github.com/CSSEGISandData/COVID-19 and https://www.worldometers.info/coronavirus/

COVID-19, webscraping, datacleaning, Python, pandas, pandemic, epidemics, outbreak
Jupyter Notebook 99
5 years ago
prasanthg3 / cleantext

#Natural Language Processing# An open-source package for Python to clean raw text data

Python, natural language processing, datacleaning
Python 70
2 years ago
benchopt / benchmark_bilevel

Benchmark for bi-level optimization solvers

datacleaning, hyperparameter-optimization
Python 48
2 months ago
imdevskp / covid-19-india-data

Data and code for scraping and cleaning data on COVID-19 in India from https://www.mohfw.gov.in/ and https://www.covid19india.org/

india, webscraping, datacleaning, Python, Jupyter Notebook, data, COVID-19, pandas
Jupyter Notebook 41
5 years ago
data-cleaning / validatedb

Validate on a table in a DB, using dbplyr

validation, database, datacleaning
R 33
3 years ago
DemonDamon / tongdaxin-futures-data-clearing-database-operation

Deduplicates and cleans Tongdaxin (通达信) data, then stores it in MongoDB for later research

datacleaning, pymongo
Python 27
7 years ago
sayaliwalke30 / Kaggle-Projects

#Computer Science# This repo contains 4 different projects. Built various machine learning models for Kaggle competitions. Also carried out Exploratory Data Analysis, Data Cleaning, Data Visualization, Data Munging, Fe...

kaggle-competition, machine learning, data science, data analysis, datacleaning, datavisualization, kaggle, exploratory-data-analysis
Jupyter Notebook 26
4 years ago
RonKG / Machine-Learning-Projects-2

#Computer Science#

machine learning, natural language processing, time-series, data, datacleaning, naive-bayes-classifier, folium, plotly, Portfolio
HTML 25
7 years ago
nirala96 / Bangalore-House-Prediction-App

Predicts home prices of Bangalore. Used Flutter, Flask and Jupyter Notebook.

Flutter, flask-api, Jupyter Notebook, Python, exploratory-data-analysis, linear-regression, data science, datacleaning
Jupyter Notebook 18
4 years ago
hoshigan / Supply-Chain-Analytic---Just-In-Time-Company

The project provides a real-world dataset focusing on supply chain analytics

data, datacleaning, exploratory-data-analysis, segmentation, supply-chain-management, Python
Jupyter Notebook 18
2 years ago
weismanm12 / finances-database

Personal finance database creation, SQL analysis, and Power BI dashboard

dashboard, database-schema, datacleaning, datavisualization, etl, MySQL, pandas, powerbi, Python, SQL, sqlalchemy
Jupyter Notebook 17
2 years ago
Anubhavchandil / RESEARCH-INTERN

#Computer Science# Worked on a dataset of high entropy alloys which is used to design materials for additive manufacturing. Being responsible for Performing Data Analysis and constructing Machine learning algorithms, in...

artificial intelligence, data science, datacleaning, datavisualization, deep learning, deep neural networks, machine learning, msexcel, Python, research-project
Jupyter Notebook 17
3 years ago
ShrishtiHore / Weapons-Detection-in-Real-Time-Surveillance-Videos

#Computer Science# This project aims to minimize the police response time by detecting weapons through a live CCTV camera feed. So it alerts the police as soon as it detects any sort of weapons. In our project we are fo...

machine vision, machine learning, Python, Jupyter Notebook, Tensorflow, object-detection, data, datacleaning, training, tensorboard, ssd-mobilenet
Jupyter Notebook 16
4 years ago