#Data warehouse#OpenRefine (formerly Google Refine) is a powerful data cleaning and transformation tool
Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.
#Computer science#A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep
#Data warehouse#Blazing-fast Data-Wrangling toolkit
#Computer science#Carefully curated resource links for data science in one place
#Large language models#ETL, Analytics, Versioning for Unstructured Data
Zui is a powerful desktop application for exploring and working with data. The official front-end to the Zed lake.
#Time-series database#A Python toolbox for gaining geometric insights into high-dimensional data
#Computer science#🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
#Computer science#Machine learning with dataframes
The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
Statistical Inference via Data Science: A ModernDive into R and the Tidyverse
#Computer science#Materials for following along with Hands-On Data Analysis with Pandas – Second Edition
Microsoft Program Synthesis using Examples SDK is a framework of technologies for the automatic generation of programs from input-output examples. This repo includes samples and sample data for the Mi...
Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algor...
#Computer science#Materials for following along with Hands-On Data Analysis with Pandas.
An introductory workshop on pandas with notebooks and exercises for following along. Slides contain all solutions.
Data Analysis and Visualization in R for Ecologists
A package that processes and organizes data from Brazil's National Registry of Legal Entities (Cadastro Nacional da Pessoa Jurídica, CNPJ)
Like awk, but with SQL and table joins