GitHub Chinese Community
#parquet

https://static.github-zh.com/github_avatars/uber?size=40
uber / petastorm

#Computer Science# The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, a...

Tensorflow, PyTorch, deep learning, machine learning, pyspark, parquet
Python 1.86 k
1 month ago
https://static.github-zh.com/github_avatars/developmentseed?size=40
developmentseed / lonboard

A Python library for fast, interactive geospatial vector data visualization in Jupyter.

anywidget, apache-arrow, deck-gl, geopandas, Jupyter Notebook, data visualization, geospatial-analysis, maps, Python, visualization, webgl, apache-parquet, parquet, geospatial
Python 818
5 days ago
https://static.github-zh.com/github_avatars/HariSekhon?size=40
HariSekhon / DevOps-Python-tools

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML...

cloudformation, Python, hbase, JSON, avro, parquet, Apache Spark, pyspark, travis-ci, elasticsearch, solr, hadoop, hdfs, dockerhub, Docker, Linux, Amazon Web Services, DevOps, Google Cloud, gcf
Python 806
5 months ago
https://static.github-zh.com/github_avatars/ranaroussi?size=40
ranaroussi / pystore

Fast data store for Pandas time-series data

datastore, dask, parquet, pandas, timeseries, database, dataframe
Python 592
2 months ago
https://static.github-zh.com/github_avatars/moshe?size=40
moshe / elasticsearch_loader

A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch

Python, elasticsearch, parquet, CSV, JSON, logstash
Python 401
3 years ago
https://static.github-zh.com/github_avatars/grai-io?size=40
grai-io / grai-core

Hacktoberfest, data, dbt, Open Source, PostgreSQL, Python, sql-server, MySQL, snowflake, data science, dataengineering, Django, parquet, redshift, data-lineage
Python 308
2 days ago
https://static.github-zh.com/github_avatars/awslabs?size=40
awslabs / amazon-s3-find-and-forget

Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR)

data-lake, amazon-s3, s3, gdpr, Amazon Web Services, parquet, ccpa, big-data, privacy, data
Python 242
2 months ago
https://static.github-zh.com/github_avatars/scikit-hep?size=40
scikit-hep / awkward-0.x

Manipulate arrays of complex data structures as easily as Numpy.

Python, NumPy, big-data, analysis, columnar, columnar-storage, apache-arrow, arrow, parquet, hdf5, root, root-cern
Python 213
5 years ago
https://static.github-zh.com/github_avatars/JDASoftwareGroup?size=40
JDASoftwareGroup / kartothek

A consistent table management library in python

Python, pydata, dask, arrow, parquet
Python 160
2 years ago
https://static.github-zh.com/github_avatars/atlaslib?size=40
atlaslib / atlas

Atlas lets you explore your Apple Health data

Apple, apple-health, data visualization, parquet
Python 109
1 year ago
https://static.github-zh.com/github_avatars/Youssef-Harby?size=40
Youssef-Harby / OvertureMapsDownloader

Overture Maps Downloader simplifies geospatial data manipulation by integrating the powerful DuckDB, Dask DataFrames, and GDAL/OGR open source tools.

gdal, geospatial, parquet, duckdb
Python 99
1 year ago
https://static.github-zh.com/github_avatars/InfluxCommunity?size=40
InfluxCommunity / influxdb3-python

Python module that provides a simple and convenient way to interact with InfluxDB 3.0.

apache, arrow, influxdb, Python, SQL, parquet
Python 88
10 days ago
https://static.github-zh.com/github_avatars/dacort?size=40
dacort / faker-cli

Command-line interface to quickly generate fake CSV and JSON data

Amazon Web Services, CSV, JSON, deltalake, parquet
Python 76
1 year ago
https://static.github-zh.com/github_avatars/apache?size=40
apache / parquet-testing

Apache Parquet Testing

parquet, apache
Python 69
23 days ago
https://static.github-zh.com/github_avatars/cldellow?size=40
cldellow / csv2parquet

Convert a CSV to a parquet file.

parquet, CSV, apache-arrow, apache-parquet
Python 64
3 years ago
https://static.github-zh.com/github_avatars/zsvoboda?size=40
zsvoboda / dbd

dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.

elt, SQL, database, database-schemas, etl, CSV, JSON, parquet, xlsx, xls, excel, Python, PostgreSQL, MySQL, SQLite, snowflake, BigQuery, redshift
Python 57
4 years ago
https://static.github-zh.com/github_avatars/cldellow?size=40
cldellow / datasette-parquet

Add DuckDB, Parquet, CSV and JSON lines support to Datasette

datasette, duckdb, parquet
Python 54
1 year ago
https://static.github-zh.com/github_avatars/dask-contrib?size=40
dask-contrib / dask-deltatable

A Delta Lake reader for Dask

dask, delta-lake, parquet, Python
Python 53
1 month ago
https://static.github-zh.com/github_avatars/datacoon?size=40
datacoon / undatum

undatum: a command-line tool for data processing. Brings CSV simplicity to NDJSON, BSON, XML and other data files

bson, jsonl, JSON, CSV, command-line interface, data, dataset, parquet
Python 48
1 month ago