The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, a...
A Python library for fast, interactive geospatial vector data visualization in Jupyter.
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML...
Fast data store for Pandas time-series data
A tool for batch-loading data files (JSON, Parquet, CSV, TSV) into Elasticsearch
Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR)
Manipulate arrays of complex data structures as easily as NumPy arrays.
Overture Maps Downloader simplifies geospatial data manipulation by integrating the powerful DuckDB, Dask DataFrames, and GDAL/OGR open source tools.
Command-line interface to quickly generate fake CSV and JSON data
Add DuckDB, Parquet, CSV, and JSON Lines support to Datasette