massive-datasets · GitHub Topics

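GalaxySQL is the compute node of PolarDB-X. PolarDB-X is a cloud-native distributed database system designed for ultra-high concurrency, massive storage, and complex query scenarios. It uses a shared-nothing architecture with separated storage and compute, supports horizontal scaling, distributed transactions, and mixed workloads, and is enterprise-grade, cloud-native, highly available, and highly compatible with the MySQL system and ecosystem.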

horizontal-scaling distributed-transactions htap enterprise-class cloud-native high-availability MySQL high-concurrency massive-datasets relational-database

Java 1.62 k

19 days ago

helmholtz-analytics / heat

#Computer Science# Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python

gpu tensors distributed machine-learning mpi NumPy Python PyTorch data-analytics data-processing data-science hpc massive-datasets parallelism

Python 223

3 days ago

polardb / polardbx

PolarDB-X is a cloud native distributed SQL Database designed for high concurrency, massive storage, complex querying scenarios.

MySQL distributed-transactions cloud-native high-availability relational-databases high-concurrency massive-datasets htap horizontal-scaling enterprise-class

Makefile 83

17 days ago

joshuaboud / gen-dataset

Command line tool to quickly generate a lot of files in a lot of directories

Linux dataset dataset-generation benchmarking massive-datasets cli-tool multithreading evaluation

C++ 6

4 years ago

rajeshidumalla / Bloom-Filter

#Computer Science# Building a Bloom Filter on English dictionary words

bloom-filter massive-datasets Python data-science machine-learning data-analysis

Jupyter Notebook 4

4 years ago

FedericoBruzzone / anti-money-laundering

#Computer Science# The project is based on the analysis of the "IBM Transactions for Anti Money Laundering" dataset published on Kaggle. The task is to implement a model which predicts whether or not a transaction is il...

machine-learning massive-datasets pyspark

Jupyter Notebook 3

1 year ago

FedericoBruzzone / algorithms-for-massive-datasets

#Algorithm Practice# This repository contains a LaTeX file that generates a PDF document comprising comprehensive notes for the course "Algorithms for Massive Datasets"

algorithms deep-learning massive-datasets recommender-system

TeX 2

1 year ago

gmalik9 / floating_point_data_compressor

gipa: a compression/decompression tool to package, compress, and encode massive archive files with floating-point data

compression compressor autoencoder floating-point massive-datasets data-visualization data-compression representation representation-learning

Python 2

8 years ago

datakaveri / k-anonymisation-SKALD

Scalable, chunk-wise K-anonymization tool based on the Optimal Lattice Anonymization (OLA) algorithm. It is designed to handle large datasets by processing them in manageable chunks, ensuring data pri...

chunking encoding massive-datasets

Python 2

1 month ago

rajeshidumalla / PageRank

#Computer Science# Building the PageRank algorithm on the web graph around Stanford.edu using the NetworkX Python library

pagerank-algorithm machine-learning massive-datasets data-analysis data-science Python Apache Spark pandas NumPy

Jupyter Notebook 2

4 years ago

StefanoBalbo / Geocoding

Automated massive geolocator of addresses with parallel processing.

Docker geocoding geolocation geopandas geospatial Jupyter Notebook jupyterlab massive-datasets massively-parallel nominatim osm Python spatial-analysis ssh-server

Jupyter Notebook 1

6 months ago

Alex4gtx / Massive-Data-Handler

Lets you open and manipulate massive text/data files that would otherwise be impossible to open on a computer, for example a text file of 20+ GB; it lets you manipulate the file by taking only the lines ...

big-data dictionaries python-script massive-datasets

Python 1

4 years ago