GitHub Chinese Community
#big-data

binhnguyennus / awesome-scalability

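#Interview# A reading list illustrating the patterns behind scalable, reliable, and high-performance large-scale systems. The case studies are drawn from production systems serving millions to hundreds of millions of users.

system-design, backend, scalability, interview, architecture, DevOps, design-patterns, Awesome Lists, big-data, Hackathon-Kit, lists, web-development, programming, system, interview-practice, computer science, distributed-systems, machine learning
62.54 k
1 month ago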
apache / spark

Apache Spark - a unified analytics engine for big data processing

Python, Scala, R, Java, big-data, jdbc, SQL, Apache Spark
Scala 41.31 k
1 day ago
ClickHouse / ClickHouse

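#Database# ClickHouse is a high-performance columnar database suited to real-time OLAP analytics, with SQL support

dbms, olap, analytics, SQL, big-data, mpp, clickhouse, Hacktoberfest, C++, Rust, artificial intelligence, cloud-native, database, distributed, embedded, lakehouse, self-hosted
C++ 41.18 k
11 hours ago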
donnemartin / data-science-ipython-notebooks

#Computer Science# Python data science study notes: deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, Linux commands

Python, machine learning, deep learning, data science, big-data, Amazon Web Services, Tensorflow, theano, caffe, scikit-learn, kaggle, Apache Spark, mapreduce, hadoop, matplotlib, pandas, NumPy, SciPy, Keras
Python 28.28 k
1 year ago
apache / flink

Flink is a distributed batch and stream processing framework

Scala, Java, big-data, flink, Python, SQL
Java 24.95 k
2 days ago
amark / gun

#Blockchain# An open source cybersecurity protocol for syncing decentralized graph data.

machine learning, artificial intelligence, big-data, blockchain, P2P, decentralized, graph, Cryptography, crypto, offline-first, realtime, crdt, Protocol (disambiguation), database, end-to-end, encryption, dweb, dapp, web3, metaverse
JavaScript 18.46 k
2 months ago
heibaiying / BigData-Notes

A beginner's guide to big data ⭐

hadoop, hdfs, Yarn, mapreduce, hive, Apache Spark, storm, hbase, Scala, kafka, zookeeper, flume, azkaban, sqoop, phoenix, bigdata, big-data
Java 16.49 k
1 year ago
prestodb / presto

Presto is a high-performance distributed SQL query engine for big data

Java, presto, hive, hadoop, big-data, SQL, data, lakehouse, Query (disambiguation)
Java 16.38 k
2 days ago
andkret / Cookbook

The Data Engineering Cookbook

data-engineer, data-engineering, big-data, best-practices, cookbook
Python 14.34 k
4 days ago
apache / predictionio

PredictionIO, a machine learning server for developers and ML engineers.

Scala, big-data
Scala 12.53 k
4 years ago
yahoo / CMAK

CMAK is a tool for managing Apache Kafka clusters

kafka, Scala, cluster-management, big-data
Scala 11.91 k
2 years ago
trinodb / trino

Trino is a distributed SQL query engine for big data (formerly PrestoSQL)

Java, presto, hive, hadoop, big-data, SQL, prestodb, database, distributed-systems, distributed-database, data science, datalake, jdbc, query-engine, trino, analytics, delta-lake, iceberg
Java 11.43 k
17 hours ago
vesoft-inc / nebula

A distributed, fast open-source graph database featuring horizontal scalability and high availability

graph-database, distributed, database, graphdb, raft, C++, NebulaGraph, nebulagraph, nebulagraph, big-data, distributed-systems, scalability, Hacktoberfest
C++ 11.4 k
10 days ago
provectus / kafka-ui

UI for Apache Kafka. A graphical management tool for Kafka

kafka-ui, kafka-brokers, kafka, kafka-streams, kafka-client, Open Source, kafka-connect, kafka-producer, streams, big-data, apache-kafka, cluster-management, web-ui, kafka-manager, kafka-cluster, streaming-data, event-streaming, Hacktoberfest
Java 10.92 k
1 year ago
StarRocks / starrocks

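StarRocks is a next-generation, high-speed MPP (Massively Parallel Processing) database for all analytics scenarios. Its vision is to make data analysis simpler and more agile. Without complex preprocessing, users can run fast analytics across a variety of data analysis scenarios with StarRocks.

database, olap, SQL, analytics, big-data, realtime-database, vectorized, distributed-database, real-time-analytics, mpp, join, star-schema, real-time-updates, delta-lake, hudi, iceberg, lakehouse, datalake, lakehouse-platform, cloudnative
Java 10.13 k
1 day ago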
quickwit-oss / quickwit

#Search# Quickwit is a search engine for log management and analytics, an alternative to Datadog, Elasticsearch, Loki, and Tempo

Rust, log-management, logs, tantivy, cloud-native, Open Source, big-data, cloud-storage, distributed-tracing, search engine
Rust 10.08 k
3 days ago
cython / cython

#Programming Language# The most widely used Python to C compiler

Python, cython, cpython, cpython-extensions, C, C++, performance, big-data
Python 10.07 k
4 days ago
catboost / catboost

#Computer Science# A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computa...

machine learning, decision-trees, gradient-boosting, gbm, gbdt, Python, R, kaggle, gpu-computing, catboost, tutorial, categorical-features, gpu, coreml, data science, big-data, CUDA, data-mining
C++ 8.43 k
13 hours ago
apache / beam

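Apache Beam is a unified programming model for big data, used to define and execute data processing pipelines, including ETL, batch, and stream processing

Python, Java, big-data, beam, batch, Go, SQL, streaming
Java 8.16 k
16 hours ago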
delta-io / delta

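Delta Lake is an open-source storage framework for building a Lakehouse architecture with compute engines such as Spark, PrestoDB, Flink, Trino, and Hive, and with APIs for Scala, Java, Rust, Ruby, and Python.

Apache Spark, acid, big-data, analytics, delta-lake
Scala 8.08 k
21 hours ago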