GitHub Chinese Community
#delta-lake

apache / doris

Doris is an MPP database, open-sourced by Baidu, that supports fast analysis of massive data sets.

olap, database, hadoop, hive, hudi, iceberg, real-time, SQL, BigQuery, dbt, delta-lake, elt, etl, lakehouse, query-engine, redshift, snowflake, Apache Spark
Java 13.81 k
1 day ago
trinodb / trino

Trino is a distributed SQL query engine for big data (formerly PrestoSQL).

Java, presto, hive, hadoop, big-data, SQL, prestodb, database, distributed-systems, distributed-database, data-science, datalake, jdbc, query-engine, trino, analytics, delta-lake, iceberg
Java 11.44 k
2 hours ago
StarRocks / starrocks

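StarRocks is a next-generation, high-speed MPP (Massively Parallel Processing) database for all analytics scenarios. StarRocks aims to make data analysis simpler and more agile. Without complex preprocessing, users can run fast analytics with StarRocks across many data analysis scenarios.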

database, olap, SQL, analytics, big-data, realtime-database, vectorized, distributed-database, real-time-analytics, mpp, join, star-schema, real-time-updates, delta-lake, hudi, iceberg, lakehouse, datalake, lakehouse-platform, cloudnative
Java 10.13 k
1 day ago
delta-io / delta

Delta Lake is an open-source storage framework for building a Lakehouse architecture with compute engines such as Spark, PrestoDB, Flink, Trino, and Hive, and with APIs for Scala, Java, Rust, Ruby, and Python.

Apache Spark, acid, big-data, analytics, delta-lake
Scala 8.08 k
1 day ago
roapi / roapi

#Data warehouse# Create full-fledged APIs for slowly moving datasets without writing a single line of code.

SQL, GraphQL, arrow, REST API, analytics, query, columnar, Rust, in-memory-database, datafusion, blob-storage, cloud-native, parquet, dataset, s3, delta-lake
Rust 3.31 k
1 month ago
delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python

delta, Rust, delta-lake, databricks, Python, pandas, pandas-dataframe
Rust 2.83 k
3 days ago
Mooncake-Labs / pg_mooncake

Real-time analytics on Postgres tables

analytics, columnstore, delta-lake, iceberg, lakehouse, parquet, PostgreSQL
Rust 1.48 k
4 days ago
databricks / LearningSparkV2

This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]

Apache Spark, spark-sql, spark-mllib, mlflow, delta-lake
Scala 1.3 k
5 months ago
apache / incubator-xtable

Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

apache-hudi, apache-iceberg, delta-lake
Java 1.07 k
9 days ago
delta-io / delta-sharing

An open protocol for secure data sharing

big-data, Apache Spark, pandas, delta-lake
Scala 841
10 days ago
Nike-Inc / koheesio

Python framework for building efficient data pipelines. It promotes modularity and collaboration, enabling the creation of complex pipelines from simple, reusable components.

data-engineering, delta-lake, pydantic, pyspark, Python
Python 637
1 month ago
splitgraph / seafowl

Analytical database for data-driven Web applications 🪶

database, HTTP, SQL, API, Edge, Serverless, visualization, Rust, datafusion, delta-lake
Rust 484
4 months ago
tansu-io / tansu

Apache Kafka® compatible broker with S3, PostgreSQL, Apache Iceberg and Delta Lake

built-with-rust, PostgreSQL, s3, apache-iceberg, apache-kafka, apache-arrow, datafusion, parquet, delta-lake, datalake
Rust 387
5 days ago
aws-samples / amazon-sagemaker-local-mode

#Computer science# Amazon SageMaker Local Mode Examples

sagemaker, amazon-sagemaker, PyTorch, catboost, lightgbm, PyCharm, tensorflow-training, prophet, scikit-learn, huggingface, huggingface-transformers, machine-learning, delta-lake, gensim-word2vec, dask, Tensorflow
Python 257
2 months ago
adidas / lakehouse-engine

The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Prod...

big-data, configuration-driven, data-engineering, data-quality, databricks, delta-lake, framework, great-expectations, lakehouse, Apache Spark
Python 252
4 months ago
josephmachado / data_engineering_best_practices

Sample project to demonstrate data engineering best practices

data-engineering, delta-lake, etl, great-expectations, minio, pyspark, Apache Spark
Python 193
1 year ago
japila-books / delta-lake-internals

The Internals of Delta Lake

deltalake, book, internals, delta-lake, books, datalake
184
5 months ago
izhangzhihao / Real-time-Data-Warehouse

Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi

flink, data-warehouse, data-warehousing, flink-sql, debezium, kafka, elasticsearch, delta-lake, cdc, change-data-capture, hudi, iceberg, SQL, datalake, delta, deltalake, Apache Spark, spark-sql
Dockerfile 113
2 years ago
anneglienke / 101_upsert-delta

This repository exemplifies a simple ELT process using delta to perform upsert and remove data files that aren't in the latest state of the transaction log for the table.

delta, delta-lake, deltalake
Python 95
3 years ago
delta-incubator / delta-sharing-rs

A Minimalistic Rust Implementation of Delta Sharing Server.

axum, data-engineering, delta-lake, Rust
Rust 92
3 months ago