#Database# ClickHouse is a high-performance column-oriented database suited to real-time OLAP analytics, with support for SQL syntax
Presto is a high-performance distributed SQL query engine for big data
Doris is an MPP database open-sourced by Baidu that supports fast analysis of massive datasets.
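StarRocks is a next-generation, high-speed MPP (Massively Parallel Processing) database for all analytics scenarios. StarRocks' vision is to make data analysis simpler and more agile for users. Without complex preprocessing, users can rely on StarRocks for fast analytics across many data analysis scenarios.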
AI-Native Data Warehouse. Open-source Snowflake alternative. Proven at petabyte scale with enterprise performance. Built for multimodal analytics. https://databend.com
LakeSoul is an end-to-end, real-time, cloud-native Lakehouse framework with fast data ingestion, concurrent updates, and incremental data analytics on cloud storage for both BI and AI applications.
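ByConity is a cloud-native data warehouse open-sourced by ByteDance, providing read/write separation, elastic scaling, tenant resource isolation, and strong consistency for data reads and writes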
YTsaurus is a scalable and fault-tolerant open-source big data platform.
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
Real-time analytics on Postgres tables
Apache Fluss is a streaming storage built for real-time analytics.
Fastest open-source tool for replicating Databases to Data Lake in Open Table Formats like Apache Iceberg. ⚡ Efficient, quick and scalable data ingestion for real-time analytics. Supporting Postgres,...
ClickBench: a Benchmark For Analytical Databases
DuckDB-powered data lake analytics from Postgres
#Awesome#A curated list of open source tools used in analytics platforms and data engineering ecosystem
The Control Plane for Apache Iceberg.
GigAPI is a Timeseries lakehouse for real-time data and sub-second queries, powered by DuckDB OLAP + Parquet Query Engine, Compactor w/ Cloud-Native Storage. Drop-in FDAP alternative ⭐
Use SQL to build ELT pipelines on a data lakehouse.
Examples of using Terraform to deploy Databricks resources