#Computer Science#Apache Airflow is a platform for scheduling, orchestrating, and monitoring workflows
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
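Airbyte is an open-source EL(T) platform that helps users sync data from applications, APIs, and databases to data warehouses
Doris is an MPP database, open-sourced by Baidu, that supports fast analysis of massive datasets.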
An orchestration platform for the development, production, and observation of data assets.
Fancy stream processing made operationally mundane
#Computer Science#🧙 Build, run, and manage data pipelines for integrating and transforming data.
Zero-ETL, infinite possibilities. Live query APIs, code & more with SQL. No DB required.
Flink CDC Connector is a set of source connectors for Apache Flink
Privacy- and security-focused Segment alternative, in Golang and React
#Editor#Build data pipelines, the easy way 🛠️
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Par...
Open Source Data Security Platform for Developers to Monitor and Detect PII, Anonymize Production Data and Sync it across environments.
Spreadsheet with AI, Code, Connections
#Computer Science#Maestro: Netflix’s Workflow Orchestrator
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Go...
#Awesome#A curated list with resources about node-based UIs
#Natural Language Processing#🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and co...