#Computer Science#Apache Airflow is a platform for scheduling, orchestrating, and monitoring workflows
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Airbyte is an open-source EL(T) platform that helps users sync data from applications, APIs, and databases to data warehouses
An orchestration platform for the development, production, and observation of data assets.
#Computer Science#🧙 Build, run, and manage data pipelines for integrating and transforming data.
Fancy stream processing made operationally mundane
Zero-ETL, infinite possibilities. Live query APIs, code & more with SQL. No DB required.
Flink CDC Connector is a set of data source connectors for Apache Flink
A high-performance ELT framework, powered by Apache Arrow
Privacy and Security focused Segment-alternative, in Golang and React
#Editor#Build data pipelines, the easy way 🛠️
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Par...
Open Source Data Security Platform for Developers to Monitor and Detect PII, Anonymize Production Data and Sync it across environments.
Spreadsheet with AI, Code, Connections
#Computer Science#Maestro: Netflix’s Workflow Orchestrator
#Awesome#A curated list with resources about node-based UIs
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Go...
#Natural Language Processing#🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and co...
A fast, simple, and cost-effective tool to replicate data from Postgres to data warehouses, queues, and storage