GitHub Chinese Community
#etl

apache / airflow

#Computer Science# Apache Airflow is a platform for scheduling, orchestrating, and monitoring workflows.

airflow, apache, apache-airflow, Python, scheduler, workflow, automation, dag, data-engineering, data-integration, data-orchestrator, data-pipelines, data science, elt, etl, machine learning, mlops, orchestration, workflow-engine, workflow-orchestration
Python 40.56 k
4 hours ago
pathwaycom / pathway

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

batch-processing, kafka, pathway, Python, streaming, machine learning, real-time, data-analytics, data-pipelines, data-processing, dataflow, etl, etl-framework, iot-analytics, Rust, stream-processing, time-series-analysis
Python 26.78 k
2 days ago
airbytehq / airbyte

Airbyte is an open-source EL(T) platform that helps users sync data from applications, APIs, and databases into data warehouses.

data, pipeline, data analysis, data-engineering, Java, Python, etl, change-data-capture, data-collection, data-integration, elt, BigQuery, redshift, snowflake, data-pipeline, sql-server, MySQL, PostgreSQL, s3, self-hosted
Python 18.43 k
16 hours ago
apache / doris

Doris is an MPP database, open-sourced by Baidu, that supports fast analysis of massive data sets.

olap, database, hadoop, hive, hudi, iceberg, real-time, SQL, BigQuery, dbt, delta-lake, elt, etl, lakehouse, query-engine, redshift, snowflake, Apache Spark
Java 13.81 k
2 days ago
dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.

data-pipelines, dagster, workflow, data science, workflow-automation, Python, scheduler, data-orchestrator, etl, analytics, data-engineering, mlops, orchestration, data-integration, metadata
Python 13.38 k
2 days ago
redpanda-data / connect

Fancy stream processing made operationally mundane

message-queue, stream-processing, streaming-data, message-bus, logs, stream-processor, cqrs, event-sourcing, Go, kafka, amqp, rabbitmq, nats, etl, data-engineering, DataOps
Go 8.38 k
2 days ago
mage-ai / mage-ai

#Computer Science# 🧙 Build, run, and manage data pipelines for integrating and transforming data.

machine learning, artificial intelligence, data, data-engineering, data science, Python, elt, etl, pipelines, data-pipelines, orchestration, data-integration, SQL, Apache Spark, dbt, pipeline, reverse-etl, transformation
Python 8.37 k
2 days ago
turbot / steampipe

Zero-ETL, infinite possibilities. Live query APIs, code & more with SQL. No DB required.

steampipe, PostgreSQL, postgresql-fdw, cloud, security, Amazon Web Services, Azure, cis, cnapp, cspm, DevOps, devsecops, Google Cloud, Go, Kubernetes, Terraform, etl, SQLite, Hacktoberfest
Go 7.36 k
3 days ago
cloudquery / cloudquery

A high-performance ELT framework powered by Apache Arrow.

Amazon Web Services, Google Cloud, Azure, SQL, data-integration, elt, etl, etl-framework, BigQuery, data-collection, data-engineering, Kubernetes, data, airbyte, GitHub API, data analysis, Google, Go, cspm, attack-surface-management
Go 6.12 k
2 days ago
apache / flink-cdc

Flink CDC Connector is a set of data source connectors for Apache Flink.

change-data-capture, cdc, batch, data-integration, data-pipeline, distributed, elt, etl, flink, kafka, MySQL, paimon, PostgreSQL, real-time, schema-evolution
Java 6.1 k
2 days ago
rudderlabs / rudder-server

Privacy and Security focused Segment-alternative, in Golang and React

privacy, warehouse-management, data-warehouse, customer-data-platform, data-integration, data-synchronization, etl, BigQuery, redshift, snowflake, data-pipeline, elt, data-engineering, cdp, event-streaming
Go 4.21 k
3 days ago
orchest / orchest

#Editor# Build data pipelines, the easy way 🛠️

data science, machine learning, pipelines, ide, Jupyter Notebook, cloud, self-hosted, jupyterlab, notebooks, Docker, Python, data-pipelines, deployment, Kubernetes, airflow, dag, etl, etl-pipeline
TypeScript 4.13 k
2 years ago
aws / aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Par...

Python, Amazon Web Services, pandas, apache-arrow, apache-parquet, data-engineering, etl, data science, redshift, athena, lambda, aws-lambda, emr, MySQL, modin, ray
Python 4.03 k
2 days ago
nucleuscloud / neosync

Open Source Data Security Platform for Developers to Monitor and Detect PII, Anonymize Production Data and Sync it across environments.

Docker, Go, synthetic-data, benthos, etl, orchestration, Testing, TypeScript, Open Source, React, self-hosted, Kubernetes, Next, test-data-generator, fine-tuning, synthetic-data-generation, faker, MySQL, PostgreSQL
Go 3.87 k
4 days ago
quadratichq / quadratic

Spreadsheet with AI, Code, Connections

Python, data science, Spreadsheet, SQL, quadratic, etl, data-engineering, data analysis, data, WebAssembly, webgl, artificial intelligence
Rust 3.7 k
2 days ago
Netflix / maestro

#Computer Science# Maestro: Netflix’s Workflow Orchestrator

analytics, automation, batch-processing, dag, data-engineering, DataOps, data-orchestrator, data-pipelines, data science, elt, etl, Java, machine learning, mlops, orchestration, scheduler, workflow, workflow-engine, workflow-orchestration
Java 3.48 k
21 hours ago
blockchain-etl / ethereum-etl

Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Go...

Ethereum, CSV, export, erc20, erc20-tokens, etl, transaction, SQL, Amazon Web Services, Google Cloud, BigQuery, erc721
Python 3.03 k
2 months ago
xyflow / awesome-node-based-uis

#Awesome# A curated list with resources about node-based UIs

node-based-ui, Awesome Lists, etl, visual-programming
3.01 k
2 months ago
chonkie-ai / chonkie

#Natural Language Processing# 🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library

artificial intelligence, chunking, rag, text-processing, natural language processing, Python, semantic-segmentation, vector-search, etl, retrieval
Python 2.87 k
3 months ago
apache / incubator-devlake

Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and co...

data, data analysis, data-engineering, data-integration, data-transfers, DevOps, domain-layer, etl, Go, integration, jira, Open Source, user-friendly, dora, Hacktoberfest
Go 2.75 k
5 days ago