GitHub Chinese Community

data-ingestion

apache / seatunnel

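SeaTunnel (formerly waterdrop) is an easy-to-use, high-performance distributed data integration platform that supports real-time synchronization of massive data volumes and can stably synchronize tens of billions of records per day.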

data-integration, high-performance, offline, real-time, apache, batch, cdc, change-data-capture, data-ingestion, elt, streaming
Java 8.58 k
3 days ago

bruin-data / ingestr

ingestr is a CLI tool to copy data between any databases with a single command seamlessly.

BigQuery, copy-database, data-ingestion, data-integration, data-pipeline, duckdb, ingestion-pipeline, sql-server, PostgreSQL, snowflake
Python 2.97 k
3 days ago

apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.

big-data, data-ingestion, flink, paimon, real-time-analytics, Apache Spark, table-store, streaming-datalake
Java 2.84 k
2 days ago

dashbitco / broadway

Concurrent and multi-stage data ingestion and data processing with Elixir

Elixir, data-ingestion, data-processing, concurrent
Elixir 2.54 k
10 days ago

pravega / pravega

Pravega - Streaming as a new software defined storage primitive

streaming, streaming-data, distributed-storage, real-time-data, data-ingestion
Java 2 k
3 months ago

bruin-data / bruin

Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.

analytics, BigQuery, data-modeling, data-pipelines, Python, snowflake, SQL, Data Analysis, data-transformation, data-ingestion, data-platform
Go 947
5 days ago

CrunchyData / pg_parquet

Copy to/from Parquet in S3, Azure Blob Storage, Google Cloud Storage, http(s) stores, local files, or standard input/output streams from within PostgreSQL

columnar, data-ingestion, data-migration, parquet, PostgreSQL, azure-storage, google-cloud-storage, HTTP, s3
Rust 518
4 days ago

orbitalapi / orbital

Orbital automates integration between data sources (APIs, Databases, Queues and Functions). BFFs, API Composition and ETL pipelines that adapt as your specs change.

API, integration, Kotlin, Microservices, api-integration, api-management, REST API, TypeScript, data-engineering, data-ingestion, etl, Java
TypeScript 321
3 months ago

unbody-io / unbody

#Large Language Models# The Supabase of the AI era. A modular, open-source backend for building AI-native software, designed for knowledge, not static data.

agentic-ai, ai-native, Backend, Chatbot, data-ingestion, developer-tools, etl-pipeline, generative-ai, knowledge-base, Large Language Models, rag, vector-database
TypeScript 298
10 days ago

cuebook / cuelake

Use SQL to build ELT pipelines on a data lakehouse.

apache-iceberg, delta, lakehouse, datalake, data-lake, elt, etl, data-engineering, data-integration, data-ingestion, Apache Spark, spark-sql, data-transfer, pipelines, data-pipeline, zeppelin-notebook, SQL
JavaScript 287
3 years ago

merantix-momentum / squirrel-core

#Natural Language Processing# A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way 🌰

Python, Machine Learning, Data Science, Computer Vision, cv, Natural Language Processing, Artificial Intelligence, PyTorch, Tensorflow, Datasets, distributed, DataOps, Deep Learning, data-ingestion, cloud-computing, collaboration, internal
Python 281
2 months ago

apache / paimon-rust

Apache Paimon Rust: the Rust implementation of Apache Paimon.

big-data, data-ingestion, paimon, real-time-analytics, Rust, streaming-datalake, table-store
Rust 126
2 months ago

thedataengineeringbook / thedataengineeringbook

The Data Engineering Book - a data engineering book by Thai people, for Thai people

data-engineering, data, Hacktoberfest, book, data-engineer, data-pipeline, data-integration, data-ingestion, data-infrastructure
JavaScript 112
2 years ago

jgperrin / net.jgp.labs.spark

Apache Spark examples exclusively in Java

Apache Spark, ingestion, Java, data-ingestion, dataframe
Java 101
2 years ago

paloaltodatabases / sequor

Sequor is a SQL-centric platform for building API integrations without lock-in and black boxes. Fuses API execution with SQL logic to provide an open, flexible platform for all your data and app integ...

api-integration, data-integration, etl, ipaas, SQL, workflow-automation, data-engineering, data-ingestion, reverse-etl
Python 67
9 days ago

XavientInformationSystems / Data-Ingestion-Platform

data-ingestion, flink, storm, apex, Apache Spark, batch-processing
Java 49
5 years ago

merantix-momentum / squirrel-datasets-core

#Natural Language Processing# Squirrel dataset hub

Python, Data Science, Machine Learning, Natural Language Processing, Artificial Intelligence, Computer Vision, Deep Learning, Tensorflow, cv, collaboration, PyTorch, distributed, DataOps, cloud-computing, Datasets, data-ingestion
Python 42
2 years ago

aws-samples / amazon-kinesis-data-processor-aws-fargate

Sample code for the AWS Big Data Blog post "Building a scalable streaming data processor with Amazon Kinesis Data Streams on AWS Fargate"

data-ingestion, containers
Python 37
2 months ago

Dynatrace / openkit-java

OpenKit Java Reference Implementation

data-ingestion, SDK
Java 36
10 months ago

Dynatrace / OneAgent-SDK-for-Java

Enables custom tracing of Java applications in Dynatrace

SDK, sdk-java, Application Performance Management (APM), agent, data-ingestion
Java 36
1 month ago