GitHub Chinese Community

#llm-serving

vllm-project / vllm

#LLM# A high-throughput and memory-efficient inference and serving engine for LLMs

gpt, LLM, PyTorch, llmops, mlops, model-serving, transformer, llm-serving, inference, llama, amd, rocm, CUDA, inferentia, trainium, tpu, xpu, hpu, deepseek, qwen
Python 49.62 k
20 hours ago

ray-project / ray

#LLM# Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI libraries for accelerating ML workloads.

ray, distributed, parallel, machine learning, reinforcement-learning, deep learning, Python, rllib, hyperparameter-search, optimization, data science, hyperparameter-optimization, serving, deployment, PyTorch, Tensorflow, llm-serving, large-language-models, LLM, llm-inference
Python 37.52 k
14 hours ago

liguodongiot / llm-action

#LLM# This project shares the technical principles behind large models and hands-on experience with them (large-model engineering and deploying large-model applications).

LLM, llm-inference, llm-serving, llm-training, llmops
HTML 18.54 k
17 hours ago

sgl-project / sglang

#LLM# SGLang is a fast serving framework for large language models and vision language models.

CUDA, inference, llama, llava, LLM, llm-serving, moe, PyTorch, transformer, vlm, llama3, llama3-1, deepseek, deepseek-llm, deepseek-v3, deepseek-r1, deepseek-r1-zero, qwen3, llama4
Python 15.14 k
7 hours ago

bentoml / OpenLLM

#LLM# Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.

LLM, llmops, model-inference, fine-tuning, llm-serving, llama, vicuna, bentoml, llama2, llm-inference, llm-ops, mistral, mlops, llama3-1
Python 11.35 k
6 days ago

skypilot-org / skypilot

#Computer science# SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 16+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.

cloud-computing, data science, deep learning, gpu, hyperparameter-tuning, machine learning, tpu, job-queue, job-scheduler, cloud-management, distributed-training, ml-infrastructure, multicloud, spot-instances, ml-platform, cost-management, cost-optimization, finops, llm-serving, llm-training
Python 8.23 k
4 days ago

bentoml / BentoML

#LLM# The easiest way to serve AI apps and models: build model inference APIs, job queues, LLM apps, multi-model pipelines, and more.

model-serving, mlops, llmops, generative-ai, llm-inference, deep learning, llm-serving, machine learning, Python, multimodal, ml-engineering, LLM
Python 7.78 k
3 days ago

superduper-io / superduper

#Vector search engine# Superduper: End-to-end framework for building custom AI applications and agents.

artificial intelligence, mlops, torch, transformers, MongoDB, Python, PyTorch, machine learning, database, data, inference, llm-inference, pretrained-models, chatbot, semantic-search, llm-serving, llmops, vector-search, rag
Python 5.08 k
3 days ago

PaddlePaddle / FastDeploy

#LLM# Large Language Model Deployment Toolkit

serving, ernie, LLM, qwen, wenxinyiyan, inference, llm-serving
Cuda 3.21 k
5 days ago

predibase / lorax

#LLM# Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

fine-tuning, gpt, llama, LLM, llm-inference, llm-serving, llmops, lora, model-serving, PyTorch, transformers
Python 3.01 k
25 days ago

gpustack / gpustack

#LLM# Simple, scalable AI model deployment on GPU clusters

ascend, CUDA, deepseek, distributed-inference, genai, inference, llama, llamacpp, LLM, maas, metal, openai, qwen, rocm, vllm, mindie, llm-inference, llm-serving, local-ai, heterogeneous-cluster
Python 2.92 k
3 days ago

microsoft / aici

#LLM# AICI: Prompts as (Wasm) Programs

artificial intelligence, Rust, WebAssembly, wasmtime, inference, language-model, LLM, llm-framework, llm-inference, llm-serving, llmops, model-serving, transformer
Rust 2.03 k
5 months ago

MoonshotAI / MoBA

#LLM# MoBA: Mixture of Block Attention for Long-Context LLMs

flash-attention, LLM, llm-serving, llm-training, moe, PyTorch, transformer
Python 1.8 k
2 months ago

ray-project / ray-llm

#LLM# RayLLM - LLMs on Ray (Archived). Read README for more info.

ray, LLM, llm-serving
Python 1.26 k
3 months ago

thu-pacman / chitu

#LLM# High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

deepseek, gpu, LLM, PyTorch, llm-serving, model-serving
Python 1.14 k
6 days ago

zhihu / ZhiLight

#LLM# A highly optimized LLM inference acceleration engine for Llama and its variants.

inference-engine, LLM, CUDA, gpt, llama, llm-serving, PyTorch, llm-inference, model-serving, deepseek-r1
C++ 891
1 month ago

mosecorg / mosec

#LLM# A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine

model-serving, deep learning, machine learning, nerual-network, mlops, Hacktoberfest, gpu, Python, PyTorch, Tensorflow, LLM, jax, llm-serving, Rust, cv, mxnet, tts
Python 842
5 days ago

efeslab / Nanoflow

#LLM# A throughput-oriented high-performance serving framework for LLMs

CUDA, inference, llama2, LLM, llm-serving, model-serving
Jupyter Notebook 820
11 days ago

alibaba / rtp-llm

#LLM# RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

gpt, inference, llama, LLM, llm-serving, llmops, model-serving
C++ 788
12 days ago

vllm-project / vllm-ascend

#LLM# Community-maintained hardware plugin for vLLM on Ascend

ascend, inference, LLM, llm-serving, llmops, mlops, model-serving, transformer, vllm
Python 758
20 hours ago