#sglang

kvcache-ai / Mooncake

#LLM# Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

inference kvcache llm rdma sglang vllm disaggregation

C++ 4.09 k

15 hours ago

OpenMOSS / MOSS-TTSD

MOSS-TTSD is a spoken dialogue generation model that enables expressive dialogue speech synthesis in both Chinese and English, supporting zero-shot multi-speaker voice cloning, and long-form speech ge...

large-language-models finetune sglang streaming

Python 978

18 days ago

ModelCloud / GPTQModel

LLM model quantization (compression) toolkit with hw acceleration support for Nvidia CUDA, AMD ROCm, Intel XPU and Intel/AMD/Apple CPU via HF, vLLM, and SGLang.

gptq peft quantization sglang transformers vllm

Python 828

2 days ago

HuiResearch / FlashTTS

Provides high-quality Chinese speech synthesis and voice cloning services based on models such as SparkTTS and OrpheusTTS.

Python 538

5 months ago

sgl-project / SpecForge

#LLM# Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

PyTorch sglang training llm

Python 420

6 days ago

sgl-project / ome

#LLM# OME is a Kubernetes operator for enterprise-grade management and serving of Large Language Models (LLMs).

Kubernetes llm-inference model-serving oracle-cloud sglang llm deepseek llama

Go 289

5 days ago

InftyAI / llmaz

#LLM# ☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!

Kubernetes llm llamacpp sglang vllm huggingface modelscope ollama inference inference-platform

Go 260

3 days ago

shell-nlp / gpt_server

#LLM# gpt_server is an open-source framework for production-grade deployment of LLMs, Embedding, Reranker, ASR, TTS, text-to-image, image editing, and text-to-video models.

embedding gpt llama llm openai prompt-injection rerank vllm tts fastchat function-calling asr sglang

Python 213

19 hours ago

ovg-project / kvcached

#LLM# Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond

kvcache llm sglang vllm inference-engine llm-framework llm-inference llm-serving Serverless ollama

Python 103

1 day ago

sgl-project / rbg

#LLM# A workload for deploying LLM inference services on Kubernetes

Kubernetes llm sglang

Go 77

5 days ago

#LLM# Arks is a cloud-native inference framework running on Kubernetes

dynamo Kubernetes sglang vllm inference reasoning ai llm

Go 43

16 days ago

modal-labs / stopwatch

#Computer science# A tool for benchmarking LLMs on Modal

llm machine-learning sglang tensorrt-llm vllm

Python 43

2 months ago

sgl-project / sgl-cookbook

#LLM# Make SGLang go brrr

deepseek deepseek-r1 deepseek-v3 gpt-oss llama3 llama3-1 llama4 qwen2 qwen2-5 qwen3 ome sglang Kubernetes llm

35

15 days ago

blackbird-io / blackbird

A high-performance RDMA distributed storage system for fast LLM Inference and GPU Training

big-data C++ distributed-cache gpu infiniband kv-cache Python rdma vllm CUDA llm-serving sglang llm-framework

C++ 34

8 days ago

dzhsurf / deepseek-v3-r1-deploy-and-benchmarks

DeepSeek-V3, R1 671B on 8xH100 Throughput Benchmarks

deepseek-r1 deepseek-v3 sglang vllm

Python 16

7 months ago

zejia-lin / BulletServe

#LLM# Boosting GPU utilization for LLM serving via dynamic spatial-temporal prefill & decode orchestration

inference sglang llm llm-serving

Python 13

22 days ago

sgl-project / whl

Kernel Library Wheel for SGLang

CUDA cutlass sglang

HTML 12

4 days ago

AidanCooper / constrained-decoding

#NLP# A guide to structured generation using constrained decoding

generative-model large-language-models nlp sglang structured-generation

Jupyter Notebook 11

1 year ago

lucasavila00 / LmScript

#LLM# Controllable Language Model Interactions in TypeScript

ai guidance llm TypeScript sglang

TypeScript 9

1 year ago

didier-durand / llms-in-clouds

#LLM# Experiments with LLMs in clouds (powered by SGLang)

Amazon Web Services Docker huggingface llm qwen sglang llama mistral

Python 6

1 month ago
