
kvcache

kvcache-ai / Mooncake

#LLM# Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

inference kvcache LLM rdma sglang vllm disaggregation

C++ 4.09 k

15 hours ago

Zefan-Cai / R-KV

#LLM# [NeurIPS 2025] R-KV: Redundancy-aware KV Cache Compression for Reasoning Models

kvcache LLM

Python 1.13 k

2 months ago

uccl-project / uccl

#LLM# Ultra and Unified CCL

AI amd broadcom CUDA gpu hpc LLM Network Nvidia rdma kvcache P2P

C++ 584

1 day ago

ovg-project / kvcached

#LLM# Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond

kvcache LLM sglang vllm inference-engine llm-framework llm-inference llm-serving Serverless ollama

Python 103

1 day ago

ModelEngine-Group / unified-cache-management

#LLM# Persist and reuse KV Cache to speed up your LLM.

ascend CUDA gpu kvcache LLM npu nfs ssd torch vllm deepseek

Python 58

3 days ago

NoakLiu / PiKV

PiKV: KV Cache Management System for Mixture of Experts [Efficient ML System]

kvcache moe parallel-computing kv-cache management-system mixture-of-experts

Python 40

19 days ago

Linking-ai / SCOPE

(ACL 2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation

kvcache long-context

Jupyter Notebook 33

5 months ago

IBM / spnl

Span Queries: What if we had a way to plan and optimize GenAI like we do for SQL?

generative-ai kvcache locality optimization SQL

Rust 11

2 days ago

RohitMurali18 / Music-Generation-Emotion-Adaptive

This project implements an Emotion-Aware Music Generator (EAMG) that turns natural-language prompts into emotion-aligned music in real time. It uses a LoRA-tuned DistilBERT to classify emotions, maps ...

FastAPI kvcache LLM lora transformers

Jupyter Notebook 0

3 months ago
