#Large Language Model# Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
#Large Language Model# kvcached: Elastic KV cache for dynamic GPU sharing and efficient multi-LLM inference.
PiKV: KV Cache Management System for Mixture of Experts [Efficient ML System]
SCOPE: Optimizing KV Cache Compression in Long-context Generation (ACL 2025 oral)
Span Queries: What if we had a way to plan and optimize GenAI like we do for SQL?
This project implements an Emotion-Aware Music Generator (EAMG) that turns natural-language prompts into emotion-aligned music in real time. It uses a LoRA-tuned DistilBERT to classify emotions, maps ...