#Large Language Models# Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any...
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
#Large Language Models# Structured data extraction and instruction calling with ML, LLM and Vision LLM
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
#Large Language Models# Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
#Large Language Models# Simple, scalable AI model deployment on GPU clusters
#Large Language Models# High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
#Large Language Models# RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of con...
#Large Language Models# UltraRAG 2.0: Less Code, Lower Barrier, Faster Deployment! An MCP-based low-code RAG framework that lets researchers build complex pipelines for creative innovation.
Model swapping for llama.cpp (or any local OpenAI API-compatible server)
#Large Language Models# Community-maintained hardware plugin for vLLM on Ascend
#Large Language Models# 🔒 Enterprise-grade API gateway that helps you monitor and impose cost or rate limits per API key. Get fine-grained access control and monitoring per user, application, or environment. Supports OpenAI...
#Large Language Models# AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.
Intelligent Mixture-of-Models Router for Efficient LLM Inference
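#Large Language Models# A lightweight large-model application project (Large Model Data Assistant) that covers the full pipeline and is easy to extend. Supports large models such as DeepSeek and Qwen3. A one-stop large-model application development project built on Dify, LangChain/LangGraph, Ollama & vLLM, Sanic, and Text2SQL 📊, with a modern UI built using Vue3, TypeScript, and Vite 5. It supports, via ECh...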
#Large Language Models# Evaluate your LLM's response with Prometheus and GPT4 💯
#Large Language Models# LLM notes, including model inference, transformer model structure, and LLM framework code analysis notes.