llm-inference

nomic-ai / gpt4all

GPT4All: Run large language models (LLMs) on any device (laptops and desktops)

llm-inference ai-chat

C++ 76.68 k

4 months ago

ray-project / ray

#Large Language Models# Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 38.94 k

3 hours ago

gitleaks / gitleaks

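#Large Language Models# Gitleaks is an open-source SAST (static application security testing) command-line tool that scans Git repositories to prevent secrets such as passwords, API keys, and access tokens from being hardcoded into code

security Git Go secret gitleaks devsecops Hacktoberfest CI/CD cli data-loss-prevention dlp Open Source ai-powered llm llm-inference llm-training

Go 23.24 k

1 month ago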

liguodongiot / llm-action

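#Large Language Models# This project shares the technical principles and hands-on experience behind large models (large-model engineering and real-world application deployment)

llm llm-inference llm-serving llm-training llmops

HTML 20.78 k

1 month ago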

Lightning-AI / litgpt

#Large Language Models# 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

ai deep-learning large-language-models llm llm-inference

Python 12.75 k

3 days ago

bentoml / OpenLLM

#Large Language Models# Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.

llm llmops model-inference fine-tuning llm-serving llama vicuna bentoml llama2 llm-inference llm-ops mistral mlops llama3-1

Python 11.78 k

10 hours ago

mistralai / mistral-inference

#Large Language Models# Official inference library for Mistral models

llm llm-inference mistralai

Jupyter Notebook 10.47 k

6 months ago

openvinotoolkit / openvino

#Natural Language Processing# OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

inference deep-learning openvino ai computer-vision diffusion-models generative-ai llm-inference nlp performance-boost speech-recognition stable-diffusion deploy-ai optimize-ai transformers yolo recommendation-system good-first-issue

C++ 8.82 k

11 hours ago

SJTU-IPADS / PowerInfer

#Large Language Models# PowerInfer is a fast large-model serving engine that runs on consumer-grade GPUs and personal computers

large-language-models llama llm llm-inference local-inference

C++ 8.33 k

1 month ago

bentoml / BentoML

#Large Language Models# The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

model-serving mlops llmops generative-ai llm-inference model-inference-service inference-platform deep-learning llm-serving machine-learning Python multimodal ml-engineering llm ai-inference

Python 8.07 k

18 hours ago

duixcom / Duix-Mobile

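🚀 The best-performing mobile real-time conversational digital human available. Supports local deployment and multimodal interaction (voice, text, facial expressions), with response latency under 1.5 seconds. Suited to scenarios with very high privacy and real-time requirements, such as live streaming, education, customer service, finance, and government services. Works out of the box and is developer-friendly.

ai-companion ai-girlfriend avatar chat-ui digital-human edge-ai llm-inference mobile-ai tts ai-boyfriend realtime-avatar

C++ 7.46 k

18 days ago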

InternLM / lmdeploy

#Large Language Models# LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

cuda-kernels deepspeed fastertransformer llm-inference turbomind internlm llama llm codellama llama2 llama3

Python 7.05 k

13 hours ago

superduper-io / superduper

#Vector Search Engine# Superduper: End-to-end framework for building custom AI applications and agents.

ai mlops torch transformers MongoDB Python PyTorch machine-learning database data inference llm-inference pretrained-models chatbot semantic-search llm-serving llmops vector-search rag

Python 5.21 k

14 days ago

FellouAI / eko

Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai

TypeScript 4.57 k

13 hours ago

kserve / kserve

#Computer Science# Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes

knative machine-learning model-interpretability model-serving istio kubeflow ai Tensorflow PyTorch scikit-learn xgboost Kubernetes service-mesh kserve Hacktoberfest mlops genai llm-inference

Python 4.54 k

14 hours ago

xlite-dev / Awesome-LLM-Inference

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

flash-attention tensorrt-llm vllm llm-inference deepseek deepseek-v3 deepseek-r1 qwen3

Python 4.52 k

1 month ago

codelion / openevolve

Open-source implementation of AlphaEvolve

alphacode coding-agent deepmind deepmind-lab discovery distributed-evolutionary-algorithms evolutionary-algorithms evolutionary-computation genetic-algorithm genetic-algorithms iterative-methods iterative-refinement llm-engineering llm-ensemble llm-inference optimize alpha-evolve alphaevolve openevolve

Python 3.9 k

2 days ago

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

gpu large-large-models CUDA PyTorch llm-inference jit attention Nvidia distributed-inference moe

Cuda 3.74 k

3 hours ago

gpustack / gpustack

#Large Language Models# Simple, scalable AI model deployment on GPU clusters

ascend CUDA deepseek distributed-inference genai inference llama llamacpp llm maas metal openai qwen rocm vllm mindie llm-inference llm-serving local-ai heterogeneous-cluster

Python 3.71 k

5 days ago

katanemo / archgw

The smart edge and AI gateway for agents. Arch is a high-performance proxy server that handles the low-level work in building agents: like applying guardrails, routing prompts to the right agent, and ...

gateway generative-ai llm-inference llm prompt proxy proxy-server llmops openai routing ai-gateway llm-gateway llm-routing envoy envoyproxy ai-gateway-support llm-proxy

Rust 3.68 k

4 hours ago
