GitHub Chinese Community
#tensorrt-llm

xlite-dev / Awesome-LLM-Inference

📚A curated list of Awesome LLM Inference Papers with Codes.

flash-attention, tensorrt-llm, vllm, llm-inference, deepseek, deepseek-v3, deepseek-r1, qwen3
Python 4.12k
7 days ago

collabora / WhisperLive

A nearly-live implementation of OpenAI's Whisper.

dictation, obs, openai, text-to-speech, translation, voice-recognition, Whisper, tensorrt, tensorrt-llm, whisper-tensorrt, openvino
Python 2.96k
14 days ago

shashikg / WhisperS2T

#Computer Science# An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines

asr, deep learning, speech-recognition, speech-to-text, Whisper, tensorrt-llm, tensorrt, vad, voice-activity-detection
Jupyter Notebook 427
10 months ago

huggingface / optimum-benchmark

🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of Optimum's hardware optimizations & quantization schemes.

benchmark, onnxruntime, openvino, PyTorch, tensorrt-llm
Python 302
18 days ago

coderonion / awesome-cuda-and-hpc

#Large Language Models# 🚀🚀🚀 This repository lists some awesome public CUDA, cuda-python, cuBLAS, cuDNN, CUTLASS, TensorRT, TensorRT-LLM, Triton, TVM, MLIR, PTX and High Performance Computing (HPC) projects.

CUDA, cublas, tensorrt, Awesome Lists, large language models, gpu, blas, PyTorch, hpc, gemm, llama, cudnn, triton, tensorrt-llm, cutlass, mlir, tvm, deepseek, ptx, vlm
278
16 days ago

npuichigo / openai_trtllm

#Large Language Models# OpenAI-compatible API for the TensorRT-LLM Triton backend

langchain, large language models, openai-api, tensorrt-llm, triton-inference-server
Rust 209
10 months ago

NetEase-Media / grps

Deep Learning Deployment Framework: Supports tf/torch/trt/trtllm/vllm and other NN frameworks. Supports dynamic batching and streaming modes. It is dual-language compatible with Python and C++, offeri...

Tensorflow, tensorrt, torch, vllm, serving, triton-inference-server, tensorrt-llm
C++ 165
1 month ago

NetEase-Media / grps_trtllm

#Large Language Models# Higher performance OpenAI LLM service than vLLM serve: A pure C++ high-performance OpenAI LLM service implemented with GRPS+TensorRT-LLM+Tokenizers.cpp, supporting chat and function call, AI agents, d...

large language models, openai, tensorrt-llm, chatglm, llama3, qwen2, function-call, ai-agent, llama-index, multi-modal, deepseek-r1, phi, qwq, qwen2-vl, minicpm-v, internvl, qwen3
Python 140
1 month ago

openhackathons-org / End-to-End-LLM

#Natural Language Processing# This repository contains AI Bootcamp material consisting of a workflow for LLMs

deep learning, natural language processing, p-tuning, prompt-tuning, large language models, question-answering, tensorrt-llm, genai
Jupyter Notebook 90
2 months ago

vossr / Chat-With-RTX-python-api

#Large Language Models# Chat With RTX Python API

large language models, llm-inference, mistral-7b, tensorrt, tensorrt-llm
Python 65
1 month ago

guidance-ai / llgtrt

TensorRT-LLM server with Structured Outputs (JSON) built with Rust

guidance, openai-api, tensorrt-llm, cfg, JSON, Regular expression, structured-generation
Rust 55
2 months ago

argonne-lcf / LLM-Inference-Bench

#Large Language Models# LLM-Inference-Bench

benchmark, deepspeed, inference, llamacpp, large language models, tensorrt-llm, vllm
Jupyter Notebook 44
5 days ago

fgblanch / OutlookLLM

Add-in for the new Outlook that adds new LLM features (Composition, Summarizing, Q&A). It uses a local LLM via Nvidia TensorRT-LLM

tensorrt-llm
Python 39
10 days ago

lix19937 / llm-deploy

#Large Language Models# AI Infra LLM infer / tensorrt-llm / vllm

large language models, llm-inference, tensorrt-llm
Python 20
6 months ago

modal-labs / stopwatch

#Computer Science# A tool for benchmarking LLMs on Modal

large language models, machine learning, tensorrt-llm, vllm
Python 20
6 days ago

zRzRzRzRzRzRzR / lm-fly

#Large Language Models# Accelerating large-model inference frameworks to make LLMs fly

large language models, llm-inference, MLX, openvino, tensorrt-llm, vllm
Python 18
1 year ago

CactusQ / TensorRT-LLM-Tutorial

Getting started with TensorRT-LLM using BLOOM as a case study

deep learning, Jupyter Notebook, llm-inference, large language models, tensorrt, tensorrt-llm
Jupyter Notebook 18
1 year ago

EdVince / whisper-trtllm

Whisper in TensorRT-LLM

asr, CUDA, huggingface, openai, tensorrt, tensorrt-llm, transformers, Whisper
C++ 15
2 years ago

Delxrius / MiniMax-01

#Large Language Models# MiniMax-01 is a simple implementation of the MiniMax algorithm, a widely used strategy for decision-making in two-player turn-based games like Tic-Tac-Toe. The algorithm aims to minimize the maximum p...

chat-api, chatbot, deepseek, deepseek-v3, flash-attention, large language models, llm-inference, minimax, tensorrt-llm
4
3 days ago

wcks13589 / LLM-Tutorial

#Large Language Models# LLM tutorial materials include, but are not limited to, NVIDIA NeMo, TensorRT-LLM, Triton Inference Server, and NeMo Guardrails.

nemo, tensorrt-llm, large language models
Jupyter Notebook 2
14 days ago