GitHub 中文社区

©2025 GitHub中文社区

fastertransformer
InternLM / lmdeploy

#LLM# LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

cuda-kernels, deepspeed, fastertransformer, llm-inference, turbomind, internlm, llama, LLM, codellama, llama2, llama3
Python · 6.52k stars · updated 2 days ago
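LMDeploy can expose a served model behind an OpenAI-compatible HTTP endpoint (`lmdeploy serve api_server`). As a minimal sketch, assuming such a server is already running, the request body can be assembled with the standard library alone; the model name and port below are illustrative placeholders, not values taken from the repo:

```python
import json

# Hypothetical request body for LMDeploy's OpenAI-compatible api_server.
# The model name and max_tokens value are assumptions for illustration.
def build_chat_request(prompt, model="internlm2", max_tokens=128):
    """Build the JSON body for POST /v1/chat/completions."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    })

body = build_chat_request("Summarize FasterTransformer in one sentence.")
# POST `body` to the running server, e.g. http://localhost:23333/v1/chat/completions
```

Because the endpoint follows the OpenAI chat-completions shape, existing OpenAI client libraries can usually be pointed at it by overriding the base URL.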
Curt-Park / serving-codegen-gptj-triton

Serving Example of CodeGen-350M-Mono-GPTJ on Triton Inference Server with Docker and Kubernetes

codegen, Docker, fastertransformer, Kubernetes, triton-inference-server, PyTorch, huggingface-transformers
Python · 20 stars · updated 2 years ago
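Several of the repos on this page serve models through Triton Inference Server, whose HTTP/REST API follows the KServe v2 inference protocol. A minimal sketch of building such a request body with the standard library — the model name "codegen", tensor name "INPUT_0", and shape are assumptions for illustration, not taken from the repo:

```python
import json

# Sketch of a Triton Inference Server HTTP/REST (KServe v2) request.
# Model name, input tensor name, and shape are illustrative assumptions.
def build_infer_request(model_name, input_name, text):
    path = f"/v2/models/{model_name}/infer"
    body = {
        "inputs": [{
            "name": input_name,
            "shape": [1, 1],
            "datatype": "BYTES",  # Triton transports strings as BYTES tensors
            "data": [text],
        }]
    }
    return path, json.dumps(body)

path, body = build_infer_request("codegen", "INPUT_0", "def hello():")
# POST `body` to http://<host>:8000{path} with Content-Type: application/json
```

The actual tensor names and datatypes must match the model's Triton configuration (`config.pbtxt`); the response arrives in the same protocol's `outputs` shape.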
detail-novelist / novelist-triton-server

Deploy KoGPT with Triton Inference Server

fastertransformer, huggingface, kogpt, large-language-models, transformers, triton, triton-inference-server
Shell · 14 stars · updated 3 years ago
clam004 / triton-ft-api

Tutorial on deploying a scalable autoregressive causal language model (transformer) with NVIDIA Triton Inference Server

FastAPI, fastertransformer, gpt, huggingface, Nvidia, nvidia-docker
Python · 5 stars · updated 3 years ago
RajeshThallam / fastertransformer-converter

#LLM# A code sample for serving Large Language Models (LLMs) on a Google Kubernetes Engine (GKE) cluster with GPUs, running NVIDIA Triton Inference Server with the FasterTransformer backend.

fastertransformer, gke, Google Cloud, inference, LLM, triton-inference-server
Python · 0 stars · updated 2 years ago