GitHub Chinese Community

speculative-decoding

intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms ⚡

large language models, chatbot, 4-bits, llm-inference, llm-cpu, chatpdf, streamingllm, intel-optimized-llamacpp, speculative-decoding, habana, rag, retrieval
Python 2.17k
8 months ago

aphrodite-engine / aphrodite-engine

#Computer Science# Large-scale LLM inference engine

API, inference-engine, machine learning, CUDA, inferentia, rocm, intel, lora, speculative-decoding, tpu
C++ 1.45k
5 days ago

SafeAILab / EAGLE

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3.

large-language-models, llm-inference, speculative-decoding
Python 1.29k
14 days ago

Infini-AI-Lab / Sequoia

#Large Language Models# Scalable and robust tree-based speculative decoding algorithm

efficiency, inference, large language models, speculative-decoding
Python 346
5 months ago

facebookresearch / LayerSkip

#Large Language Models# Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024

large language models, optimization, transformers, speculative-decoding
Python 297
1 month ago

Infini-AI-Lab / TriForce

#Large Language Models# [COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

acceleration, large language models, long-context, speculative-decoding, llm-inference, efficiency, inference
Python 254
10 months ago

FasterDecoding / REST

REST: Retrieval-Based Speculative Decoding, NAACL 2024

llm-inference, retrieval, speculative-decoding
C 201
6 months ago

Infini-AI-Lab / UMbreLLa

LLM Inference on consumer devices

llm-inference, offloading, speculative-decoding
Python 115
3 months ago

kssteven418 / BigLittleDecoder

#Large Language Models# [NeurIPS'23] Speculative Decoding with Big Little Decoder

decoding, efficient-inference, large language models, speculative-decoding
Python 92
1 year ago

bigai-nlco / TokenSwift

[ICML 2025] | From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation

deepseek, inference, llm-inference, llm-serving, large language models, qwen, speculative-decoding, transformer
Python 90
1 month ago

romsto / Speculative-Decoding

#Large Language Models# Implementation of the paper "Fast Inference from Transformers via Speculative Decoding", Leviathan et al. 2023.

large language models, llm-inference, speculative-decoding
Python 57
6 months ago

hemingkx / SWIFT

[ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration

speculative-decoding
Python 48
4 months ago

hemingkx / SpecDec

Code for the paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)

speculative-decoding, non-autoregressive
Python 41
2 years ago

BaohaoLiao / RSD

[ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness.

efficiency, large-language-models, reasoning, speculative-decoding
Python 29
1 month ago

AutonomicPerfectionist / PipeInfer

#Large Language Models# PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation

inference, llamacpp, large language models, speculative-decoding
C++ 28
7 months ago

hyx1999 / SAM-Decoding

Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton

speculative-decoding
Python 26
4 months ago

mscheong01 / speculative_decoding.c

#Large Language Models# Minimal C implementation of speculative decoding based on llama2.c

artificial intelligence, C, llama2, large language models, speculative-decoding
C 22
1 year ago

jadohu / LANTERN

Official Implementation of LANTERN (ICLR'25) and LANTERN++ (ICLRW-SCOPE'25)

speculative-decoding
Python 13
3 months ago

ccs96307 / fast-llm-inference

Accelerating LLM inference with techniques like speculative decoding, quantization, and kernel fusion, focusing on implementing state-of-the-art research papers.

acceleration, inference-optimization, large-language-models, speculative-decoding
Jupyter Notebook 8
1 month ago

hsj576 / GRIFFIN

Official Implementation of "GRIFFIN: Effective Token Alignment for Faster Speculative Decoding"

large-language-models, llm-inference, speculative-decoding
Python 6
4 months ago