speculative-decoding

intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

large language model, chatbot, 4-bits, llm-inference, llm-cpu, chatpdf, streamingllm, intel-optimized-llamacpp, speculative-decoding, habana, rag, retrieval
Python 2.17k
10 months ago

aphrodite-engine / aphrodite-engine

#Computer science# Large-scale LLM inference engine

API, inference-engine, machine learning, CUDA, inferentia, rocm, intel, lora, speculative-decoding, tpu
C++ 1.49k
2 days ago

SafeAILab / EAGLE

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3.

large-language-models, llm-inference, speculative-decoding
Python 1.44k
3 days ago

Infini-AI-Lab / Sequoia

#Large language model# Scalable and robust tree-based speculative decoding algorithm

efficiency, inference, large language model, speculative-decoding
Python 351
6 months ago

facebookresearch / LayerSkip

#Large language model# Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024

large language model, optimization, transformers, speculative-decoding
Python 323
3 months ago

Infini-AI-Lab / TriForce

#Large language model# [COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

acceleration, large language model, long-context, speculative-decoding, llm-inference, efficiency, inference
Python 261
1 year ago

FasterDecoding / REST

REST: Retrieval-Based Speculative Decoding, NAACL 2024

llm-inference, retrieval, speculative-decoding
C 205
8 months ago

Infini-AI-Lab / UMbreLLa

LLM Inference on consumer devices

llm-inference, offloading, speculative-decoding
Python 123
4 months ago

bigai-nlco / TokenSwift

[ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation

deepseek, inference, llm-inference, llm-serving, large language model, qwen, speculative-decoding, transformer
Python 110
2 months ago

kssteven418 / BigLittleDecoder

#Large language model# [NeurIPS'23] Speculative Decoding with Big Little Decoder

decoding, efficient-inference, large language model, speculative-decoding
Python 93
1 year ago

romsto / Speculative-Decoding

#Large language model# Implementation of the paper "Fast Inference from Transformers via Speculative Decoding" (Leviathan et al., 2023).

large language model, llm-inference, speculative-decoding
Python 72
8 months ago

hemingkx / SWIFT

[ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration

speculative-decoding
Python 52
5 months ago

hemingkx / SpecDec

Codes for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)

speculative-decoding, non-autoregressive
Python 42
2 years ago

BaohaoLiao / RSD

[ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness.

efficiency, large-language-models, reasoning, speculative-decoding
Python 40
3 months ago

vladislavkruglikov / eagle

#Natural language processing# Pretty and simple-to-use implementation of the EAGLE speculative decoding algorithm, an extrapolation algorithm for greater language model efficiency 🦅

large language model, llm-inference, natural language processing, speculative-decoding
Python 36
17 days ago

Tencent / AngelSlim

#Large language model# Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.

large language model, quantization, speculative-decoding, diffusion, vlm
Python 31
4 days ago

AutonomicPerfectionist / PipeInfer

#Large language model# PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation

inference, llamacpp, large language model, speculative-decoding
C++ 30
9 months ago

hyx1999 / SAM-Decoding

Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton

speculative-decoding
Python 29
6 months ago

mscheong01 / speculative_decoding.c

#Large language model# Minimal C implementation of speculative decoding based on llama2.c

artificial intelligence, C, llama2, large language model, speculative-decoding
C 24
1 year ago

jadohu / LANTERN

Official Implementation of LANTERN (ICLR'25) and LANTERN++(ICLRW-SCOPE'25)

speculative-decoding
Python 16
5 months ago