# speculative-decoding

intel

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel platforms ⚡

Python 2.17k
1 year ago
SafeAILab

Official implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3.

Python 1.78k
7 days ago
Infini-AI-Lab
Python 358
8 months ago
facebookresearch

#Large language models# Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024

Python 335
4 months ago
Infini-AI-Lab

#Large language models# [COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Python 262
1 year ago
FasterDecoding

REST: Retrieval-Based Speculative Decoding, NAACL 2024

C 208
5 days ago
Infini-AI-Lab
Python 124
6 months ago
Tencent

#Large language models# Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.

Python 123
1 day ago
bigai-nlco

[ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation

Python 113
4 months ago
kssteven418
Python 94
2 years ago
romsto

#Large language models# Implementation of the paper "Fast Inference from Transformers via Speculative Decoding", Leviathan et al. 2023.

Python 78
9 months ago
hemingkx

[ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration

Python 55
7 months ago
BaohaoLiao

[ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness.

Python 46
4 months ago
hemingkx

Code for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)

Python 44
2 years ago
vladislavkruglikov

#Natural language processing# Clean, simple-to-use implementation of the speculative decoding algorithm EAGLE (Extrapolation Algorithm for Greater Language-model Efficiency) 🦅

Python 36
2 months ago
hyx1999

Official implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton

Python 32
7 months ago
AutonomicPerfectionist

#Large language models# PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation

C++ 30
10 months ago
mscheong01
C 25
1 year ago
jadohu

Official implementation of LANTERN (ICLR'25) and LANTERN++ (ICLRW-SCOPE'25)

Python 17
6 months ago