# post-training-quantization

intel

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python 2.49 k
4 hours ago

666DZY666

micronet, a model compression and deployment library. Compression: 1. quantization: quantization-aware training (QAT), High-Bit (>2b) (DoReFa/Quantization and Training of Neural Networks for Efficient Integer-Ari...

Python 2.26 k
4 months ago

megvii-research

[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer

Python 347
2 years ago

sayakpaul
Jupyter Notebook 174
3 years ago

Hsu1023

#Large Language Models# [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.

Python 166
1 year ago

ModelTC

[CVPR 2024 Highlight & TPAMI 2025] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".

Jupyter Notebook 103
2 months ago

ModelTC

#Large Language Models# [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"

Python 39
2 years ago

Sanjana7395
Jupyter Notebook 37
5 years ago

zysxmu

PyTorch implementation of our paper accepted by ECCV 2022: Fine-grained Data Distribution Alignment for Post-Training Quantization

Python 15
3 years ago

iszry

Improved the performance of 8-bit PTQ4DM, especially on FID.

Python 12
2 years ago

shieldforever

[ASP-DAC 2025] "NeuronQuant: Accurate and Efficient Post-Training Quantization for Spiking Neural Networks" Official Implementation

Python 11
6 months ago

smpanaro

Post post-training-quantization (PTQ) method for improving LLMs. Unofficial implementation of https://arxiv.org/abs/2309.02784

Python 7
2 years ago

GongCheng1919

[CAAI AIR'24] Minimize Quantization Output Error with Bias Compensation

Python 7
6 months ago