#LLM#*Hung-yi Lee Deep Learning Tutorial* (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
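Distiller drives compression from a YAML schedule wired into the training loop. Below is a minimal sketch of that documented scheduler pattern; the model, data, and `schedule.yaml` are toy stand-ins for this illustration.

```python
import distiller
import torch

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = torch.nn.CrossEntropyLoss()
train_loader = [(torch.randn(8, 10), torch.zeros(8, dtype=torch.long))] * 4

# Build a CompressionScheduler from a YAML schedule of pruners/regularizers.
# "schedule.yaml" is a hypothetical file for this sketch.
scheduler = distiller.file_config(model, optimizer, "schedule.yaml")

for epoch in range(2):
    scheduler.on_epoch_begin(epoch)
    for batch_id, (inputs, targets) in enumerate(train_loader):
        scheduler.on_minibatch_begin(epoch, batch_id, len(train_loader))
        loss = criterion(model(inputs), targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        scheduler.on_minibatch_end(epoch, batch_id, len(train_loader))
    scheduler.on_epoch_end(epoch)
```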
#NLP#Sparsity-aware deep learning inference runtime for CPUs
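This is Neural Magic's DeepSparse. A minimal sketch of its `Pipeline` API is below; the model path is a placeholder (a sparsified ONNX model or a SparseZoo `zoo:` stub would go there).

```python
from deepsparse import Pipeline

# "model.onnx" is a placeholder path; SparseZoo stubs ("zoo:...") also work.
pipeline = Pipeline.create(task="text-classification", model_path="model.onnx")
print(pipeline(["Sparsity-aware inference keeps CPUs busy and latency low."]))
```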
#LLM#[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.
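A minimal sketch of DepGraph-style grouped pruning, following the Torch-Pruning README: the dependency graph collects every layer coupled to the pruned channels so they are all removed consistently.

```python
import torch
import torch_pruning as tp
from torchvision.models import resnet18

model = resnet18()
example_inputs = torch.randn(1, 3, 224, 224)

# Trace the model once to build the layer dependency graph.
DG = tp.DependencyGraph().build_dependency(model, example_inputs=example_inputs)

# Group everything coupled to channels 2, 6, 9 of conv1, then prune the group.
group = DG.get_pruning_group(model.conv1, tp.prune_conv_out_channels, idxs=[2, 6, 9])
if DG.check_pruning_group(group):  # guard against removing a layer entirely
    group.prune()
print(model.conv1)  # out_channels shrinks from 64 to 61
```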
#Awesome#A curated list of neural network pruning resources.
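For orientation, the simplest baseline covered by lists like this one is magnitude pruning, available directly in PyTorch:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(128, 64)
# Zero out the 50% of weights with the smallest L1 magnitude.
prune.l1_unstructured(layer, name="weight", amount=0.5)
print(f"sparsity: {(layer.weight == 0).float().mean():.2f}")  # ~0.50
prune.remove(layer, "weight")  # bake the mask into the weight tensor
```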
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
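This entry describes Intel® Neural Compressor. A minimal sketch of its 2.x post-training quantization flow, assuming a PyTorch model and a calibration dataloader (both toy stand-ins here):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from neural_compressor import quantization
from neural_compressor.config import PostTrainingQuantConfig

model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU())
calib = DataLoader(
    TensorDataset(torch.randn(64, 16), torch.zeros(64, dtype=torch.long)),
    batch_size=8,
)

conf = PostTrainingQuantConfig(approach="static")  # static INT8 PTQ
q_model = quantization.fit(model=model, conf=conf, calib_dataloader=calib)
q_model.save("./int8_model")
```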
#Computer Science#AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
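A minimal sketch of AIMET's quantization simulation for PyTorch, assuming the `aimet_torch` package: `QuantizationSimModel` inserts fake-quant ops, and `compute_encodings` calibrates them with representative data.

```python
import torch
from aimet_torch.quantsim import QuantizationSimModel

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
dummy_input = torch.randn(1, 3, 32, 32)

sim = QuantizationSimModel(model, dummy_input=dummy_input)

def calibrate(sim_model, _):
    # Run representative data through the model to collect activation ranges.
    with torch.no_grad():
        sim_model(dummy_input)

sim.compute_encodings(forward_pass_callback=calibrate,
                      forward_pass_callback_args=None)
# sim.model is now a quantization-aware model usable for eval or fine-tuning.
```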
micronet: a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b) methods (DoReFa, "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference")…
#NLP#Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
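This is Neural Magic's SparseML. A minimal sketch of its documented recipe-driven workflow: a YAML recipe (hypothetical `recipe.yaml` here) declares pruning/quantization modifiers, and the manager wraps the optimizer so they fire during normal training.

```python
import torch
from sparseml.pytorch.optim import ScheduledModifierManager

model = torch.nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

manager = ScheduledModifierManager.from_yaml("recipe.yaml")  # hypothetical recipe
optimizer = manager.modify(model, optimizer, steps_per_epoch=100)

# ... run the usual training loop; the recipe applies sparsification ...

manager.finalize(model)  # remove hooks and make the sparsified model permanent
```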
Practical course about Large Language Models.
OpenMMLab Model Compression Toolbox and Benchmark.
PaddleSlim is an open-source library for deep model compression and architecture search.
#Computer Science#A toolkit for optimizing Keras and TensorFlow ML models for deployment, including quantization and pruning.
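A minimal sketch with the toolkit's Keras pruning API: wrap a model with `prune_low_magnitude`, ramp sparsity with a schedule, and train with the `UpdatePruningStep` callback.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(10),
])

# Ramp sparsity from 0% to 80% over 1000 steps.
schedule = tfmot.sparsity.keras.PolynomialDecay(
    initial_sparsity=0.0, final_sparsity=0.8, begin_step=0, end_step=1000)
pruned = tfmot.sparsity.keras.prune_low_magnitude(model, pruning_schedule=schedule)

pruned.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
# Training requires the pruning callback:
# pruned.fit(x, y, callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])
```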
Efficient computing methods developed by Huawei Noah's Ark Lab
#NLP#Neural Network Compression Framework (NNCF) for enhanced OpenVINO™ inference
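A minimal sketch of NNCF's post-training quantization entry point, assuming a recent NNCF release: wrap a calibration data source in `nncf.Dataset` and call `nncf.quantize`.

```python
import nncf
import torch
from torch.utils.data import DataLoader, TensorDataset

model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU()).eval()
loader = DataLoader(TensorDataset(torch.randn(128, 16)), batch_size=16)

# The transform function maps a dataset item to the model's input format.
calibration = nncf.Dataset(loader, lambda item: item[0])

quantized = nncf.quantize(model, calibration)  # 8-bit PTQ, OpenVINO-friendly
```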
#LLM#[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
#Computer Science#PyTorch implementation of [1611.06440] Pruning Convolutional Neural Networks for Resource Efficient Inference
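The repo implements the paper's first-order Taylor criterion. As a plain-PyTorch illustration (not the repo's own API), each conv channel is scored by |activation × gradient|, averaged over batch and spatial positions:

```python
import torch
import torch.nn as nn

def taylor_rank(activation: torch.Tensor) -> torch.Tensor:
    # First-order Taylor criterion from arXiv:1611.06440:
    # score each channel by |a * dL/da|, averaged over batch and space.
    return (activation * activation.grad).abs().mean(dim=(0, 2, 3))

conv = nn.Conv2d(3, 8, 3)
head = nn.Sequential(nn.ReLU(), nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                     nn.Linear(8, 10))

x = torch.randn(4, 3, 16, 16)
feats = conv(x)
feats.retain_grad()            # keep the gradient on this intermediate tensor
head(feats).sum().backward()

print(taylor_rank(feats))      # lowest-scoring filters are pruning candidates
```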
Pruning and distillation for mobilev2-yolov5s, with ncnn and TensorRT deployment support. Ultra-light but with better performance!
#Computer Science#TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.