GitHub Chinese Community

int8

intel / neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

low-precision, pruning, sparsity, auto-tuning, knowledge-distillation, quantization, quantization-aware-training, post-training-quantization, smoothquant, large-language-models, gptq, int8
Python 2.43k
3 days ago

intel / neural-speed

An innovative library for efficient LLM inference via low-bit quantization

cpu, fp8, gpu, int8, llm-inference, sparsity, llamacpp
C++ 349
10 months ago

clancylian / retinaface

Reimplementation of RetinaFace using C++ and TensorRT

retinaface, tensorrt, int8, caffe
C++ 297
6 years ago

Wulingtian / yolov5_tensorrt_int8_tools

TensorRT INT8 quantization of a YOLOv5 ONNX model

yolov5, tensorrt, onnx, int8
Python 183
4 years ago

Wulingtian / yolov5_tensorrt_int8

TensorRT INT8 quantized deployment of the YOLOv5s model; measured at 3.3 ms per frame!

yolov5, tensorrt, int8
C++ 168
4 years ago

xuanandsix / Tensorrt-int8-quantization-pipline

A simple pipeline for INT8 quantization based on TensorRT.

int8, quantization, tensorrt, yolox
Python 64
3 years ago

Wulingtian / RepVGG_TensorRT_int8

RepVGG TensorRT INT8 quantization; measured inference under 1 ms per frame!

repvgg, tensorrt, int8
Python 63
4 years ago

the0807 / YOLOv8-ONNX-TensorRT

👀 Apply YOLOv8 exported to ONNX or TensorRT (FP16, INT8) to a real-time camera

onnx, tensorrt, yolov8, int8, computer-vision, object-detection
Python 49
1 year ago

Wulingtian / nanodet_tensorrt_int8

NanoDet INT8 quantization; measured inference at 2 ms per frame!

nanodet, tensorrt, int8
C++ 37
4 years ago

ppogg / ncnn-yolov4-int8

NCNN + INT8 + YOLOv4 model quantization and real-time inference

ncnn, yolov4, int8, real-time
C++ 24
4 years ago

whitelok / tensorrt-int8-python-sample

#Computer Science# TensorRT Int8 Python version sample.

tensorrt, artificial-intelligence, deep-learning, Nvidia, inference, Python, int8, machine-learning
Python 14
6 years ago

Egorundel / int8_calibrator_cpp

INT8 calibrator for ONNX model with dynamic batch_size at the input and NMS module at the output. C++ Implementation.

calibration, C++, int8, tensorrt, onnx
C++ 13
8 months ago

aahouzi / llama2-chatbot-cpu

#Large Language Models# A LLaMA2-7b chatbot with memory running on CPU, optimized using smooth quantization, 4-bit quantization, or Intel® Extension for PyTorch with bfloat16.

bfloat16, cpu, int8, langchain, llama2, optimization, Streamlit, chatbot, huggingface, ChatGPT, llama, intel, meta
Python 12
1 year ago

cbalint13 / rvv-kernels

RISCV Vector Kernel C/LLVM-IR generator

int8, Kernel, LLVM, math, RISC-V, tvm, vector
C 7
6 months ago

dasdristanta13 / LLM-Lora-PEFT_accumulate

#Large Language Models# LLM-Lora-PEFT_accumulate explores optimizations for Large Language Models (LLMs) using PEFT, LoRA, and QLoRA. Contribute experiments and implementations to enhance LLM efficiency. Join discussions and...

alpaca, int8, large-language-models, lora, peft, qlora, falcon, llama
Jupyter Notebook 6
2 years ago

egbertYeah / mt-yolov6_tensorrt

MT-Yolov6 TensorRT Inference with Python.

tensorrt, int8, yolov6
Python 6
3 years ago

yester31 / Quantization_EX

Quantization example for PTQ & QAT

quantization-aware-training, quantization, tensorrt, post-training-quantization, int8
Python 6
2 years ago

JohnClaw / chatllm.cs

#Large Language Models# C# API wrapper for llm-inference chatllm.cpp

API, bindings, chatllm, C#, gemma, ggml, int8, llama, llm-inference, mistral, qwen, cpu-inference, inference, large-language-models
C# 4
7 months ago

JohnClaw / chatllm.vb

VB.NET API wrapper for llm-inference chatllm.cpp

API, bindings, chatllm, cpu-inference, gemma, ggml, int8, llama, llm-inference, mistral, qwen
Visual Basic .NET 4
7 months ago

stdlib-js / constants-int8

8-bit signed integer mathematical constants.

Node.js, JavaScript, stdlib, math, standard, Library, constants, namespace, int8, 8-bit, 8bit, byte
JavaScript 2
3 months ago