# fp8

https://static.github-zh.com/github_avatars/NVIDIA?size=40

#Computer Science# A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory ut...

Python 2.75 k
2 days ago
https://static.github-zh.com/github_avatars/Azure?size=40
Python 621
1 year ago
https://static.github-zh.com/github_avatars/intel?size=40

An innovative library for efficient LLM inference via low-bit quantization

C++ 348
1 year ago
https://static.github-zh.com/github_avatars/aredden?size=40

Flux diffusion model implementation that uses quantized FP8 matmul, with the remaining layers using faster half-precision accumulation; ~2x faster on consumer devices.

Python 280
1 year ago
https://static.github-zh.com/github_avatars/graphcore-research?size=40
Python 16
1 year ago
https://static.github-zh.com/github_avatars/MurrellGroup?size=40

Narrow precision floating point types

Julia 5
10 days ago
https://static.github-zh.com/github_avatars/zerfoo?size=40

#Computer Science# A modular, accelerator-ready machine learning framework built in Go that speaks float8/16/32/64. Designed with clean architecture, strong typing, and native concurrency for scalable, production-ready ...

Go 4
1 month ago
https://static.github-zh.com/github_avatars/zsxkib?size=40

Cog Single GPU Quantized Implementation of Step-Video-T2V

Python 1
7 months ago
https://static.github-zh.com/github_avatars/mukullokhande99?size=40

Python implementations for multi-precision quantization in computer vision and sensor fusion workloads, targeting the XR-NPE Mixed-Precision SIMD Neural Processing Engine. The code includes visual ine...

Jupyter Notebook 1
1 month ago
https://static.github-zh.com/github_avatars/umangyadav?size=40

FP8 dtypes enumeration in Python

C++ 0
2 years ago
https://static.github-zh.com/github_avatars/STX-AI?size=40
Python 0
22 days ago