PyTorch native quantization and sparsity for training and inference
A library written in C for converting between float8 (8-bit minifloat numbers) and float32 (single-precision floating-point numbers) formats.
Official code for the paper "ELMO: Efficiency via Low-precision and Peak Memory Optimization in Large Output Spaces" (ICML 2025)
A modular, accelerator-ready ML framework that speaks float8/16/32/64, imports models via ONNX, and trains Transformer-class networks entirely in Go.