kernel-fusion · GitHub Topics

#Computer Science# burn is a deep learning framework for Rust

autodiff deep-learning machine-learning Rust scientific-computing ndarray tensor neural-network PyTorch cross-platform kernel-fusion onnx WebAssembly webgpu CUDA metal rocm vulkan

Rust 12.88 k

6 hours ago

chhzh123 / Krill

An efficient concurrent graph processing system

graph system kernel-fusion

C++ 46

4 years ago

wu-kan / GoPTX

GoPTX: Fine-grained GPU Kernel Fusion by PTX-level Instruction Flow Weaving

compile gpu Web Monetization kernel-fusion ptx

HTML 18

2 months ago

nopperl / pytorch-fused-lamb

LAMB go brrr

CUDA kernel-fusion lamb optimizer PyTorch triton

Python 4

1 year ago

ParCoreLab / gpu-fusion

GPU fusion code and algorithm

CUDA gpu kernel-fusion

Cuda 1

1 year ago

ShkalikovOleh / alpaka_expr_trees

Compile-time kernel fusion and expression trees as an Alpaka boost.odeint backend. This is my team project developed in collaboration with and under the supervision of HZDR.

CUDA kernel-fusion

C++ 1

2 years ago

JonSnow1807 / Fused-LayerNorm-CUDA-Operator

#Computer Science# High-performance CUDA implementation of LayerNorm for PyTorch achieving 1.46x speedup through kernel fusion. Optimized for large language models (4K-8K hidden dims) with vectorized memory access, warp...

CUDA deep-learning kernel-fusion PyTorch

Python 0

1 month ago