cuda-kernels

NVIDIA / cuda-samples

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

CUDA, cuda-kernels, cuda-opengl, cuda-driver-api
C 7.59 k
24 days ago
InternLM / lmdeploy

#Large language models# LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

cuda-kernels, deepspeed, fastertransformer, llm-inference, turbomind, internlm, llama, large language models, codellama, llama2, llama3
Python 6.52 k
2 days ago
xlite-dev / LeetCUDA

📚LeetCUDA: 200+ CUDA/Tensor Cores Kernels, HGEMM, FA-2 MMA.

CUDA, cuda-kernels, flash-attention, cuda-library, cuda-cpp
Cuda 4.76 k
5 days ago
Rust-GPU / Rust-CUDA

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

CUDA, cuda-kernels, cuda-programming, gpgpu, gpu, gpu-programming, Rust
Rust 4.45 k
20 days ago
coreylowman / dfdx

#Computer science# Deep learning in Rust, with shape-checked tensors and neural networks

Rust, autograd, autodiff, machine learning, neural networks, backpropagation, tensor, deep learning, deep neural networks, CUDA, cuda-kernels, gpu, gpu-acceleration, gpu-computing, cudnn
Rust 1.82 k
1 year ago
NVIDIA / cccl

CUDA Core Compute Libraries

accelerated-computing, C++, cpp-programming, CUDA, cuda-cpp, cuda-kernels, cuda-library, cuda-programming, gpu, gpu-acceleration, gpu-computing, gpu-programming, hpc, Nvidia, parallel-algorithm, parallel-computing, parallel-programming, modern-cpp
C++ 1.69 k
2 days ago
coreylowman / cudarc

Safe Rust wrapper around the CUDA toolkit

CUDA, cuda-programming, gpu, gpu-acceleration, Rust, cublas, curand, cuda-kernels, cudnn, nccl
Rust 857
1 month ago
NVIDIA / nvbench

CUDA Kernel Benchmarking Library

benchmark, cuda-kernels, CUDA, performance, Nvidia, gpu
Cuda 664
2 days ago
harrism / hemi

Simple utilities to enable code reuse and portability between CUDA C/C++ and standard C/C++.

CUDA, gpu, cuda-kernels, C++
C++ 347
3 years ago
KernelTuner / kernel_tuner

#Computer science# Kernel Tuner

cuda-kernels, Python, gpu, CUDA, opencl, C, C++, auto-tuning, gpu-computing, Testing, software engineering, optimization, machine learning
Python 344
3 days ago
jaredhoberock / stanford-cs193g-sp2010

An archive of materials produced for an introductory class on CUDA programming at Stanford University in 2010

CUDA, cuda-kernels, cuda-programming, gpu-programming
C++ 219
3 years ago
HMUNACHI / cuda-tutorials

#Computer science# CUDA tutorials for maths and ML with examples, covering multi-GPU, fused attention, Winograd convolution, and reinforcement learning.

CUDA, cuda-kernels, cuda-programming, machine learning, maths
Cuda 182
4 days ago
deepakkumar1984 / Amplifier.NET

Amplifier allows .NET developers to easily run complex applications with intensive mathematical computation on Intel CPU/GPU, NVIDIA, AMD without writing any additional C kernel code. Write your funct...

opencl, cuda-kernels, compiler, gpgpu, gpgpu-computing, simd
C# 180
2 months ago
PatWie / cuda-design-patterns

Some CUDA design patterns and a bit of template magic for CUDA

CUDA, C++, template-metaprogramming, cuda-kernels, gpu, bazel
C++ 154
2 years ago
tudelft / cuSNN

Spiking Neural Networks in C++ with strong GPU acceleration through CUDA

CUDA, cuda-kernels, neural networks
Cuda 129
5 years ago
alexzhang13 / flashattention2-custom-mask

#Computer science# Triton implementation of FlashAttention2 that adds custom masks.

attention, attention-mechanism, cuda-kernels, deep learning, flash-attention, triton
Python 119
10 months ago
eyalroz / cuda-kat

#Algorithm practice# CUDA kernel author's tools

CUDA, cuda-kernels, utility-library, C++, constexpr, algorithms, patterns, modern-cpp, gpu-programming, gpu, cuda-library, cuda-programming, printf
Cuda 111
3 years ago
wangsiping97 / FastGEMV

#Computer science# High-speed GEMV kernels, with up to 2.7x speedup compared to the PyTorch baseline.

CUDA, cuda-kernels, machine learning, optimization
Cuda 109
1 year ago
microsoft / Accera

Open source cross-platform compiler for compute-intensive loops used in AI algorithms, from Microsoft Research

cross-platform, python-library, research, machine learning, gpu-acceleration, tuning-parameters, cross-compiler, compiler, cuda-kernels
C++ 109
2 years ago
microsoft / TileFusion

TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.

C++, cuda-kernels
Cuda 88
9 days ago