GitHub Chinese Community

©2025 GitHub Chinese Community


nccl

Website
Wikipedia
cupy / cupy

NumPy & SciPy for GPU

CUDA, cudnn, cublas, cusolver, nccl, Python, NumPy, cupy, curand, cusparse, gpu, SciPy, tensor, rocm
Python 10.28k
2 days ago
coreylowman / cudarc

Safe Rust wrapper around the CUDA toolkit

CUDA, cuda-programming, gpu, gpu-acceleration, Rust, cublas, curand, cuda-kernels, cudnn, nccl
Rust 857
1 month ago
huggingface / llm_training_handbook

#NLP# An open collection of methodologies to help with successful training of large language models.

CUDA, large-language-models, llm, nccl, nlp, performance, Python, PyTorch, scalability, troubleshooting
Python 493
1 year ago
huggingface / large_language_model_training_playbook

#NLP# An open collection of implementation tips, tricks and resources for training large language models

CUDA, llm, nccl, nlp, performance, Python, PyTorch, scalability, troubleshooting, large-language-models
Python 475
2 years ago
LambdaLabsML / distributed-training-guide

Best practices & guides on how to write distributed PyTorch training code

CUDA, deepspeed, distributed-training, gpu, gpu-cluster, kuberentes, nccl, PyTorch, slurm, cluster, mpi, sharding
Python 435
4 months ago
FZJ-JSC / tutorial-multi-gpu

Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial

hpc, mpi, gpu, nccl, CUDA
Cuda 269
5 days ago
Bluefog-Lib / bluefog

#Computer Science# Distributed and decentralized training framework for PyTorch over graph

mpi, distributed-computing, deep-learning, machine-learning, PyTorch, decentralized, asynchronous, one-sided, nccl
Python 256
1 year ago
microsoft / msrflute

#Computer Science# Federated Learning Utilities and Tools for Experimentation

federated-learning, gloo, machine-learning, nccl, personalization, privacy-tools, PyTorch, Simulation
Python 190
1 year ago
google / nccl-fastsocket

#Computer Science# NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud.

nccl, training, machine-learning
C++ 116
2 years ago
muriloboratto / NCCL

Sample code showing how to call collective operation functions in multi-GPU environments: simple examples of the broadcast, reduce, allGather, reduceScatter and sendRecv operations.

nccl, CUDA, mpi
34
2 years ago
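For readers unfamiliar with what these collectives actually compute, here is a minimal pure-Python sketch of their semantics. This is not NCCL itself (NCCL runs these operations on GPU buffers via calls like ncclBroadcast and ncclAllGather); the function names and list-of-lists "rank" representation below are purely illustrative.

```python
# Each "rank" holds a list of numbers; bufs[r] is rank r's buffer.
# These mimic the semantics of the NCCL collectives, not their
# GPU implementation (names here are illustrative, not NCCL's API).

def broadcast(bufs, root):
    # Every rank ends up with a copy of the root's buffer.
    return [list(bufs[root]) for _ in bufs]

def reduce(bufs, root):
    # Element-wise sum of all buffers, delivered to the root only;
    # other ranks keep their original data.
    summed = [sum(vals) for vals in zip(*bufs)]
    return [summed if r == root else list(b) for r, b in enumerate(bufs)]

def all_gather(bufs):
    # Every rank receives the concatenation of all buffers in rank order.
    gathered = [x for b in bufs for x in b]
    return [list(gathered) for _ in bufs]

def reduce_scatter(bufs):
    # Element-wise sum, then scatter: rank r keeps chunk r of the result.
    n = len(bufs)
    summed = [sum(vals) for vals in zip(*bufs)]
    chunk = len(summed) // n
    return [summed[r * chunk:(r + 1) * chunk] for r in range(n)]

ranks = [[1, 2], [3, 4]]
print(broadcast(ranks, root=0))   # [[1, 2], [1, 2]]
print(reduce(ranks, root=0))      # [[4, 6], [3, 4]]
print(all_gather(ranks))          # [[1, 2, 3, 4], [1, 2, 3, 4]]
print(reduce_scatter(ranks))      # [[4], [6]]
```

Note that allReduce (the workhorse of data-parallel training) is equivalent to reduce_scatter followed by all_gather, which is how ring-based implementations structure it.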
JuliaGPU / NCCL.jl

A Julia wrapper for the NVIDIA Collective Communications Library.

Julia, CUDA, gpu, nccl
Julia 27
10 months ago
openhackathons-org / nways_multi_gpu

N-Ways to Multi-GPU Programming

CUDA, hpc, mpi, nccl
C 25
2 years ago
lanl / pyDNMFk

#Computer Science# Python distributed non-negative matrix factorization with custom clustering

distributed-computing, hpc, cupy, machine-learning, nccl, outofmemory, Python
Python 24
2 years ago
1duo / nccl-examples

#Computer Science# NCCL examples from the official NVIDIA NCCL Developer Guide.

Nvidia, nccl, deep-learning, distributed-systems
CMake 17
7 years ago
BaguaSys / bagua-net

High-performance NCCL plugin for Bagua.

nccl, distributed-computing
Rust 15
4 years ago
YinLiu-91 / ncclOperationPlus

Uses ncclSend and ncclRecv to implement ncclSendrecv, ncclGather, ncclScatter, and ncclAlltoall

nccl, CUDA, mpi, C++
Cuda 8
3 years ago
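The idea behind this repository, composing richer collectives out of point-to-point send/recv pairs, can be sketched in plain Python. This is only a semantics illustration under assumed names (real NCCL code issues the matching ncclSend/ncclRecv calls on GPU buffers, grouped between ncclGroupStart() and ncclGroupEnd()):

```python
# Composing alltoall from point-to-point send/recv pairs, mirroring
# how ncclAlltoall can be built from ncclSend/ncclRecv.
# (Illustrative only; names and data layout are assumptions.)

def alltoall(bufs):
    # bufs[src][dst] is the chunk rank src sends to rank dst.
    # Each (src, dst) pair corresponds to one matched send/recv,
    # so rank dst ends up with [bufs[0][dst], bufs[1][dst], ...].
    n = len(bufs)
    out = [[None] * n for _ in range(n)]
    for src in range(n):
        for dst in range(n):
            out[dst][src] = bufs[src][dst]  # one send/recv pair
    return out

chunks = [["a0", "a1"], ["b0", "b1"]]
print(alltoall(chunks))  # [['a0', 'b0'], ['a1', 'b1']]
```

Gather and scatter decompose the same way: a gather is the set of sends from every rank to one root, and a scatter is the reverse.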
UCBerkeley-Spring2022-CS267-project / blinkplus

Blink+: increase GPU group bandwidth by utilizing cross-tenant NVLink.

nccl, gpu
Jupyter Notebook 6
3 years ago
YconquestY / nccl

Summary of call graphs and data structures of the NVIDIA Collective Communication Library (NCCL)

computer-network, nccl
D2 6
10 months ago
lancelee82 / pynccl

Nvidia NCCL2 Python bindings using ctypes and numba.

nccl, numba, Python
Python 5
4 years ago
asprenger / distributed-training-patterns

Experiments with low-level communication patterns that are useful for distributed training.

mpi, nccl, horovod, Tensorflow, distributed-training
Python 5
7 years ago