#Large Language Model# Use PEFT or full-parameter training to run CPT/SFT/DPO/GRPO on 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, GLM4, Mistral, Yi1.5, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, Int...
Megatron was a Telegram file-management bot that helped many users, especially movie channel managers, upload their files to Telegram simply by providing a link. The project initially started...
Large-scale 4D-parallelism pre-training for 🤗 transformers with Mixture of Experts *(still work in progress)*
#Large Language Model# A LLaMA1/LLaMA2 Megatron implementation.
#Computer Science# Tiny-Megatron, a minimalistic re-implementation of the Megatron library
#Large Language Model# Wrapped Megatron: As User-Friendly as HuggingFace, As Powerful as Megatron-LM
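
Several of the entries above revolve around parameter-efficient fine-tuning (PEFT) of large language models. As a rough illustration of what a LoRA-based SFT run can look like with the Hugging Face `transformers`, `peft`, and `datasets` libraries (the model name, dataset path, and hyperparameters below are placeholders, not taken from any of the listed projects):

```python
# Minimal LoRA SFT sketch (illustrative; not the API of any repo listed above).
# Assumes transformers, peft, and datasets are installed and train.jsonl has a "text" field.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "Qwen/Qwen2.5-0.5B"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Attach LoRA adapters: the base weights stay frozen, only small low-rank matrices train.
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Tokenize a small instruction-style dataset for causal-LM training.
dataset = load_dataset("json", data_files="train.jsonl", split="train")
def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=512)
dataset = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=1,
                           num_train_epochs=1, logging_steps=10),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Because LoRA trains only the low-rank adapter weights, this style of fine-tuning fits on far more modest hardware than full-parameter training, which is the trade-off these toolkits expose.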