GitHub 中文社区
Topic: moe
hiyouga / LLaMA-Factory

#LLM# Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

fine-tuning, language-model, llama, LLM, peft, transformers, rlhf, qlora, quantization, chatglm, qwen, instruction-tuning, mistral, gpt, lora, large-language-models, agent, AI, moe, llama3
Python · 52.29k stars
Updated 4 days ago
sgl-project / sglang

#LLM# SGLang is a fast serving framework for large language models and vision language models (a brief client example follows below).

CUDA, inference, llama, llava, LLM, llm-serving, moe, PyTorch, transformer, vlm, llama3, llama3-1, deepseek, deepseek-llm, deepseek-v3, deepseek-r1, deepseek-r1-zero, qwen3, llama4
Python · 15.14k stars
Updated 7 hours ago
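
Since this entry is about serving, a quick usage note: SGLang exposes an OpenAI-compatible HTTP API once a server is running, so a standard OpenAI client can be pointed at it. The sketch below is a hedged example that assumes a server has already been launched locally (for example with `python -m sglang.launch_server --model-path <model> --port 30000`) and that the `openai` Python package is installed; the port and the served model name are placeholders, so check the SGLang docs for the exact flags.

```python
from openai import OpenAI

# A locally launched SGLang server (see the note above) serves an
# OpenAI-compatible API; the port and model name below are placeholders.
client = OpenAI(base_url="http://127.0.0.1:30000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="default",  # assumption: the server exposes the launched model under this name
    messages=[{"role": "user", "content": "Explain mixture-of-experts in one sentence."}],
    temperature=0.2,
    max_tokens=64,
)
print(response.choices[0].message.content)
```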
czy0729 / Bangumi

#Android# :electron: An unofficial, UI-first https://bgm.tv app client for Android and iOS, built with React Native. An ad-free, hobby-driven, non-profit, ACG-focused anime-tracking app in the style of Douban, and a third-party bgm.tv client. Redesigned for mobile, it ships many enhanced features that are hard to achieve on the web version and offers extensive customization options. ...

React Native, mobx, iOS, React, Android, bangumi, design, expo, moe
TypeScript · 4.42k stars
Updated 7 days ago
PKU-YuanGroup / MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

large-vision-language-model, mixture-of-experts, moe, multi-modal
Python · 2.18k stars
Updated 6 months ago
MoonshotAI / MoBA

#LLM# MoBA: Mixture of Block Attention for Long-Context LLMs (a simplified block-attention sketch follows below).

flash-attention, LLM, llm-serving, llm-training, moe, PyTorch, transformer
Python · 1.8k stars
Updated 2 months ago
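
To make the "mixture of block attention" idea in this entry concrete: keys and values are grouped into blocks, a gate scores the blocks for each query, and attention is computed only over the few selected blocks. The sketch below is a deliberately simplified illustration under those assumptions (fixed block size, mean-pooled block keys, non-causal top-k selection); it is not MoonshotAI's actual kernels or algorithm.

```python
import torch
import torch.nn.functional as F

def block_sparse_attention(q, k, v, block_size=64, top_k=2):
    """Each query attends only to its top_k key/value blocks (illustrative only)."""
    # q, k, v: (seq_len, d); seq_len is assumed divisible by block_size here.
    seq_len, d = k.shape
    n_blocks = seq_len // block_size
    k_blocks = k.view(n_blocks, block_size, d)
    v_blocks = v.view(n_blocks, block_size, d)
    block_keys = k_blocks.mean(dim=1)                      # one summary key per block
    gate = q @ block_keys.T                                # (seq_len, n_blocks) block scores
    sel = gate.topk(min(top_k, n_blocks), dim=-1).indices  # blocks chosen per query
    out = torch.empty_like(q)
    for i in range(seq_len):                               # naive per-query loop for clarity
        ks = k_blocks[sel[i]].reshape(-1, d)               # gather only the selected blocks
        vs = v_blocks[sel[i]].reshape(-1, d)
        attn = F.softmax(q[i] @ ks.T / d ** 0.5, dim=-1)
        out[i] = attn @ vs
    return out

q = k = v = torch.randn(256, 32)
print(block_sparse_attention(q, k, v).shape)  # torch.Size([256, 32])
```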
davidmrau / mixture-of-experts

PyTorch re-implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. (https://arxiv.org/abs/1701.06538); a minimal top-k gating sketch follows below.

moe, mixture-of-experts, PyTorch
Python · 1.12k stars
Updated 1 year ago
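
For readers new to the technique this repository re-implements, a minimal top-k sparsely-gated MoE layer looks roughly like the following. This is an illustrative sketch in the spirit of Shazeer et al. (2017), not code from the repository; the noisy gating and load-balancing auxiliary loss from the paper are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal top-k sparsely-gated mixture-of-experts layer (illustrative only)."""

    def __init__(self, d_model: int, d_hidden: int, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        logits = self.gate(x)                             # (tokens, experts)
        topk_val, topk_idx = logits.topk(self.k, dim=-1)  # keep the k best experts per token
        weights = F.softmax(topk_val, dim=-1)             # renormalize over the kept experts
        out = torch.zeros_like(x)
        for slot in range(self.k):                        # dispatch tokens expert by expert
            idx = topk_idx[:, slot]
            for e, expert in enumerate(self.experts):
                mask = idx == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Route 4 tokens of width 16 through 8 experts, 2 experts per token.
layer = TopKMoE(d_model=16, d_hidden=32)
print(layer(torch.randn(4, 16)).shape)  # torch.Size([4, 16])
```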
pjlab-sys4nlp / llama-moe

#LLM# ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)

llama, LLM, mixture-of-experts, moe
Python · 967 stars
Updated 6 months ago
microsoft / Tutel

#LLM# Tutel MoE: an optimized Mixture-of-Experts library; supports DeepSeek FP8/FP4.

PyTorch, moe, mixture-of-experts, deepseek, LLM
C · 839 stars
Updated 8 days ago
sail-sg / Adan

#Computer Science# Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

bert-model, convnext, deep-learning, fairseq, optimizer, resnet, timm, vit, transformer-xl, AI, diffusion, dreamfusion, gpt2, PyTorch, cuda-programming, llm-training, LLM, moe
Python · 792 stars
Updated 7 days ago
open-compass / MixtralKit

#LLM# A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI.

LLM, mistral, moe
Python · 767 stars
Updated 2 years ago
ScienceOne-AI / DeepSeek-671B-SFT-Guide

#LLM# An open-source solution for full-parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as practical experience and conclusions. (D...

deepseek-r1, LLM, moe, sft, Python
Python · 699 stars
Updated 3 months ago
ymcui / Chinese-Mixtral

#NLP# Chinese Mixtral mixture-of-experts large models (Chinese Mixtral MoE LLMs)

large-language-models, LLM, mixtral, mixture-of-experts, moe, NLP
Python · 605 stars
Updated 1 year ago
mindspore-courses / step_into_llm

#NLP# MindSpore online courses: Step into LLM

LLM, NLP, large-language-models, bert, ChatGPT, gpt, gpt2, instruction-tuning, parallel-computing, prompt-tuning, rlhf, chatglm, chatglm2, llama, llama2, moe, peft
Jupyter Notebook · 469 stars
Updated 5 months ago
kokororin / pixiv.moe

😘 A Pinterest-style layout site that shows illustrations from pixiv.net, ordered by popularity.

lovelive, Pixiv, React, moe, Redux, comic, comics, TypeScript, Website, Web app
TypeScript · 364 stars
Updated 2 years ago
LISTEN-moe / android-app

#Android# Official LISTEN.moe Android app

Android, music-player, japan, Anime, music, moe, android-auto, Kotlin
Kotlin · 263 stars
Updated 5 days ago
SkyworkAI / MoH

MoH: Multi-Head Attention as Mixture-of-Head Attention

attention, dit, LLM, mixture-of-experts, moe, transformer, vit
Python · 251 stars
Updated 8 months ago
inferflow / inferflow

Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).

llama2, llamacpp, llm-inference, model-quantization, multi-gpu-inference, mixture-of-experts, moe, gemma, falcon, minicpm, mistral, bloom, deepseek, internlm, baichuan2, mixtral, qwen
C++ · 242 stars
Updated 1 year ago
libgdx / gdx-pay

#Android# A libGDX cross-platform API for in-app purchasing.

Android, iOS, iap, in-app-purchase, moe, Java, libgdx
Java · 229 stars
Updated 5 months ago
SkyworkAI / MoE-plus-plus

[ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts (an illustrative sketch of the idea follows below).

large-language-models, LLM, mixture-of-experts, moe
Python · 224 stars
Updated 8 months ago
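
The "zero-computation experts" in this entry's title can be pictured as experts that cost essentially nothing to evaluate (return zeros, copy the input, or emit a learned constant), mixed into the expert pool alongside ordinary FFN experts so the router can spend real compute only on tokens that need it. The sketch below is a toy rendering of that idea under those assumptions; it is not the paper's architecture or the repository's code.

```python
import torch
import torch.nn as nn

class ZeroExpert(nn.Module):
    def forward(self, x):          # contributes nothing
        return torch.zeros_like(x)

class CopyExpert(nn.Module):
    def forward(self, x):          # passes the token through unchanged
        return x

class ConstExpert(nn.Module):
    def __init__(self, d_model):
        super().__init__()
        self.c = nn.Parameter(torch.zeros(d_model))

    def forward(self, x):          # emits a learned constant vector
        return self.c.expand_as(x)

class ZeroComputeMoE(nn.Module):
    """FFN experts plus zero-computation experts behind a top-1 router (toy sketch)."""

    def __init__(self, d_model=16, d_hidden=32, num_ffn_experts=4):
        super().__init__()
        def ffn():
            return nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
        self.experts = nn.ModuleList(
            [ffn() for _ in range(num_ffn_experts)] + [ZeroExpert(), CopyExpert(), ConstExpert(d_model)]
        )
        self.gate = nn.Linear(d_model, len(self.experts), bias=False)

    def forward(self, x):
        idx = self.gate(x).argmax(dim=-1)        # top-1 routing: one expert per token
        out = torch.empty_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                out[mask] = expert(x[mask])
        return out

layer = ZeroComputeMoE()
print(layer(torch.randn(4, 16)).shape)  # torch.Size([4, 16])
```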
IBM / ModuleFormer

ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language M...

lm, moe
Python · 222 stars
Updated 1 year ago