rft · GitHub Topics

#Large language model# Use PEFT or full-parameter training to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, P...

large-language-model lora llama sft deploy multimodal peft internvl liger qwen2-vl rft deepseek-r1 embedding grpo open-r1 megatron omni llama4 qwen3 qwen3-moe

Python 9.84 k

2 days ago

dafyddg / RFA

Implementation of the Rhythm Formant Analysis methodology for identifying speech rhythms and rhythm variation in the low frequency spectrum and spectrogram.

rft

Python 16

2 years ago

LifeCoachRay / My-Pocket-Token-Foundation

#Blockchain# The My Pocket Token Foundation will make the blockchain better by bridging the blockchain with the World Wide Web. Some of the best developers in the world.

blockchain-technology developer-tools cryptocurrency tokens blogging security rft nftools nfts nft nft-gallery

Solidity 15

3 years ago

flint-xf-fan / Federated-RLHF

[AAMAS 2025] Privacy-preserving and personalized RLHF, with convergence guarantees. The code contains experiments for training multiple instances of GPT-2 for personalized sentiment aligned text gener...

large-language-model reinforcement-learning-from-human-feedback rft rlhf

Python 10

5 months ago

anasshad / Refungible-Tokens-Fractional-NFT

Smart contract and unit tests for Refungible Token / Fractional NFT

Solidity smart-contracts erc721 erc20 nft rft

JavaScript 4

4 years ago

XxFChen / awesome-reinforcement-fine-tuning

Awesome Reinforcement Fine Tuning

rft fine-tuning finetuning

9 months ago

JohnTheCoolingFan / RandomFactorioThings

Random Factorio Things mod for Factorio

Factorio mod rft

Lua 2

8 months ago

Masoudjafaripour / llm-hf-planning

A small Hugging Face LLM for planning and reasoning

fine-tuning large-language-model planning rft sft

Python 2

7 months ago

Azaijah / Syllogimind

Syllogimind is an application developed in Go, designed to help users strengthen their logical reasoning by generating and solving syllogisms.

rft

Go 0

1 year ago

aman-maurya / OfficeExporter

Generate MS Word files using PHP

PHP php-library msword rft office-tools

PHP 0

5 years ago

rft-kolcsonzo / kolcsonzo-api

rft university PHP REST API API Docker

PHP 0

7 years ago

HINNOTN / syllogisms

#Large language model# Algorithmic Truth Table Method for Proving Validity of Argument Forms

large-language-model philosophy Python reasoning rft statement validation

TeX 0

4 days ago

ksm26 / Reinforcement-Fine-Tuning-LLMs-with-GRPO

The course teaches how to fine-tune LLMs using Group Relative Policy Optimization (GRPO), a reinforcement learning method that improves model reasoning with minimal data. Learn RFT concepts, reward des...

grpo reinforcement-learning rft rlhf language-model machine-learning

Jupyter Notebook 0

3 months ago

shreyansh26 / wordle-solver

#Large language model# Training Qwen3 to solve Wordle using SFT and GRPO

grpo large-language-model qwen3 rft rl sft wordle wordle-solver

Python 0

5 days ago