
Topic: preference-alignment

princeton-nlp / SimPO

[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward

alignment · large-language-models · preference-alignment · rlhf
Python · 895 stars · updated 4 months ago
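
SimPO's "reference-free reward" is the length-normalized log-likelihood of a response under the policy being trained, so no frozen reference model has to be kept in memory. Below is a minimal PyTorch sketch of that objective; the function name, tensor arguments, and the default beta/gamma values are illustrative assumptions, not the repository's actual API.

```python
import torch.nn.functional as F

def simpo_loss(policy_chosen_logps, policy_rejected_logps,
               chosen_lengths, rejected_lengths,
               beta=2.0, gamma=1.0):
    # Reference-free reward: the policy's own sequence log-probability,
    # normalized by response length (this is what removes the need for
    # a frozen reference model).
    chosen_reward = beta * policy_chosen_logps / chosen_lengths
    rejected_reward = beta * policy_rejected_logps / rejected_lengths
    # Bradley-Terry preference loss with a target margin gamma between
    # the winning and losing responses.
    return -F.logsigmoid(chosen_reward - rejected_reward - gamma).mean()
```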
zjukg / KnowPAT

[Paper][ACL 2024 Findings] Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering

knowledge-graph · large-language-models · question-answering · preference-alignment · instruction-tuning
Python · 194 stars · updated 1 year ago
Meaquadddd / DPO-Shift

DPO-Shift: Shifting the Distribution of Direct Preference Optimization

alignment · large-language-models · preference-alignment · rlhf
Python · 58 stars · updated 3 months ago
junkangwu / beta-DPO

[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$

alignment · dpo · preference-alignment · rlhf
Python · 45 stars · updated 8 months ago
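
For context on what the dynamic $\beta$ modifies: in standard DPO, a fixed $\beta$ scales the implicit reward, defined as the log-ratio between the policy and a frozen reference model; $\beta$-DPO's change is to calibrate $\beta$ batch-by-batch from the observed reward margins. The sketch below shows only the standard fixed-$\beta$ DPO loss it builds on; the names and default value are illustrative assumptions.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Implicit reward: beta-scaled log-ratio of policy to frozen
    # reference probabilities (beta is fixed here; beta-DPO would
    # adapt it per batch from the reward margin).
    chosen_reward = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_reward = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the chosen response's implicit reward above the rejected one's.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```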
Shentao-YANG / Dense_Reward_T2I

Source code for "A Dense Reward View on Aligning Text-to-Image Diffusion with Preference" (ICML'24).

preference-alignment · text-to-image-generation
Python · 38 stars · updated 1 year ago
Video-Bench / Video-Bench

Video Generation Benchmark

large-language-models · multimodal-large-language-models · preference-alignment · sora · video-generation · video-understanding · text-to-video
Python · 22 stars · updated 2 months ago
junkangwu / Dr_DPO

[ICLR 2025] Official code of "Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization"

alignment · dpo · preference-alignment · rlhf
Python · 14 stars · updated 1 year ago
YJiangcm / BMC

[ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization

alignment · dpo · rlhf · large-language-models · preference-alignment
Python · 12 stars · updated 5 months ago
BARUDA-AI / Awesome-Preference-Optimization

Survey of preference alignment algorithms

alignment · preference-alignment · rlhf
0 stars · updated 1 year ago
thibaud-perrin / synthetic-datasets

Generate synthetic datasets for instruction tuning and preference alignment using tools like `distilabel` for efficient and scalable data creation.

artificial-intelligence · instruction-tuning · large-language-models · preference-alignment · synthetic-data
Jupyter Notebook · 0 stars · updated 5 months ago
reshalfahsi / gpt2chat

Creating a GPT-2-Based Chatbot with Human Preferences

chatbot · gpt-2 · huggingface · instruction-tuning · langchain · preference-alignment · PyTorch · pytorch-lightning · language-model · nlp
Jupyter Notebook · 0 stars · updated 1 month ago