GitHub Chinese Community

#mmlu

baichuan-inc / Baichuan-7B

#Natural Language Processing# A large-scale 7B pretraining language model developed by BaiChuan-Inc.

artificial intelligence, ceval, large-language-models, natural language processing, mmlu, ChatGPT, gpt-4, huggingface, llama, Chinese
Python 5.69k
1 year ago
baichuan-inc / Baichuan2

#Natural Language Processing# A series of large language models developed by Baichuan Intelligent Technology

artificial intelligence, benchmark, ceval, ChatGPT, Chinese, gpt, gpt-4, huggingface, large-language-models, llama2, mmlu, natural language processing
Python 4.12k
7 months ago
baichuan-inc / Baichuan-13B

#Natural Language Processing# A 13B large language model developed by Baichuan Intelligent Technology

artificial intelligence, ChatGPT, Chinese, gpt-4, huggingface, large-language-models, natural language processing, benchmark, ceval, mmlu
Python 2.97k
2 years ago
microsoft / MMLU-CF

#Large Language Models# A Contamination-free Multi-task Language Understanding Benchmark [Official, ACL 2025]

benchmark, large language models, mmlu
117
1 month ago
ExplainableML / in-context-impersonation

[NeurIPS 2023 Spotlight] In-Context Impersonation Reveals Large Language Models' Strengths and Biases

artificial intelligence, chatbot, clip, llama, llama2, bandit, mmlu, text-generation, reasoning
Python 22
7 months ago
SS47816 / AGI-Elo

#Data Warehouse# AGI-Elo: How Far Are We From Mastering A Task?

agi, benchmark, coco, dataset, evaluation-framework, evaluation-metrics, imagenet, leaderboard, mmlu, artificial-general-intelligence, sota
Python 4
1 month ago
sergeyklay / factly

#Large Language Models# CLI tool to evaluate LLM factuality on the MMLU benchmark.

benchmark, ChatGPT, command-line interface, factuality, large language models, llm-evaluation, mmlu, openai, prompt-engineering
Python 2
7 days ago
abhigupta2909 / LLMPerformanceLab

LLMs' performance analysis on CPU, GPU, Execution Time and Energy Usage

flask-restful, humaneval, Java, JavaScript, large language models, mmlu, ollama-api, React, Spring Boot, MySQL
Java 0
1 year ago