GitHub 中文社区

©2025 GitHub 中文社区

# humaneval
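The repositories under this topic benchmark code generation on HumanEval, which is usually reported as pass@k. As a point of reference, a minimal sketch of the standard unbiased pass@k estimator (the formula from the Codex paper; the sample counts below are illustrative, not from any listed repo):

```python
def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations of which c are
    correct, passes the tests."""
    if n - c < k:
        # Fewer than k incorrect samples exist, so any draw of k
        # must include a correct one.
        return 1.0
    prob_all_fail = 1.0
    for i in range(n - c + 1, n + 1):
        prob_all_fail *= 1.0 - k / i
    return 1.0 - prob_all_fail

# Example: 10 generations per task, 3 passing -> estimated pass@1
print(round(pass_at_k(10, 3, 1), 2))  # 0.3
```

The product form is the numerically stable way to evaluate 1 − C(n−c, k)/C(n, k) without large binomial coefficients.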
bin123apple / AutoCoder

#NLP# We introduced a new model designed for the code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 Turbo (April 2024) and GPT-4o.

Tags: code-generation, code-interpreter, humaneval, LLM, text-generation, NLP
Python · 844 stars · updated 1 year ago
the-crypt-keeper / can-ai-code

#LLM# Self-evaluating interview for AI coders

Tags: AI, ggml, langchain, llama-cpp, LLM, humaneval, transformers
Python · 582 stars · updated 1 month ago
abacaj / code-eval

Run evaluation on LLMs using the human-eval benchmark

Tags: humaneval, wizardcoder
Python · 414 stars · updated 2 years ago
SkyWorkAIGC / SkyCode-AI-CodeX-GPT3

SkyCode is a multilingual open-source programming LLM built on the GPT-3 architecture. It supports mainstream languages including Java, JavaScript, C, C++, Python, Go, and shell, and can understand Chinese comments. The model performs code completion and has strong problem-solving ability, freeing you from routine programming to focus on more important problems. | SkyCode is an open source programming model, which adopts...

Tags: codex, deepmind, Go, gpt-neo, humaneval, Java, JavaScript, openai, Python, gpt3, gpt-3, Shell
393 stars · updated 2 years ago
zorse-project / COBOLEval

#LLM# Evaluate LLM-generated COBOL

Tags: cobol, evaluation, humaneval, LLM
Python · 36 stars · updated 1 year ago
declare-lab / LLM-ReasoningTest

Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided Interventions

Tags: humaneval, reasoning
Python · 10 stars · updated 2 months ago
abhigupta2909 / LLMPerformanceLab

Analysis of LLM performance across CPU, GPU, execution time, and energy usage

Tags: flask-restful, humaneval, Java, JavaScript, LLM, mmlu, ollama-api, React, Spring Boot, MySQL
Java · 0 stars · updated 1 year ago
mousamax / Evaluation-Code-Generator-LLMs

JetBrains Task: Leveraging software evolution data with LLMs

Tags: huggingface, humaneval
0 stars · updated 1 year ago
mennahasan31 / llm_benchmark

#NLP# llm_benchmark is a comprehensive benchmarking tool for evaluating the performance of various Large Language Models (LLMs) on a range of natural language processing tasks. It provides a standardized fr...

Tags: ai-tools, alibaba, anthropic, benchmark, evals, evaluation, evaluation-metrics, humaneval, information-seeking, mistral, NLP, openai, reasoning, streetfighterai
0 stars · updated 4 months ago