streetfighterai

OpenGenerativeAI / llm-colosseum

#Large Language Model# Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM

genai llm benchmark streetfighterai

Jupyter Notebook 1.45 k

6 months ago

jiseongHAN / StreetFighterRL

StreetFighter RL (Beta)

ai reinforcement-learning rl streetfighterai gym retro

Python 2

6 months ago

mennahasan31 / llm_benchmark

#Natural Language Processing# llm_benchmark is a comprehensive benchmarking tool for evaluating the performance of various Large Language Models (LLMs) on a range of natural language processing tasks. It provides a standardized fr...

ai-tools alibaba anthropic benchmark evals evaluation evaluation-metrics humaneval information-seeking mistral nlp openai reasoning streetfighterai

7 months ago
