# llm-evaluation-metrics

confident-ai / deepeval

The LLM Evaluation Framework (a usage sketch follows below)

evaluation-metrics · evaluation-framework · llm-evaluation · llm-evaluation-framework · llm-evaluation-metrics
Python · 8.01k stars · updated 3 days ago
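
For orientation, here is a minimal sketch of what a deepeval run can look like, following the test-case/metric style from the project's README; verify against the current docs, and note that LLM-judged metrics such as answer relevancy need a judge model key (e.g. OPENAI_API_KEY) configured:

```python
# Minimal deepeval sketch (README-style API; check the current docs).
# LLM-judged metrics need a judge model key, e.g. OPENAI_API_KEY.
from deepeval import evaluate
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase

test_case = LLMTestCase(
    input="What are the shipping costs?",
    actual_output="Shipping is free for orders over $50.",
)
metric = AnswerRelevancyMetric(threshold=0.7)  # fail the case below 0.7

# Scores the test case with the metric and prints a pass/fail report.
evaluate(test_cases=[test_case], metrics=[metric])
```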
locuslab / open-unlearning

The one-stop repository for large language model (LLM) unlearning. Supports TOFU, MUSE, WMDP, and many unlearning methods. All features (benchmarks, methods, evaluations, models, etc.) are easily extensible. A toy sketch of one supported method family follows below.

privacy-protection · benchmarks · llm-evaluation-metrics · large-language-models · open-source
Python · 287 stars · updated 24 days ago
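
The listing doesn't show open-unlearning's API, so this is only a toy PyTorch illustration of one baseline family such repos support, gradient-ascent unlearning: take ascent steps on the forget set so the model's loss on those examples rises. All names here are illustrative, not the repo's code:

```python
# Toy gradient-ascent unlearning in PyTorch. Illustrative only: NOT the
# open-unlearning codebase, just the idea behind one common baseline
# (maximize loss on the forget set so the model unlearns those examples).
import torch
import torch.nn as nn

model = nn.Linear(16, 4)  # tiny stand-in for a language model
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

forget_x = torch.randn(8, 16)          # examples to be unlearned
forget_y = torch.randint(0, 4, (8,))

for step in range(10):
    opt.zero_grad()
    loss = loss_fn(model(forget_x), forget_y)
    (-loss).backward()                 # ascend instead of descend
    opt.step()
    print(f"step {step}: forget-set loss {loss.item():.3f}")  # should rise
```

Real methods balance this against retaining performance on the rest of the data, e.g. by adding a retain-set loss term.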
cvs-health / langfair

LangFair is a Python library for conducting use-case-level LLM bias and fairness assessments (a concept sketch follows below)

artificial-intelligence · bias · bias-detection · fairness · fairness-ai · fairness-ml · fairness-testing · large-language-models · responsible-ai · python · ai-safety · llm-evaluation · llm-evaluation-framework · llm-evaluation-metrics
Python · 215 stars · updated 4 days ago
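
LangFair's own API is not shown in this listing; the sketch below only illustrates the general idea of a use-case-level bias probe, comparing model behavior across counterfactual prompts that differ in a single demographic term. The `generate` helper, template, groups, and metric are all placeholder assumptions:

```python
# Concept sketch of a counterfactual bias probe. NOT LangFair's API;
# `generate`, TEMPLATE, GROUPS, and the metric are placeholders.

def generate(prompt: str) -> str:
    # Placeholder for your LLM client; a canned reply keeps this runnable.
    return "Happy to help! Here are our current loan options..."

TEMPLATE = "The {group} applicant asked about loan options. Draft a reply."
GROUPS = ["male", "female"]

def refusal_rate(responses: list[str]) -> float:
    # Crude behavioral metric: how often the model declines to answer.
    return sum("cannot" in r.lower() for r in responses) / len(responses)

def probe(n: int = 20) -> dict[str, float]:
    # Large gaps between groups on the same metric flag a potential
    # fairness issue worth a deeper, properly powered review.
    return {
        g: refusal_rate([generate(TEMPLATE.format(group=g)) for _ in range(n)])
        for g in GROUPS
    }

print(probe())  # e.g. {'male': 0.0, 'female': 0.0}
```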
zhuohaoyu / KIEval

[ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models

explainable-ai · large-language-models · llm-evaluation · llm-evaluation-framework · llm-evaluation-metrics · machine-learning
Python · 36 stars · updated 1 year ago
pyladiesams / eval-llm-based-apps-jan2025

Create an evaluation framework for your LLM-based app, incorporate it into your test suite, and lay the monitoring foundation (a test-suite sketch follows below).

large-language-models · llmops · workshop · llm-eval · llm-evaluation-framework · llm-evaluation-metrics · llm-monitoring
Jupyter Notebook · 7 stars · updated 1 month ago
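
In the spirit of that workshop pitch, here is a hypothetical pytest sketch of folding an LLM eval into a test suite; the `answer` helper and the substring check are placeholder assumptions, not the workshop's material:

```python
# Hypothetical pytest sketch: treat LLM evals as regression tests.
# `answer` stands in for your LLM-based app; canned replies keep it runnable.
import pytest

_FAKE_ANSWERS = {
    "What is the capital of France?": "The capital of France is Paris.",
    "What is 2 + 2?": "2 + 2 equals 4.",
}

def answer(question: str) -> str:
    return _FAKE_ANSWERS[question]  # replace with a real LLM call

@pytest.mark.parametrize(
    "question,expected",
    [("What is the capital of France?", "paris"), ("What is 2 + 2?", "4")],
)
def test_answer_contains_expected(question, expected):
    # Crude substring check; real suites often use semantic similarity
    # or an LLM judge, and log scores over time for monitoring.
    assert expected in answer(question).lower()
```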
attogram / ollama-multirun

A Bash shell script that runs a single prompt against any or all of your locally installed Ollama models, saving the output and performance statistics as easily navigable web pages (a concept sketch follows below).

artificial-intelligence · ollama · llm-evaluation · ollama-interface · bash · ollama-app · llm-evaluation-metrics · llm-eval · static-site-generator
Shell · 6 stars · updated 1 day ago
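
The project itself is a Bash script; purely to illustrate the idea, here is a Python sketch of the same loop using the standard `ollama list` and `ollama run` CLI commands. The output parsing and file naming here are assumptions, not the script's actual behavior:

```python
# Python sketch of the multirun idea (the project itself is Bash): run one
# prompt against every locally installed Ollama model and save each reply.
import subprocess
from pathlib import Path

PROMPT = "Explain recursion in one sentence."
out_dir = Path("runs")
out_dir.mkdir(exist_ok=True)

# `ollama list` prints a header row, then one model per line, name first.
listing = subprocess.run(
    ["ollama", "list"], capture_output=True, text=True, check=True
)
models = [ln.split()[0] for ln in listing.stdout.splitlines()[1:] if ln.strip()]

for model in models:
    result = subprocess.run(
        ["ollama", "run", model, PROMPT],
        capture_output=True, text=True, check=True,
    )
    (out_dir / f"{model.replace(':', '_')}.txt").write_text(result.stdout)
    print(f"saved output for {model}")
```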
ronniross / llm-confidence-scorer

A set of auxiliary systems designed to provide a measure of estimated confidence for the outputs generated by large language models (a scoring sketch follows below).

large-language-models · llm-evaluation · llm-evaluation-framework · llm-evaluation-metrics · llm-training · dataset
Python · 2 stars · updated 21 days ago
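
This listing doesn't describe the repo's scoring method; one common baseline such a scorer might build on is exp(mean token log-probability), the geometric mean of per-token probabilities (the inverse of perplexity), sketched here:

```python
# A common sequence-level confidence estimate (not necessarily this repo's
# method): exp(mean token log-probability), i.e. the geometric mean of
# per-token probabilities, which is the inverse of perplexity.
import math

def sequence_confidence(token_logprobs: list[float]) -> float:
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))

# Log-probs as exposed by APIs that return them alongside generations.
print(sequence_confidence([-0.1, -0.3, -0.05]))  # ~0.86: confident
print(sequence_confidence([-2.0, -1.5, -2.5]))   # ~0.14: low confidence
```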
ritwickbhargav80 / quick-llm-model-evaluations

This repo contains a Streamlit application that provides a user-friendly interface for evaluating large language models (LLMs) using the beyondllm package.

llm-evaluation-metrics · large-language-models · retrieval-augmented-generation · streamlit
Python · 0 stars · updated 10 months ago
nhsengland / evalsense

Tools for systematic large language model evaluations

llm-evaluation · llm-evaluation-metrics · large-language-models · evaluation-framework · evaluation-metrics · llm-evaluation-framework
Python · 0 stars · updated 23 days ago