The LLM Evaluation Framework
A one-stop repository for large language model (LLM) unlearning. Supports TOFU and MUSE, and provides an easily extensible framework for new datasets, evaluations, methods, and other benchmarks.
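In practice, "easily extensible" usually comes down to a plugin-style registry, where a new unlearning method or metric is added without touching the benchmark loop. The sketch below is a generic illustration of that pattern only; `METHODS`, `register`, and `run_benchmark` are hypothetical names, not this repository's API.

```python
from typing import Callable, Dict

# Hypothetical registry of unlearning methods, keyed by name.
METHODS: Dict[str, Callable] = {}

def register(name: str):
    """Decorator that registers an unlearning method under a given name."""
    def wrap(fn: Callable) -> Callable:
        METHODS[name] = fn
        return fn
    return wrap

@register("gradient_ascent")
def gradient_ascent(model, forget_set):
    # Placeholder: a real method would maximize loss on the forget set.
    return model

def run_benchmark(method_name: str, model, forget_set, retain_set):
    """Apply the named method, then score forget quality and retained utility."""
    unlearned = METHODS[method_name](model, forget_set)
    return unlearned
```

Adding a new method is then a matter of defining one function and decorating it with `@register("my_method")`.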
LangFair is a Python library for conducting use-case-level LLM bias and fairness assessments.
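As a rough illustration of what a use-case-level fairness assessment measures, the sketch below compares a response-quality signal (here, refusal rate) across demographic groups for otherwise identical prompts. It is a generic example, not LangFair's API; `refusal_rate` and `refusal_disparity` are hypothetical helpers.

```python
from typing import Dict, List

def refusal_rate(responses: List[str]) -> float:
    """Fraction of responses that look like refusals (crude keyword check)."""
    refusals = sum(1 for r in responses if "i cannot" in r.lower() or "i can't" in r.lower())
    return refusals / max(len(responses), 1)

def refusal_disparity(responses_by_group: Dict[str, List[str]]) -> float:
    """Largest gap in refusal rate between any two demographic groups."""
    rates = [refusal_rate(r) for r in responses_by_group.values()]
    return max(rates) - min(rates)

# Example: identical prompts varied only by a demographic attribute.
print(refusal_disparity({
    "group_a": ["Sure, here is the information.", "I cannot help with that."],
    "group_b": ["Sure, here is the information.", "Here you go."],
}))  # 0.5
```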
[ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models
Create an evaluation framework for your LLM-based app. Incorporate it into your test suite. Lay the monitoring foundation.
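A minimal sketch of wiring an LLM evaluation into a pytest suite is shown below. The `relevance_score()` helper is hypothetical; in practice it would call an LLM-as-judge or an embedding-similarity metric provided by your evaluation framework.

```python
import pytest

def relevance_score(question: str, answer: str) -> float:
    # Placeholder heuristic: token overlap between question and answer.
    q, a = set(question.lower().split()), set(answer.lower().split())
    return len(q & a) / max(len(q), 1)

@pytest.mark.parametrize("question,answer", [
    ("What is the capital of France?", "The capital of France is Paris."),
])
def test_answer_is_relevant(question, answer):
    # Fail the build if relevance drops below a chosen threshold.
    assert relevance_score(question, answer) >= 0.5
```

Running this under `pytest` makes the evaluation a regression gate: prompt or model changes that degrade answer quality fail CI rather than reaching production unnoticed.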
A set of auxiliary systems that provide an estimated confidence measure for the outputs generated by large language models.
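One common way to estimate such confidence is self-consistency: sample the model several times and treat agreement among the samples as a proxy for confidence. The sketch below illustrates that idea under stated assumptions; `generate` is a stand-in for any LLM call and is not an API from the repository described above.

```python
from collections import Counter
from typing import Callable, List, Tuple

def self_consistency_confidence(
    generate: Callable[[str], str], prompt: str, n_samples: int = 5
) -> Tuple[str, float]:
    """Return the majority answer and the fraction of samples that agree with it."""
    answers: List[str] = [generate(prompt).strip().lower() for _ in range(n_samples)]
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / n_samples

# Example with a deterministic stub in place of a real model call:
if __name__ == "__main__":
    stub = lambda prompt: "Paris"
    print(self_consistency_confidence(stub, "Capital of France?"))  # ('paris', 1.0)
```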
This repo contains a Streamlit application that provides a user-friendly interface for evaluating large language models (LLMs) using the beyondllm package.