Test your prompts, agents, and RAGs. AI red teaming, pentesting, and vulnerability scanning for LLMs. Compare the performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with comma...
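The "simple declarative configs" mentioned above can be sketched as follows; this is a minimal illustrative example, and the model IDs and assertion values are assumptions, not taken from the original project:

```yaml
# promptfooconfig.yaml -- hypothetical sketch; check the promptfoo docs for the current schema
prompts:
  - "Summarize the following text in one sentence: {{text}}"

providers:
  # assumed provider identifiers for comparing two models side by side
  - openai:gpt-4o-mini
  - anthropic:messages:claude-3-5-sonnet-latest

tests:
  - vars:
      text: "Declarative configs let you compare LLM outputs without writing code."
    assert:
      - type: contains
        value: "LLM"
```

Running `promptfoo eval` against a file like this would evaluate every prompt/test pair on each provider and report the comparison.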
Agentic LLM Vulnerability Scanner / AI red teaming kit 🧪
LLM Reasoning and Generation Benchmark. Evaluate LLMs in complex scenarios systematically.
Test, compare, and optimize your AI prompts in minutes
The prompt engineering, prompt management, and prompt evaluation tool for TypeScript, JavaScript, and Node.js.
LLM Prompt Test helps you test Large Language Model (LLM) prompts to ensure they consistently meet your expectations.
Community Plugin for Genkit to use Promptfoo
Sample project demonstrating how to use Promptfoo, a test framework for evaluating the output of generative AI models.
A collection of prompts that I use on a day-to-day basis for work and leisure.
Quickstart guide for using PromptFoo to evaluate LLM prompts via CLI or Colab.
A pytest-based framework for testing multi-agent AI systems. It provides a flexible and extensible platform for complex multi-agent simulations. Supports many integrations like LiteLLM, CrewAI, LangC...
Sample implementation demonstrating how to use Firebase Genkit with Promptfoo
A dynamic and interactive playground for testing and refining prompts with OpenAI's language models. Includes customizable inputs for prompts, advanced model settings, and live response streaming for ...
🐙 Team Agents unifies 82 AI specialists to solve challenges with intelligent chat, a requirements analyst, and document upload. A futuristic, modular platform.
EvalWise is a developer-friendly platform for LLM evaluation and red teaming that helps test AI models for safety, compliance, and performance issues.