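Agent TARS is a general-purpose multimodal AI Agent stack that brings the power of GUI Agent and Vision into your terminal, computer, browser, and product. UI-TARS Desktop is a desktop application that provides a native GUI Agent based on the UI-TARS model.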
Your AI Operator for Web, Android, Automation & Testing.
Cua is Docker for Computer-Use AI Agents
The most reliable AI agent framework that supports MCP.
#Large Language Model# Bytebot is a self-hosted AI desktop agent that automates computer tasks through natural language commands, operating within a containerized Linux desktop environment.
Agent S: an open agentic framework that uses computers like a human
Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.
#Large Language Model# AI computer use powered by open source LLMs and E2B Desktop Sandbox
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.
Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
A fork of Anthropic Computer Use that you can run on Mac computers to give Claude and other AI models autonomous access to your computer.
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
A framework to enable autonomous android and computer use using any LLM (local or remote)
Desktop app powered by Claude’s computer use capability to control your computer
The only general AI agent that does NOT require an extra API key, giving you full control over your local and remote macOS machines from the Claude Desktop app
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).