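Agent TARS is a general-purpose multimodal AI Agent stack that brings the power of GUI Agent and Vision into your terminal, computer, browser, and product. UI-TARS Desktop is a desktop application that provides a native GUI Agent based on the UI-TARS model.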
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).
Official implementation of "SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience"
Official implementation of "GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents"
Code for "UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning"
#Computer Science# Official repository for InfiGUI-G1. We introduce Adaptive Exploration Policy Optimization (AEPO) to overcome semantic alignment bottlenecks in GUI agents through efficient, guided exploration.
Create your self-hosted, open-source Operator model.
Enable AI to control your PC. This repo includes the WorldGUI Benchmark and GUI-Thinker Agent Framework.
#Large Language Models# Official repository of the paper "Generalist Virtual Agents: A Survey on Autonomous Agents Across Digital Platforms"
Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent in a hierarchical manner across multiple platforms, including ...
This is the official website for TuriX Computer-use-Agent
Release of code, datasets and model for our work TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials
🕵 Code for our EMNLP 2025 Main paper: "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games"
#Large Language Models# Source code of the paper "V-Droid: Advancing Mobile GUI Agent Through Generative Verifiers"
This is a quick test of Chinese Scripting Language powered by AI. You can use it to open any text file. No illegal use is allowed! Free for commercial use and academic use.