# embodied-agent

https://static.github-zh.com/github_avatars/zchoi?size=40

This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥

1.52 k
2 months ago
https://static.github-zh.com/github_avatars/eric-ai-lab?size=40

A curated list for vision-and-language navigation. ACL 2022 paper "Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions"

539
1 year ago
https://static.github-zh.com/github_avatars/kyegomez?size=40

Democratization of RT-2 "RT-2: New model translates vision and language into action"

Python 509
1 year ago
https://static.github-zh.com/github_avatars/RobotecAI?size=40

#Large Language Models# RAI is a vendor-agnostic agentic framework for robotics, utilizing ROS 2 tools to perform complex actions, defined scenarios, free interface execution, log summaries, voice interaction and more.

Python 374
3 days ago
https://static.github-zh.com/github_avatars/Yuxing-Wang-THU?size=40

Brain-Body Co-Design for Embodied Agents: Taxonomy, Frontiers, and Challenges

193
2 days ago
https://static.github-zh.com/github_avatars/Gary3410?size=40

[arXiv 2023] Embodied Task Planning with Large Language Models

Python 190
2 years ago
https://static.github-zh.com/github_avatars/iris0329?size=40

[CVPR'25] SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding

Python 171
5 months ago
https://static.github-zh.com/github_avatars/hanxunyu?size=40

[CVPR 2025 Highlight🔥] Official code repository for "Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning"

Python 112
1 month ago
https://static.github-zh.com/github_avatars/AoqunJin?size=40

A collection of vision-language-action model post-training methods.

98
18 days ago
https://static.github-zh.com/github_avatars/Zhoues?size=40

[IROS'25 Oral & NeurIPSw'24] Official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control"

Python 95
3 months ago
https://static.github-zh.com/github_avatars/wendell0218?size=40

#Large Language Models# Official repository of the paper "Generalist Virtual Agents: A Survey on Autonomous Agents Across Digital Platforms"

82
2 months ago
https://static.github-zh.com/github_avatars/mazpie?size=40

[NeurIPS 2024] GenRL: Multimodal-foundation world models enable grounding language and video prompts into embodied domains, by turning them into sequences of latent world model states. Latent state se...

Python 80
5 months ago