gpt4v

TencentQQGYLab / AppAgent

#Large Language Models# AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

agent, ChatGPT, generative-ai, gpt4, gpt4v, large language models
Python 5.88 k
3 months ago
X-PLUG / MobileAgent

#Android# Mobile-Agent: The Powerful Mobile Device Operation Assistant Family

agent, gpt4v, mllm, mobile-agents, multimodal, multimodal-large-language-models, multimodal-agent, Android, App, GUI, mobile, automation, copilot, harmony, iOS
Python 4.34 k
10 days ago
AmberSahdev / Open-Interface

#Large Language Models# Control Any Computer Using LLMs.

gpt, large language models, machine learning, macOS, openai, Python, automation, assistant, assistant-computer-control, gpt4, gpt4v, gpt4vision, Linux, pyautogui, pyinstaller, self-driving, self-driving-software, Windows
Python 2.22 k
3 months ago
reworkd / tarsier

Vision utilities for web interaction agents 👀

OCR, Playwright, Selenium, webscraping, pypi-package, gpt4v, large language models, Python
Jupyter Notebook 1.69 k
7 months ago
ictnlp / LLaVA-Mini

LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

efficient, gpt4o, gpt4v, large-language-models, large-multimodal-models, llava, multimodal, Video, vision, vision-language-model, visual-instruction-tuning, llama, multimodal-large-language-models
Python 488
5 months ago
bdekraker / WebcamGPT-Vision

#Large Language Models# Lightweight GPT-4 Vision processing over the Webcam

ChatGPT, machine vision, gpt-4, gpt4-api, gpt4v, openai
JavaScript 284
2 years ago
langgptai / Awesome-Multimodal-Prompts

#Awesome# Prompts for GPT-4V and DALL-E3 to fully utilize their multi-modal ability. GPT4V Prompts, DALL-E3 Prompts.

ChatGPT, gpt4, multimodal, prompt-engineering, prompts, gpt4v, newbing, Awesome Lists, prompt-injection, dall-e
253
2 years ago
ShareGPT4Omni / ShareGPT4V

#Large Language Models# [ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions

ChatGPT, gpt, gpt-4v, gpt4v, instruction-tuning, language-model, large-language-models, large-multimodal-models, large-vision-language-models, vision-language-model, eccv2024
Python 221
1 year ago
pAIrprogio / vscode-ui-sketcher

Draw your projects to life

gpt4v, tldraw, User interface design, VS Code Extension
TypeScript 201
1 year ago
soulteary / amazing-openai-api

Convert different model APIs into the OpenAI API format out of the box.

azure-openai, azure-openai-api, gemini-pro, google-gemini, openai, openai-api, gpt4v, gpt4vision
Go 153
1 year ago
zzxslp / MM-Navigator

GPT-4V in Wonderland: LMMs as Smartphone Agents

gpt4v, llm-agents
Python 134
1 year ago
kyegomez / MambaByte

#Computer Science# Implementation of MambaByte from "MambaByte: Token-free Selective State Space Model" in PyTorch and Zeta

artificial intelligence, gpt4v, machine learning, mamba, multi-modality, Parsing
Python 118
2 months ago
BUAADreamer / Chinese-LLaVA-Med

Chinese medical multimodal large model: Large Chinese Language-and-Vision Assistant for BioMedicine

llava, medical, mllm, multimodal, Chinese, qwen1-5, artificial intelligence, gpt4v, minigpt4, transformers
Python 86
1 year ago
cameronking4 / sketch2app

The ultimate sketch to code app made using GPT4o serving 30k+ users. Choose your desired framework (React, Next, React Native, Flutter) for your app. It will instantly generate code and preview (sandb...

sketch2code, wireframe, gpt4v, design2code, code-generator, gpt4, code-assistant, Next, openai
JavaScript 79
1 year ago
admineral / GPT4-Vision-React-Starter

Early Alpha Release: Chat with Your Image - Leveraging GPT-4 Vision and Function Calls for AI-Powered Image Analysis and Description

gpt4, gpt4-api, gpt4v, openai, openaiapi, artificial intelligence, ChatGPT API, gpt-4-vision-preview, openai-api
TypeScript 77
2 years ago
reidbarber / webmarker

Mark web pages for use with vision-language models

prompt, prompt-engineering, som, vision-language-model, claude, gemini, gpt4o, gpt4v, large language models, Playwright, operator, computer-use, cua
TypeScript 40
1 month ago
roboflow / gpt-checkup

Monitor the performance of OpenAI's GPT O3 Mini model over time.

machine vision, gpt4v, o1
HTML 34
1 month ago
martintomov / gpt4v-video-voiceover

Video Voiceover with gpt-4o-mini

gpt4v, openai, Python, Streamlit, Jupyter Notebook
Jupyter Notebook 33
9 months ago
Azure-Samples / rag-as-a-service-with-vision

#Large Language Models# This repository offers a Python framework for a retrieval-augmented generation (RAG) pipeline using text and images from MHTML documents, leveraging Azure AI and OpenAI services. It includes ingestion...

azure-ai-search, cosmosdb, gpt-4o, gpt4v, gpt4vision, large language models, openai, rag, vision
Python 27
6 months ago
neka-nat / mylangrobot

#Large Language Models# Language instructions to mycobot using GPT-4V

ChatGPT, gpt4v, segment-anything, Whisper, gpt-4-vision, gpt-4-vision-preview
Python 24
2 years ago