GitHub Chinese Community
image-understanding

PKU-YuanGroup / Chat-UniVi

[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

image-understanding, large-language-models, video-understanding, vision-language-model
Python 944
10 months ago
PKU-YuanGroup / UniWorld-V1

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

diffusion, high-level-feature, image-editing, image-understanding, low-level-vision, text-to-image-generation, unify, unify-ai, vlm
Python 669
2 days ago
yohasebe / openai-chat-api-workflow

🎩 An Alfred 5 Workflow for using OpenAI Chat API to interact with GPT models 🤖💬 It also allows image generation/editing/understanding 🖼️, speech-to-text conversion 🎤, and text-to-speech synthesis...

alfred, openai, workflow, artificial intelligence, gpt, dall-e, image-generation, speech-to-text, Whisper, text-to-speech, chatbot, image-understanding
318
1 month ago
suprosanna / relationformer

A Unified Framework for Image-to-Graph Generation. Paper accepted @ ECCV22.

image-understanding, road-network, scene-graph, transformer
127
2 years ago
DmitryRyumin / WACV-2024-Papers

#Face Recognition# WACV 2024 Papers: Discover cutting-edge research from WACV 2024, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support ...

3d-computer-vision, adversarial-attacks, autonomous-driving, biometrics, computer vision, dataset, face-recognition, generative-models, gesture-recognition, image-recognition, image-understanding, low-level, machine learning, Robotics, video-recognition, vision-transformer, visualization
Python 96
1 year ago
KyanChen / DynamicVis

This is the implementation of the paper "DynamicVis: An Efficient and General Visual Foundation Model for Remote Sensing Image Understanding"

computer vision, foundation-models, image-understanding, remote-sensing, change-detection, image-segmentation, object-detection, image-retrieval, instance-segmentation
Python 64
2 months ago
KleinYuan / image2text

#Natural Language Processing# A deep learning project to tell a story with an image or a video.

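deep learning, real-time, artificial intelligence, neural network, image-understanding, natural language processing, word2vec, cnn, Tensorflow, convolutional-neural-networks, machine learning, theano, lasagne
Python 43
8 years ago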
sopermanspace / Unity_OpenAI

This GitHub repository shows how to integrate the OpenAI GPT-3 language model and the ChatGPT API into a Unity project. It can be a useful way to add natural language processing capabilities to your applicat...

openai, Unity, chatgpt3, gpt-3, chatbot, gpt4, artificial intelligence, game development, integration, openai-chatgpt, image-understanding, text-to-speech
C# 37
2 years ago
wangqingbaidu / CV-Datasets

#Data Warehouse# Collection of open datasets in computer vision.

computer vision, dataset, image-understanding, video-understanding
34
7 years ago
The-Martyr / Awesome-Multimodal-Reasoning

#Large Language Models# Latest Advances in (RL-based) Multimodal Reasoning and Generation in Multimodal Large Language Models

chain-of-thought, cot, large-language-models, large language model, mllm, video-understanding, multimodal-learning, reinforcement-learning, rl, o1, image-generation, image-understanding, video-generation
30
6 days ago
ddw2AIGROUP2CQUPT / HumanVLM

HumanVLM (LLaVA-based): Foundation for Human-Scene Vision-Language Model (Journal of Information Fusion 2025)

human, image-understanding, vision-language-model
Python 12
7 months ago
gasparyanartur / brain-image-implementation

#Computer Science# A reimplementation of the paper "Human-Aligned Image Models Improve Visual Decoding from the Brain"

deep learning, image-understanding, research
Jupyter Notebook 1
9 days ago
kimtth / rag-multimodal-semantic-chunking

🖼️📄 E2E Multi-modal Document Preprocessing for Search Indexing with Azure Document Intelligence

chunking, image-understanding, workshop
Python 1
14 days ago
Dulyaaa / IUP_Labs

🏷 This repository contains the lab sheets for the Image Understanding & Processing (SE4130) module in Year 4, Semester 1.

OpenCV, NumPy, matplotlib, Python, image processing, image-understanding
Jupyter Notebook 0
3 years ago
chrisputzu / annuncio-hackathon-aria-allegro

#Large Language Models# Annuncio generates product advertisements from user inputs, utilizing Aria for descriptions, Allegro for promotional videos, and hashtags for social media discoverability.

artificial intelligence, aria, content-creation, e-commerce, genai, Hackathon, image-understanding, large language model, video-generation
Python 0
9 months ago
Serin-Yoon / CS472-Image-Understanding

2022-1 Image Understanding Assignments & Projects

image-understanding, MATLAB
MATLAB 0
3 years ago
Pfilipeferreira2004 / DynamicVis

This is the implementation of the paper "DynamicVis: An Efficient and General Visual Foundation Model for Remote Sensing Image Understanding"

change-detection, foundation-models, graphics, image-retrieval, image-understanding, instance-segmentation, ips, object-detection, R, remote-sensing, visualization
Jupyter Notebook 0
14 days ago