GitHub Chinese Community
#blip2

DAMO-NLP-SG / Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

large-language-models, video-language-pretraining, vision-language-pretraining, blip2, llama, minigpt4, cross-modal-pretraining, multi-modal-chatgpt
Python 3.02k
1 year ago

sled-group / chat-with-nerf

#Large Language Models# [ICRA 2024] Chat with NeRF enables users to interact with a NeRF model by typing in natural language.

blip2, ChatGPT, gpt-4, nerf
Python 312
1 year ago

mlpc-ucsd / BLIVA

#Large Language Models# (AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions

blip2, chatbot, instruction-tuning, llama, large language models, multimodal, visual-language-learning, lora
Python 260
1 year ago

gongzix / NeuroClips

Official code base for NeuroClips

fmri, blip2
MATLAB 90
7 days ago

SmithaUpadhyaya / fashion_image_caption

Automate Fashion Image Captioning using BLIP-2. Automatic generating descriptions of clothes on shopping websites, which can help customers without fashion knowledge to better understand the features ...

blip2, huggingface-transformers, Image, transformer, multimodal-deep-learning
Jupyter Notebook 55
2 years ago

kyegomez / qformer

#Computer Science# Implementation of Qformer from BLIP2 in Zeta Lego blocks.

artificial intelligence, attention-mechanism, blip2, machine, machine learning, multi-modal, multi-modality
Python 39
7 months ago

eric-ai-lab / ComCLIP

Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"

blip2, causality, clip, compositionality, image-text-matching, image-text-retrieval, vision-and-language
Python 35
10 months ago

BUAADreamer / SPN4CIR

[ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives

blip, blip2, clip, data-generation, image-retrieval, llama, llava, multimodal-learning, transformer, cross-modal-retrieval
Python 32
8 months ago

nngocson2002 / ViVQA

The Multimodal Model for Vietnamese Visual Question Answering (ViVQA)

beit-3, blip2, efficientnet, multimodal-deep-learning, vqa
Python 20
1 year ago

zer0int / CLIP-Interrogator-LongCLIP-hallucinwords

CLIP Interrogator, fully in HuggingFace Transformers 🤗, with LongCLIP & CLIP's own words and / or *your* own words!

blip, blip2, clip
Python 17
5 months ago

ZhaoPeiduo / BLIP2-Japanese

Modifying LAVIS' BLIP2 Q-former with models pretrained on Japanese datasets.

captioning, japanese, PyTorch, blip2, multimodal-deep-learning
Python 12
5 months ago

arashsajjadi / ai-powered-video-analyzer

#Large Language Models# An offline AI-powered video analysis tool with object detection (YOLO), image captioning (BLIP), speech transcription (Whisper), audio event detection (PANNs), and AI-generated summaries (LLMs via Oll...

blip2, GUI, image-captioning, large language models, object-detection, ollama, ollama-api, privacy, Whisper, whisper-ai, yolo, yolo11
Python 10
4 months ago

matlok-ai / bampe-weights

#Large Language Models# This repository is for profiling, extracting, visualizing and reusing generative AI weights to hopefully build more accurate AI models and audit/scan weights at rest to identify knowledge domains for ...

artificial intelligence, blip2, foundational-models, generative-ai, gptq, image-to-image, large language models, safetensors, stable-diffusion, tiff, transformers, blender, blender-python, deep learning
Python 9
1 year ago

jacobmarks / fiftyone-image-captioning-plugin

Caption images across your datasets with state of the art models from Hugging Face and Replicate!

blip2, machine vision, huggingface, huggingface-transformers, image-captioning, llava, qwen
Python 9
1 year ago

MichiganNLP / visual_diversity_budget

#Data Warehouse# Annotations on a Budget: Leveraging Geo-Data Similarity to Balance Model Performance and Annotation Cost

active-learning, blip2, clip, dataset, multimodal-deep-learning
8
1 year ago

aws-samples / visual-question-answering-finetuning

Finetuning Large Visual Models on Visual Question Answering

blip2, finetuning, genai, vqa
Jupyter Notebook 6
1 year ago

leeyunjai / image2text

caption generator using lavis and argostranslate

caption, captions, image-analysis, blip2
Python 4
2 years ago

craigsdennis / scairy

Uses AI to scare people...more.

artificial intelligence, blip2, elevenlabs, llama2, replicate
Python 4
2 years ago

Pavansomisetty21 / Visual-Question-Answering-using-Gemini-LLM

In this we explore into visual Question Answering Using Gemini LLM and image was in URL or any other extension

blip, blip2, gemini, Git, question-answering, vision-language-model, vision-transformer, visual-question-answering, vlm, vqa, artificial intelligence, generative-ai, generative-model
Jupyter Notebook 4
5 months ago

otdavies / AIOrganizeMyDesktop

Too lazy to organize my desktop, make gpt + BLIP-2 do it /s

automation, Desktop, example-project, gpt-3, organization, Python, artificial intelligence, blip2, machine learning
Python 2
2 years ago