GitHub Chinese Community

multimodal-deep-learning

salesforce / LAVIS

#Computer Science# LAVIS - A One-stop Library for Language-Vision Intelligence

Deep Learning, deep-learning-library, image-captioning, salesforce, vision-and-language, vision-framework, vision-language-pretraining, vision-language-transformer, visual-question-anwsering, multimodal-datasets, multimodal-deep-learning
Jupyter Notebook 10.63 k
7 months ago
AI4Finance-Foundation / FinRobot

#Large Language Models# FinRobot: An Open-Source AI Agent Platform for Financial Analysis using LLMs 🚀 🚀 🚀

aiagent, fingpt, ChatGPT, finance, large-language-models, multimodal-deep-learning, prompt-engineering, robo-advisor
Jupyter Notebook 3.6 k
7 months ago
Yutong-Zhou-cv / Awesome-Text-to-Image

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

Generative Adversarial Network, text-to-image, image-synthesis, image-generation, survey, image-manipulation, multimodal, multimodal-deep-learning
2.35 k
14 days ago
KimMeen / Time-LLM

#Computer Science# [ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"

cross-modal-learning, cross-modality, Deep Learning, language-model, large-language-models, Machine Learning, multimodal-deep-learning, multimodal-time-series, prompt-tuning, time-series, time-series-analysis, time-series-forecasting
Python 2.08 k
7 months ago
kyegomez / BitNet

#Computer Science# Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch

Artificial Intelligence, Deep Neural Networks, Deep Learning, gpt4, Machine Learning, multimodal, multimodal-deep-learning
Python 1.83 k
2 months ago
AlibabaResearch / AdvancedLiterateMachinery

A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.

Artificial Intelligence, documentai, multimodal, multimodal-deep-learning, OCR, Machine Vision, vision-language-transformer, end-to-end-ocr, scene-text-detection, scene-text-detection-recognition, scene-text-recognition, text-detection, text-recognition, vision-language, document, document-analysis, document-recognition, document-understanding, document-intelligence, vision-language-model
C++ 1.73 k
2 months ago
DWCTOD / CVPR2024-Papers-with-Code-Demo

#Large Language Models# Collects the latest CVPR (Conference on Computer Vision and Pattern Recognition) results, including papers, code, and demo videos. Recommendations are welcome...

cvpr2021, cvpr, Machine Vision, cvpr2022, cvpr2023, cvpr2024, Large Language Models, multimodal-deep-learning, object-detection, segment-anything, segmentation
1.37 k
1 year ago
jrzaurin / pytorch-widedeep

#Computer Science# A flexible package for multimodal deep learning that combines tabular data with text and images using Wide and Deep models in PyTorch

PyTorch, tabular-data, text, Image, multimodal-deep-learning, pytorch-nlp, pytorch-transformers, Deep Learning, model-hub, Python
Python 1.36 k
4 months ago
yuewang-cuhk / awesome-vision-language-pretraining-papers

Recent Advances in Vision and Language PreTrained Models (VL-PTMs)

vision-and-language, pretraining, multimodal-deep-learning, bert
1.15 k
3 years ago
TheShadow29 / awesome-grounding

#Natural Language Processing# awesome grounding: A curated list of research papers in visual grounding

Machine Vision, Natural Language Processing, grounding, Awesome Lists, papers, arxiv, video-understanding, captioning-videos, embodied-agent, multimodal-deep-learning, language-grounding, Bukkit
1.08 k
2 years ago
declare-lab / multimodal-deep-learning

This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.

multimodal-deep-learning, multimodal-learning, multimodal-interactions
OpenEdge ABL 846
2 years ago
richard-peng-xia / awesome-multimodal-in-medical-imaging

A collection of resources on applications of multi-modal learning in medical imaging.

Medical imaging, multimodal-deep-learning, multimodal-learning, visual-question-answering, large-language-models, large-multimodal-models, multimodal-large-language-models
760
11 days ago
omriav / blended-latent-diffusion

#Computer Science# Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]

Deep Learning, multimodal, multimodal-deep-learning, text-to-image, text-to-image-synthesis, Machine Vision, diffusion, diffusion-models, generative-model, image-generation, PyTorch, text-driven-editing
Jupyter Notebook 604
1 year ago
MMMU-Benchmark / MMMU

#Natural Language Processing# This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"

Machine Vision, Deep Learning, Deep Neural Networks, evaluation, foundation-models, large-language-models, large-multimodal-models, Large Language Models, Machine Learning, multimodal, multimodal-deep-learning, multimodal-learning, multimodality, Natural Language Processing, question-answering, STEM, visual-question-answering
Python 440
1 month ago
remyxai / VQASynth

Compose multimodal datasets 🎹

multimodal-datasets, multimodal-deep-learning, synthetic-dataset-generation
Python 403
5 days ago
jianghaojun / Awesome-Parameter-Efficient-Transfer-Learning

#Computer Science# A collection of parameter-efficient transfer learning papers focusing on computer vision and multimodal domains.

Machine Vision, Deep Learning, Machine Learning, multimodal-deep-learning, parameter-efficient-learning, parameter-efficient-tuning, transfer-learning
402
9 months ago
kyegomez / Med-PaLM

#Computer Science# Towards Generalist Biomedical AI

biomedical, Deep Learning, gpt4, multimodal, multimodal-deep-learning, multimodality, Open Source
Python 397
1 year ago
theislab / scarches

#Computer Science# Reference mapping for single-cell genomics

Deep Learning, data-integration, multimodal-deep-learning
Jupyter Notebook 368
25 days ago
westlake-repl / Recommendation-Systems-without-Explicit-ID-Features-A-Literature-Review

#Large Language Models# Paper List of Pre-trained Foundation Recommender Models

ChatGPT, foundation-model, Large Language Models, multimodal, pre-training, recommender-system, transfer-learning, chatgpt3, language-model, multimodal-deep-learning, recommendation-system
353
10 months ago
fcakyon / content-moderation-deep-learning

Deep learning based content moderation from text, audio, video & image input modalities.

content-moderation, movie-trailer, nudity-detection, multimodal-deep-learning, nsfw-recognition
353
6 months ago