GitHub Chinese Community
multimodality

lucidrains / big-sleep

#Computer Science# A simple command-line tool for text-to-image generation, using OpenAI's CLIP and a BigGAN. The technique was originally created by https://twitter.com/advadnoun

artificial intelligence, deep learning, text-to-image, Generative Adversarial Network, multimodality
Python 2.57 k
3 years ago
BAAI-Agents / Cradle

#Large Language Model# The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, ...

ai-agent, ai-agents-framework, computer-control, cradle, gcc, generative-ai, grounding, large-language-models, large language model, lmm, multimodality, vision-language-model, vlm, artificial intelligence
Python 2.11 k
7 months ago
hymie122 / RAG-Survey

#Large Language Model# Collecting awesome papers on RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in the paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".

aigc, rag, survey, diffusion-models, large language model, multimodality
1.66 k
10 months ago
PreferredAI / cornac

A Comparative Framework for Multimodal Recommender Systems

recommender-system, recommendation-algorithms, recommendation-engine, matrix-factorization, collaborative-filtering, multimodal-learning, recommendation-system, multimodality
Python 960
2 months ago
ArrowLuo / CLIP4Clip

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

multimodal-learning, multimodality, multimodal, search, ranking, retrieval-model, retrieval, activitynet, clip
Python 953
1 year ago
AIDC-AI / Ovis

A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.

chatbot, llama3, multimodal, multimodal-large-language-models, multimodality, qwen, vision-language-model
Python 933
3 months ago
fnzhan / Generative-AI

[TPAMI 2023] Multimodal Image Synthesis and Editing: The Generative AI Era

aigc, diffusion-model, gans, multimodality
TeX 757
2 years ago
aimclub / FEDOT

#Computer Science# Automated modeling and machine learning framework FEDOT

automl, machine learning, evolutionary-algorithms, automated-machine-learning, hyperparameter-optimization, parameter-tuning, automation, multimodality
Python 673
4 days ago
VITA-MLLM / Woodpecker

#Large Language Model# ✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models

hallucination, hallucinations, large-language-models, large language model, mllm, multimodal-large-language-models, multimodality
Python 636
6 months ago
jshilong / GPT4RoI

#Large Language Model# GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

gpt, large language model, multimodality, roi, machine vision
Python 531
12 days ago
microsoft / LLM2CLIP

LLM2CLIP makes a SOTA pretrained CLIP model more SOTA than ever.

clip, multimodality
Python 524
3 months ago
zengyan-97 / X-VLM

X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)

multimodality, vision-and-language
Python 478
3 years ago
YingqingHe / Awesome-LLMs-meet-Multimodal-Generation

#Large Language Model# 🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D and audio).

aigc, large-language-models, large-vision-language-models, multimodal-generation, multimodal-large-language-models, multimodal-models, multimodality, text-to-3d, text-to-audio, text-to-image, text-to-speech, text-to-video, large language model, mllm
HTML 478
2 months ago
afiaka87 / clip-guided-diffusion

#Computer Science# A CLI tool/Python module for generating images from text using guided diffusion and CLIP from OpenAI.

multimodal, image-generation, text-to-image-synthesis, text-to-image, openai, deep learning, artificial intelligence, diffusion, multimodality
Python 462
3 years ago
MMMU-Benchmark / MMMU

#Natural Language Processing# This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"

machine vision, deep learning, deep neural networks, evaluation, foundation-models, large-language-models, large-multimodal-models, large language model, machine learning, multimodal, multimodal-deep-learning, multimodal-learning, multimodality, natural language processing, question-answering, STEM, visual-question-answering
Python 440
1 month ago
HazyResearch / fonduer

#Computer Science# A knowledge base construction engine for richly formatted data

multimodality, machine learning
Python 410
4 years ago
kyegomez / Med-PaLM

#Computer Science# Towards Generalist Biomedical AI

biomedical, deep learning, gpt4, multimodal, multimodal-deep-learning, multimodality, Open Source
Python 397
1 year ago
lium-lst / nmtpytorch

#Computer Science# Sequence-to-Sequence Framework in PyTorch

deep learning, PyTorch, seq2seq, nmt, neural-machine-translation, asr, speech-recognition, multimodality, cnn
Jupyter Notebook 391
2 years ago
OmicsML / dance

#Computer Science# DANCE: a deep learning library and benchmark platform for single-cell analysis

Bioinformatics, data science, deep learning, graph-neural-networks, machine learning, multimodality, Python, benchmark, computational-biology
Python 363
6 days ago
kyegomez / CM3Leon

An open-source implementation of "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning", an all-new multimodal AI that uses just a decoder to generate both text and images

attention, attention-is-all-you-need, dalle, multimodal, multimodal-learning, multimodality
Python 361
2 years ago