GitHub Chinese Community
#vlm

huggingface / transformers
#Natural language processing# State-of-the-art natural language processing for JAX, PyTorch, and TensorFlow
Tags: natural language processing, PyTorch, pytorch-transformers, transformer, model-hub, pretrained-models, speech-recognition, Hacktoberfest, Python, machine learning, deep learning, audio, deepseek, gemma, glm, large language models, qwen, vlm
Python 145.62 k
3 hours ago

sgl-project / sglang
#Large language models# SGLang is a fast serving framework for large language models and vision language models.
Tags: CUDA, inference, llama, llava, large language models, llm-serving, moe, PyTorch, transformer, vlm, llama3, llama3-1, deepseek, deepseek-llm, deepseek-v3, deepseek-r1, deepseek-r1-zero, qwen3, llama4
Python 15.14 k
7 hours ago

bytedance / UI-TARS-desktop
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
Tags: agent, vlm, Electron, vision, Vite, computer-use, gui-agents, mcp, mcp-server
TypeScript 14.62 k
13 hours ago

roboflow / notebooks
#Computer science# A collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like YOLO11, RT-DETR, SAM 2, ...
Tags: computer vision, deep learning, deep neural networks, image-classification, image-segmentation, object-detection, yolov5, PyTorch, tutorial, yolov8, google-colab, machine learning, zero-shot-classification, open-vocabulary-detection, automatic-labeling-system, open-vocabulary-segmentation, paligemma, qwen, vlm
Jupyter Notebook 7.83 k
6 days ago

CVHub520 / X-AnyLabeling
#Large language models# Effortless data labeling with AI support from Segment Anything and other awesome models.
Tags: labeling-tool, paddle, PyTorch, resnet, sam, yolo, deep learning, onnx, clip, large language models, annotation-tool, classification, depth-estimation, grounding-dino, image-segmentation, matting, object-detection, pose-estimation, vlm
Python 5.77 k
17 hours ago

om-ai-lab / VLM-R1
#Large language models# Solve Visual Understanding with Reinforced VLMs
Tags: deepseek-r1, grpo, large language models, multimodal, vlm, qwen, reinforcement-learning
Python 5.14 k
1 month ago

NexaAI / nexa-sdk
#Large language models# Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), Audio Language Model, auto-speech-recognition (ASR...
Tags: asr, edge-computing, large language models, on-device-ai, on-device-ml, SDK, stable-diffusion, transformers, tts, vlm, language-model, sdk-python, Whisper, audio
Python 4.57 k
3 months ago

joanrod / star-vector
#Large language models# StarVector is a foundation model for SVG generation that transforms vectorization into a code generation task. Using a vision-language modeling architecture, StarVector processes both visual and textu...
Tags: large language models, multimodal-large-language-models, SVG, vlm
Python 3.88 k
2 months ago

MiniMax-AI / MiniMax-01
#Large language models# The official repo of MiniMax-Text-01 and MiniMax-VL-01, a large language model and a vision-language model based on Linear Attention
Tags: large-language-models, large language models, vision-language-model, vlm
Python 2.79 k
5 days ago

SkyworkAI / Skywork-R1V
#Large language models# Skywork-R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning
Tags: deepseek-r1, large language models, reasoning, vlm, grpo, reinforcement-learning
Python 2.62 k
6 days ago

om-ai-lab / OmAgent
#Large language models# Build multimodal language agents for fast prototyping and production
Tags: large-language-models, multimodal-agent, vision-and-language, agent, workflow, chatbot, gpt4, large language models, multimodal, rag, vlm, gpt, gradio, llama, llava, openai, Python, gemini
Python 2.51 k
3 months ago

QiuYannnn / Local-File-Organizer
#Large language models# An AI-powered file management tool that ensures privacy by organizing local texts, images. Using Llama3.2 3B and Llava v1.6 models with the Nexa SDK, it intuitively scans, restructures, and organizes ...
Tags: large language models, on-device-ai, vlm, llama3
Python 2.36 k
8 months ago

BAAI-Agents / Cradle
#Large language models# The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, ...
Tags: ai-agent, ai-agents-framework, computer-control, cradle, gcc, generative-ai, grounding, large-language-models, large language models, lmm, multimodality, vision-language-model, vlm, artificial intelligence
Python 2.11 k
7 months ago

xlang-ai / OSWorld
#Natural language processing# [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Tags: agent, artificial intelligence, benchmark, multimodal, reinforcement-learning, rpa, code-generation, language-model, command-line interface, GUI, natural language processing, large-action-model, large language models, vlm
Python 1.92 k
5 days ago

heshengtao / comfyui_LLM_party
LLM Agent Framework in ComfyUI includes MCP server, Omost, GPT-sovits, ChatTTS, GOT-OCR2.0, and FLUX prompt nodes, access to Feishu, discord, and adapts to all llms with similar openai / aisuite interfaces,...
Tags: comfyui, openai, workflow, agent, dify, macOS, graphrag, gemini, llama, ollama, o1, Linux, flux, gguf, vlm, OCR, mcp
Python 1.73 k
2 days ago

coderonion / awesome-yolo-object-detection
#Data warehouse# 🚀🚀🚀 A collection of some awesome public YOLO object detection series projects and the related object detection datasets.
Tags: yolo, yolov5, tensorrt, object-detection, yolov8, CUDA, large language models, llama, vlm, dataset, deepseek, GUI, mllm, qwen
1.5 k
16 days ago

ThuCCSLab / Awesome-LM-SSP
#Natural language processing# A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
Tags: adversarial-attacks, Awesome Lists, diffusion-models, jailbreak, language-model, large language models, natural language processing, privacy, safety, security, vlm
1.49 k
9 days ago

modelscope / evalscope
#Large language models# A streamlined and customizable framework for efficient large model evaluation and performance benchmarking
Tags: evaluation, large language models, performance, rag, vlm
Python 1.15 k
2 days ago

BAAI-DCAI / Bunny
#Large language models# A family of lightweight multimodal models.
Tags: mllm, ChatGPT, gpt-4, multimodal-large-language-models, vlm, Chinese, english
Python 1.02 k
7 months ago

THUDM / CogAgent
An open-source, end-to-end VLM-based GUI Agent
Tags: gui-agent, computer-use, vlm, agent, glm
Python 969
2 months ago