#Computer Science#A collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like YOLO11, RT-DETR, SAM 2, ...
Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL.
A collection of guides and examples for the Gemma open models from Google.
#LLM#MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
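A minimal sketch of what local inference with MLX-VLM can look like, assuming the `load`/`generate` helpers shown in the package's examples; the model ID, prompt, and argument names are illustrative assumptions, not a guaranteed API:

```python
# Minimal MLX-VLM inference sketch. The checkpoint name and the
# prompt/image keyword arguments are assumptions; check the package's
# current examples, as the generate() signature has changed across versions.
from mlx_vlm import load, generate

model, processor = load("mlx-community/paligemma-3b-mix-448-8bit")  # assumed checkpoint

output = generate(
    model,
    processor,
    prompt="caption en",   # PaliGemma-style task prompt
    image="example.jpg",   # path to a local test image
    max_tokens=100,
    verbose=True,
)
print(output)
```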
Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detection and segmentation.
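For reference, prompting PaliGemma for detection with Hugging Face transformers typically looks like the sketch below; the checkpoint name and prompt wording follow published PaliGemma usage but should be treated as assumptions to verify:

```python
# Sketch of zero-shot object detection with PaliGemma via transformers.
# "detect <class> ; <class>" is the documented detection task prompt;
# the checkpoint and image path are placeholder assumptions.
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-mix-448"
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

image = Image.open("street.jpg")      # any local test image
prompt = "detect car ; person"        # detection task prompt

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=64)

# The generated text encodes bounding boxes as <locYYYY> tokens.
result = processor.decode(generated[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(result)
```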
Example code demonstrating fine-tuning of multimodal LLMs with LLaMA-Factory.
Vision-language model fine-tuning notebooks and use cases (PaliGemma, Florence, ...)
Use PaliGemma to auto-label data for use in training fine-tuned vision models.
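A sketch of how detection output could be turned into auto-labels: PaliGemma encodes each box as four `<locYYYY>` tokens (y1, x1, y2, x2 binned to 0-1023) followed by the class name. The parsing helper below is a hypothetical illustration, not part of any library:

```python
# Hypothetical helper that converts PaliGemma detection strings into
# pixel-space boxes suitable for auto-labeling. The <locYYYY> format
# (four tokens per box, 0-1023 bins, y1 x1 y2 x2 order) follows the
# PaliGemma documentation; the regex and output schema are illustrative.
import re

LOC_PATTERN = re.compile(r"((?:<loc\d{4}>){4})\s*([^;]+)")

def parse_detections(text: str, width: int, height: int):
    """Convert '<loc....> label ; <loc....> label' output to pixel boxes."""
    boxes = []
    for loc_block, label in LOC_PATTERN.findall(text):
        bins = [int(v) for v in re.findall(r"<loc(\d{4})>", loc_block)]
        y1, x1, y2, x2 = [b / 1023.0 for b in bins]
        boxes.append({
            "label": label.strip(),
            "box": [x1 * width, y1 * height, x2 * width, y2 * height],
        })
    return boxes

# Example with made-up model output for a 640x480 image.
sample = "<loc0102><loc0205><loc0768><loc0900> car ; <loc0010><loc0020><loc0400><loc0300> person"
print(parse_detections(sample, width=640, height=480))
```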
This project demonstrates how to fine-tune the PaliGemma model for image captioning. PaliGemma, developed by Google Research, is designed to process images and generate corresponding captions.
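A minimal fine-tuning sketch with transformers, assuming a dataset of image/caption pairs; the `suffix=` argument that turns captions into training labels follows the published PaliGemma fine-tuning examples, while the batch format and hyperparameters here are placeholders:

```python
# Minimal captioning fine-tuning sketch for PaliGemma.
# Assumes each batch is a list of {"image": PIL.Image, "caption": str};
# checkpoint, learning rate, and prompt prefix are placeholder choices.
import torch
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-pt-224"
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def training_step(batch):
    inputs = processor(
        text=["caption en"] * len(batch),           # task prefix
        images=[ex["image"] for ex in batch],
        suffix=[ex["caption"] for ex in batch],     # captions become labels
        return_tensors="pt",
        padding="longest",
    ).to(model.device)

    loss = model(**inputs).loss    # LM loss over the caption tokens
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```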
#Computer Science#Minimalist implementation of PaliGemma 2 & the PaliGemma VLM from scratch
#LLM#PaliGemma Inference and Fine-Tuning
Segmentation of water in satellite images using PaliGemma
Rust implementation of Google's PaliGemma with Candle
PaliGemma Fine-Tuning
Notes for the Vision Language Model implementation by Umar Jamil
#Computer Science#PyTorch implementation of PaliGemma 2
#NLP#This repository contains code for fine-tuning Google's PaliGemma vision-language model on the Flickr8k dataset for image captioning tasks
AI-powered tool that converts text in images into your desired language, using a Gemma vision model and a multilingual model.
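One way such a pipeline could be wired, shown as an illustrative assumption rather than the project's actual code: PaliGemma's documented `ocr` task prompt extracts the text, and a Gemma instruction-tuned model translates it:

```python
# Hypothetical two-stage sketch: OCR with PaliGemma, then translation
# with a Gemma instruct model. Checkpoints, image path, and the target
# language are placeholder assumptions.
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration, pipeline

vlm_id = "google/paligemma-3b-mix-448"
processor = AutoProcessor.from_pretrained(vlm_id)
vlm = PaliGemmaForConditionalGeneration.from_pretrained(
    vlm_id, torch_dtype=torch.bfloat16, device_map="auto"
)

image = Image.open("sign.jpg")
inputs = processor(text="ocr", images=image, return_tensors="pt").to(vlm.device)
ids = vlm.generate(**inputs, max_new_tokens=128)
extracted = processor.decode(ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

translator = pipeline("text-generation", model="google/gemma-2-2b-it", torch_dtype=torch.bfloat16)
prompt = f"Translate the following text to French:\n\n{extracted}"   # example target language
print(translator(prompt, max_new_tokens=128)[0]["generated_text"])
```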