florence-2 · GitHub Topics

Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL

captioning fine-tuning florence-2 multimodal objectdetection paligemma phi-3-vision transformers vision-and-language vqa qwen2-vl

Python 2.63 k

3 days ago

jhc13 / taggui

Tag manager and captioner for image datasets

image-captioning pyside6 stable-diffusion llava cogvlm florence-2

Python 1.11 k

4 months ago

D-Ogi / WatermarkRemover-AI

AI-Powered Watermark Remover using Florence-2 and LaMA Models: A Python application leveraging state-of-the-art deep learning models to effectively remove watermarks from images with a user-friendly P...

florence-2 lama-cleaner watermark-remover dataset-creation inpainting

Python 691

1 month ago

autodistill / autodistill-grounded-sam-2

Use Segment Anything 2, grounded with Florence-2, to auto-label data for use in training vision models.

florence-2

Python 128

1 year ago

Ravi-Teja-konda / Surveillance_Video_Summarizer

#Large language models# VLM-driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 Vision-Language Model. Includes a Gradio-based interface for que...

artificial-intelligence ChatGPT florence-2 gpt-4 gradio gradio-python-llm huggingface summarization Video vision-and-language vlm

Python 124

3 months ago

anyantudre / Florence-2-Vision-Language-Model

#Computer science# Florence-2 is a novel vision foundation model with a unified, prompt-based representation for a variety of computer vision and vision-language tasks.

computer-vision deep-learning florence-2 huggingface vision-language vision-language-model vision-transformer vision-transformer-models

Jupyter Notebook 91

1 year ago

Damarcreative / rem-wm

Watermark remover tool that leverages the capabilities of Microsoft Florence and Lama Cleaner models.

florence-2 lama-cleaner watermark

Python 85

8 months ago

retkowsky / florence-2

Florence-2

Azure florence-2

Jupyter Notebook 70

7 months ago

autodistill / autodistill-florence-2

Use Florence 2 to auto-label data for use in training fine-tuned object detection models.

florence-2 object-detection zero-shot-object-detection

Python 67

1 year ago

sayedmohamedscu / Vision-language-models-VLM

Vision-language model fine-tuning notebooks and use cases (MedGemma, PaliGemma, Florence, ...)

colab-notebook computer-vision finetuning multimodal paligemma vlm florence-2 lora Medical imaging qlora

Jupyter Notebook 49

2 months ago

fireicewolf / wd-llm-caption-cli

A Python-based CLI tool for captioning images with WD series, Joy-caption-pre-alpha, Meta Llama 3.2 Vision Instruct, and Qwen2 VL Instruct models.

qwen2-vl florence-2

Python 39

6 months ago

Iteranya / AktivaAI

Local LLM Discord Bot

artificial-intelligence discord-bot florence-2 llama multimodal roleplay chatbot

Python 17

3 months ago

jacobmarks / fiftyone_florence2_plugin

Run SOTA Vision-Language Model Florence-2 on your data!

computer-vision florence-2 machine-learning transformer vision-language-model

Jupyter Notebook 13

6 months ago

mithunparab / text2segment_video

Simple Video Summarization using Text-to-Segment Anything (Florence2 + SAM2). This project provides a video processing tool that utilizes advanced AI models, specifically Florence2 and SAM2, to detect...

florence-2 optical-flow raft segment-anything

Python 10

7 months ago

nguyennpa412 / simple-multimodal-ai

#Large language models# Simple Gradio application integrated with Hugging Face multimodal models to support a visual question answering chatbot and more features

computer-vision Docker gradio text-to-speech visual-question-answering vlm large-language-models mllm florence-2

Python 6

1 year ago

PRITHIVSAKTHIUR / Florence-2-Image-Caption

This application utilizes the powerful Florence-2 vision-language model from Microsoft to generate comprehensive captions for images. The model is capable of understanding visual content and expressin...

florence-2 gradio huggingface image-captioning image-processing pillow timm torch transformers vision-language-model

Python 6

2 months ago

Rm1n90 / Florence2Onnx

ONNX deployment for the Florence-2 visual multimodal model

florence-2 onnx onnxruntime inference

Python 6

7 months ago

sitammeur / TextSnap

TextSnap: Demo for Florence 2 model used in OCR tasks to extract and visualize text from images.

artificial-intelligence florence-2 gradio gradio-interface huggingface-spaces huggingface-transformers optical-character-recognition vision-language-model Python

Python 5

5 months ago

regiellis / ecko-cli

ecko-cli is a simple CLI tool that streamlines processing the images in a directory, generating captions, and saving them as text files. Additionally, it provides functionalities to create ...

artificial-intelligence command-line-interface florence-2 generative-ai huggingface-transformers image-classification image-processing onnxruntime

Python 5

2 months ago

jkawamoto / mcp-florence2

An MCP server for processing images using Florence-2

florence-2 mcp-server Python

Python 4

2 months ago