GitHub Chinese Community

#image-to-text

thiagoalessio / tesseract-ocr-for-php

A wrapper to work with Tesseract OCR inside PHP.

OCR, tesseract, PHP, text-recognition, image-to-text
PHP 2.99 k
4 months ago

lucidrains / CoCa-pytorch

#Computer Science# Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch

artificial intelligence, attention-mechanism, contrastive-learning, deep learning, multimodal, transformers, image-to-text
Python 1.16 k
2 years ago

killkimno / MORT

MORT translator project - Real-time game translator with OCR

OCR, auto-translation, translation, translate, game, game-translation, tesseract-ocr, image-to-text
C# 1.13 k
20 days ago

PaddlePaddle / PaddleMIX

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high per...

aigc, stable-diffusion, clip, image-to-text, text-to-image, controlnet, multimodal, text-to-video, dit, llava, sora, qwen2-vl, minicpm-v
Python 675
2 days ago

Flame-Code-VLM / Flame-Code-VLM

#Front-end Development# Flame is an open-source multimodal AI system designed to translate UI design mockups into high-quality React code. It leverages vision-language modeling, automated data synthesis, and structured train...

code-generation, frontend-development, vision-language-model, artificial intelligence, deep learning, front end, multimodal, Open Source, React, vlm, deepseek, design-to-code, front-end, image-to-text, large language model, Vue.js
Python 537
4 months ago

zapolnoch / node-tesseract-ocr

A Node.js wrapper for the Tesseract OCR API

tesseract, OCR, text-recognition, image-to-text
JavaScript 312
2 years ago

google / imageinwords

Data release for the ImageInWords (IIW) paper.

evaluation, image-captioning, image-to-text, dataset, dataset-generation
JavaScript 217
8 months ago

Yushi-Hu / tifa

TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering

image-to-text, large-language-models, text-to-image, visual-question-answering
Python 169
1 year ago

NormXU / nougat-latex-ocr

Codebase for fine-tuning / evaluating nougat-based image2latex generation models

image-to-text
Python 154
10 months ago

yardstick17 / image_text_reader

The module extracts text from image using the tesseract-OCR engine. Generally, text present in the images are blur or are of uneven sizes. The image is pre-processed for better comprehension by OCR. T...

OCR, image-to-text, tesseract-ocr
Python 147
6 years ago

shoryasethia / markdrop

#Large Language Model# A Python package for converting PDFs to markdown while extracting images and tables, generate descriptive text descriptions for extracted tables/images using several LLM clients. And many more functio...

Open Source, pypi-package, image-to-text, large language model, pdf-to-markdown, pdf-to-text, table-to-text, agents
Python 135
1 month ago

nateshmbhat / card-scanner-flutter

A flutter package for Fast, Accurate and Secure Credit card & Debit card scanning

Flutter, Dart, machine learning, artificial intelligence, credit-card, image processing, image-to-text
Swift 126
6 months ago

NanoNets / ocr-python

OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF.

OCR, tesseract, pdf, Python, pdf-to-json, pdf-to-text, image-to-text
Jupyter Notebook 110
3 years ago

mshdabiola / NotePad

Notepad is multi module Jetpack compose note taking app with sketch pad, voice recorder, image capturing app

Android, Actions, Jetpack Compose, Kotlin, image-to-text, room-persistence-library
Kotlin 109
17 days ago

MIMICLab / L-Verse

#Computer Science# L-Verse: Bidirectional Generation Between Image and Text

deep learning, PyTorch, pytorch-lightning, vq-vae, transformer, image-to-text, text-to-image, image-captioning
Python 108
4 months ago

BEPb / image_to_ascii

Everything is very simple: you either download a picture file or specify its link when running a python script, and output you get a text file, and you can immediately view on the command line how it ...

cmd, image-to-text, conversion, convert, converter, Python, Script
Python 102
2 years ago

MuhametSmaili / note-it

OCR functionality in a feature-rich note-taking extension.

Chrome, Chrome extension, image-to-text, note-taking, OCR, ocr-recognition, React, tiptap
TypeScript 100
8 months ago

untrix / im2latex

#Computer Science# Solution to im2latex request for research of openai

neural network, deep learning, machine vision, machine learning, Tensorflow, generative-model, sequence-to-sequence, encoder-decoder, ocr-recognition, im2latex, image-to-text
Jupyter Notebook 90
1 year ago

farhanchoudhary / PAN_Card_OCR_Project

To extract details from Indian National Identification Cards such as PAN (completed) & Aadhar, Passport, Driving License (WIP) in a structured format

tesseract, pan, OCR, optical-character-recognition, image processing, image-to-text
Python 81
5 years ago

Carleslc / ImageToText

OCR with Google's AI technology (Cloud Vision API)

OCR, optical-character-recognition, image-to-text, artificial intelligence, Google Cloud
Python 75
2 years ago