tesseract · GitHub Topics

tesseract-ocr / tesseract

#Computer Science#OCR text recognition engine.

tesseract tesseract-ocr OCR lstm machine-learning ocr-engine Hacktoberfest

C++ 69.6 k

1 month ago

naptha / tesseract.js

#Computer Science#Pure JavaScript OCR (text recognition) that can recognize more than 100 languages

tesseract WebAssembly OCR JavaScript deep-learning

JavaScript 37.23 k

17 days ago

ocrmypdf / OCRmyPDF

Adds an OCR text layer to scanned PDF files so the text can be searched and copied

Python OCR pdf image-processing tesseract

Python 31.17 k

2 days ago

pymupdf / PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

mupdf xps pdf-documents epub OCR pdf fonts Python data-science extract-data table-extraction tesseract text-processing text-shaping

Python 8.03 k

4 days ago

tesseract-ocr / tessdata

Trained models with fast variant of the "best" LSTM models + legacy models

OCR tesseract

7.13 k

2 years ago

aisingapore / TagUI

#Natural Language Processing#Free RPA tool by AI Singapore

rpa artificial-intelligence OpenCV tesseract natural-language-processing

JavaScript 6.08 k

6 months ago

tebelorg / RPA-Python

Python package for doing RPA

rpa Python OpenCV tesseract tagui sikuli cross-platform

Python 5.33 k

3 days ago

thiagoalessio / tesseract-ocr-for-php

A wrapper to work with Tesseract OCR inside PHP.

OCR tesseract PHP text-recognition image-to-text

PHP 3.01 k

6 months ago

otiai10 / gosseract

Go package for OCR (Optical Character Recognition), by using Tesseract C++ library

Go tesseract tesseract-ocr OCR

Go 2.96 k

6 months ago

Dicklesworthstone / llm_aided_ocr

#Large Language Models#Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

ai-assist llama2 large-language-models OCR tesseract ocr-correction

Python 2.75 k

7 months ago

Goldziher / kreuzberg

Document intelligence framework for Python - Extract text, metadata, and structured data from PDFs, images, Office documents, and more. Built on Pandoc, PDFium, and Tesseract.

OCR text-extraction async document-intelligence mcp pandoc Python rag table-extraction tesseract

Python 2.35 k

7 hours ago

rmtheis / android-ocr

#Android#Experimental optical character recognition app

Android OCR tesseract optical-character-recognition

Java 2.24 k

7 years ago

sirfz / tesserocr

A Python wrapper for the tesseract-ocr API

optical-character-recognition OCR tesseract python-library cython

Python 2.12 k

1 month ago

Pulover / PuloversMacroCreator

Automation Utility - Recorder & Script Generator

autohotkey tesseract automation rpa robot

AutoHotkey 1.86 k

3 years ago

ianzhao / textshot

Python tool for grabbing text via screenshot

Python Script python-script screenshot OCR ocr-recognition tesseract tesseract-ocr

Python 1.77 k

9 months ago

Akylas / OSS-DocumentScanner

#Android#Document scanning app

Android document document-scanner OpenCV pdf scanner tesseract image-processing

C++ 1.46 k

1 day ago

ryfeus / lambda-packs

Precompiled packages for AWS Lambda

aws-lambda phantomjs Serverless Amazon Web Services Tensorflow Keras NumPy pandas tesseract scikit-learn OpenCV pillow Selenium Python lightgbm spaCy hdf PyTorch

Python 1.12 k

2 years ago

GauravSingh9356 / J.A.R.V.I.S

#Large Language Models#Personal assistant built using Python libraries. It does almost anything, including sending emails, Optical Text Recognition, Dynamic News Reporting at any time with API integration, Todo list gen...