GitHub Chinese Community

#pdf-parsing

py-pdf / pypdf

A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files

pypdf2, pdf, Python, pdf-parser, pdf-parsing, pdf-manipulation, pdf-documents, help-wanted
Python 9.15 k
6 days ago
jsvine / pdfplumber

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

pdf, pdf-parsing, table-extraction
Python 7.84 k
1 month ago
galkahana / HummusJS

Node.js module for high performance creation, modification and parsing of PDF files and streams

pdf-generation, pdf-parsing, Node.js, pdf-manipulation
C 1.16 k
4 months ago
adithya-s-k / marker-api

Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy.

FastAPI, pdf-converter, pdf-files, pdf-parser, pdf-parsing, API, REST API
Python 865
8 months ago
drmingler / docling-api

Easily deployable and scalable backend server that efficiently converts various document formats (pdf, docx, pptx, html, images, etc) into Markdown. With support for both CPU and GPU processing, it is...

API, FastAPI, markdown-parser, pdf-conversion, pdf-converter, pdf-parser, pdf-parsing, pdf-to-markdown
Python 617
3 months ago
jstockwin / py-pdf-parser

A Python tool to help extract information from structured PDFs.

pdf, Parsing, pdf-parsing
Python 404
13 days ago
chunyenHuang / hummusRecipe

A powerful PDF tool for NodeJS based on HummusJS.

pdf, pdf-files, pdf-generation, pdf-parsing, pdf-manipulation, Node.js
JavaScript 346
2 years ago
thoqbk / traprange

(Java) A Method to Extract Tabular Content from PDF Files

Java, pdf, pdfbox, Parser, pdf-parsing, pdf-manipulation, pdf-files
HTML 335
2 years ago
ck-unifr / pdf_parsing

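#Large language models# PDF parsing (text, sections, tables, images, references), plus PDF question answering, summarization, and information extraction built on large models (ChatGLM2-6B, RWKV) + langchain + streamlit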

langchain, large language models, pdf, pdf-parsing, rwkv, Python, chatglm2-6b, information-extraction, chatpdf, Streamlit
Python 197
2 years ago
ScientaNL / pdf-extractor

Node.js module for rendering pdf pages to images, svgs, html files, text files and json metadata

pdf-parsing, Node.js, image-generation
JavaScript 100
2 years ago
iamarunbrahma / pdf-to-markdown

Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced info...

document-conversion, document-processing, information-retrieval, pdf-parsing, pdf-to-markdown, Python, rag, retrieval-augmented-generation, text-extraction, pdf-converter
Python 83
7 months ago
rostrovsky / pdf-table

Java utility for parsing PDF tabular data using Apache PDFBox and OpenCV

OpenCV, opencv3, pdfbox, tables, table, Java, java-library, pdf-parsing
Java 72
2 years ago
hellpanderrr / linkedin-pdf-parsing

Parsing resumes in PDF format from LinkedIn

linkedin, Python, pdf-parsing, resume-parser
Python 68
9 years ago
tuffstuff9 / nextjs-pdf-parser

Next.js template for seamless PDF parsing using pdf2json and FilePond. Ideal for developers seeking a ready-to-use solution for PDF content extraction in Next.js projects.

content-extraction, filepond, Next, pdf-parser, pdf-parsing
TypeScript 61
2 years ago
dipietrantonio / pdf4py

A PDF parser written in Python 3 with no external dependencies.

pdf, Parser, pdf-parsing, Python, information-extraction
Python 57
5 years ago
abdullahshafiq-20 / ResumeTex

ResumeTex is an AI-powered tool that converts standard PDF resumes into professionally formatted LaTeX documents. This service helps you create elegant, structured resumes without needing to learn LaT...

automation, developer-tools, document-processing, Express, LaTeX, Node.js, Open Source, pdf-parsing, React, resume, Tailwind CSS, TeX
JavaScript 31
20 days ago
DQ-Zhang / refchaser

Written in python, for checking reference lists in systematic reviews and literature reviews, helps with reference list searching both backward&forward by extracting references and creating search que...

research-paper, text-mining, pdf-parsing
Python 23
5 years ago
adrienjoly / npm-pdfreader-example

Example of use of pdfreader: parse a PDF résumé

pdf-parsing, Example
JavaScript 16
3 years ago
malice-plugins / pdf

Malice PDF Plugin

malice, Malware, pdf, plugin, pdf-parsing, Docker, malware-analysis
Python 16
6 years ago
IQDM / IQDM-PDF

A collection of PDF data mining scripts for various IMRT QA vendors

pdf-parsing, datamining, qa
Python 12
4 years ago