GitHub Chinese Community

©2025 GitHub Chinese Community

# document-parser

infiniflow / ragflow

#Large language models# RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine built on deep document understanding.

document-understanding, large language models, rag, deep learning, document-parser, retrieval-augmented-generation, agent, graphrag, ai-search, deepseek, deepseek-r1, ollama, artificial intelligence, agentic-ai, mcp, openai, agentic, deep-research, agentic-workflow, multi-agent
TypeScript 64.2 k
1 day ago
docling-project / docling

Get your documents ready for gen AI

artificial intelligence, convert, documents, pdf, tables, document-parser, document-parsing, docx, HTML, Markdown, pdf-converter, pdf-to-json, pdf-to-text, pptx, xlsx
Python 38.56 k
3 days ago
Unstructured-IO / unstructured

#Natural language processing# Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to...

deep learning, document-parsing, machine learning, natural language processing, OCR, information-retrieval, data-pipelines, preprocessing, pdf-to-text, pdf, pdf-to-json, document-image-analysis, donut, document-image-processing, document-parser, docx, langchain, large language models
HTML 12.65 k
4 days ago
Marker-Inc-Korea / AutoRAG

#Large language models# AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

analysis, automl, benchmarking, document-parser, embeddings, evaluation, large language models, llm-evaluation, llm-ops, Open Source, ops, optimization, pipeline, Python, qa, rag, rag-evaluation, retrieval-augmented-generation
Python 4.27 k
13 days ago
run-llama / llama_cloud_services

Knowledge Agents and Management in the Cloud

document, Parsing, pdf, pdf-document-processor, pptx, structured-data, document-parser, document-parsing, docx-to-markdown, pdf-to-excel, pdf-to-json, pdf-to-text, ppt-to-json, tables, ppt-to-markdown, pdf-to-markdown
TypeScript 4.14 k
2 days ago
freeok / so-novel

Novel download | Web fiction download | Web novels

content-export, document-parser, ebook, offline-reader, command-line interface, tui, novel
Java 4.05 k1
4 hours ago
Filimoa / open-parse

Improved file parsing for LLMs

document-structure, table-detection, document-parser, layout-parsing
Python 3.06 k
10 months ago
deepdoctection / deepdoctection

#Natural language processing# A Repo For Document AI

document-parser, document-image-analysis, table-recognition, OCR, document-ai, document-understanding, Python, document-layout-analysis, table-detection, PyTorch, Tensorflow, layoutlm, natural language processing
Python 2.95 k
1 day ago
liweiphys / layra

#Large language models# LAYRA, an enterprise-ready, out-of-the-box solution, unlocks next-generation intelligent systems powered by visual RAG and limitless visual multi-step agent workflow orchestration.

agent, document-parser, knowledge-base, gpt-4o, large language models, FastAPI, workflow
TypeScript 807
1 month ago
NanoNets / docstrange

#Large language models# Extract and convert data from any document, image, PDF, Word doc, PPT, or URL into multiple formats (Markdown, JSON, CSV, HTML) with intelligent structured data extraction and advanced OCR.

large language models, Markdown, OCR, pdf-to-markdown, structured-data, artificial intelligence, document-parser, document-parsing, pdf-parser, pdf-to-json, tables
Python 546
3 days ago
iamarunbrahma / vision-parse

Parse PDFs into markdown using Vision LLMs

document-parser, pdf-parser, pdf-to-markdown, text-extraction
Python 427
8 days ago
GiftMungmeeprued / document-parsers-list

A comprehensive list of document parsers, covering PDF-to-text conversion and layout extraction. Each tested for support of tables, equations, handwriting, two-column layouts, and multi-column layouts...

data-pipeline, document-image-processing, document-parser, document-parsing, langchain, OCR, pdf, pdf-to-text, preprocessing
149
2 months ago
marieai / marie-ai

Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pipelines (GenAI, LLM, VLLM) into your applications, supporting va...

OCR, optical-character-recognition, Docker, document-layout-analysis, document-parser, Python, PyTorch, table-detection, table-recognition
Python 73
2 days ago
papercast-dev / papercast

#Natural language processing# A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GROBID, LangChain, listen as podcast. Customize your own pipelines...

arxiv, Python, dag, natural language processing, pdf-converter, pdf-document-processor, pipeline, document-parser, document-parsing, pdf-to-text, podcast, tts
Python 52
6 months ago
JPLeoRX / opencv-text-deskew

Tutorial on how to deskew (straighten) text images

Python, OpenCV, computer vision, image processing, document-parser, opencv-python, tutorial
Python 52
3 years ago
InvoiceableAI / Invoiceable

The invoice, document, and resume parser powered by AI.

artificial intelligence, document-parser, documents, experimental, invoice, invoices, Python, resume, resume-parser, resumes
Python 38
10 months ago
LianjiaTech / bella-domify

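Document Parser supporting PDF, TXT, DOC, DOCX, Markdown, and other file formats. It efficiently extracts and parses content and generates a standard document tree structure. The built-in PDF Parser, Text Parser, and Word Parser support intelligent applications such as RAG, knowledge bases, and full-text search.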

document-parser, pdf-parser, Parser
Python 36
2 days ago
graphlit / graphlit

#Natural language processing# Graphlit Platform

chatbot, copilot, data, framework, large language models, rag, vector-database, document-parser, information-retrieval, natural language processing, pdf-to-json, pdf-to-text
22
2 years ago
decisionfacts / semantic-ai

#Large language models# An open-source framework for Retrieval-Augmented System (RAG) that uses semantic search to retrieve the expected results and generates human-readable conversational responses with the help of an LLM (Lar...

approximate-nearest-neighbor-search, deep neural networks, document-parser, docx, FastAPI, inference-api, llama2, large language models, machine learning, OCR, openai, pdf, rag, retrieval-augmented-generation, semantic-search, vector-database, openai-api
Python 21
1 year ago
urbanclap-engg / smart-docs-parser

An OCR-based document parser to extract information from identity document images

document-parser, OCR, google-vision, user-onboarding, Node.js, TypeScript
TypeScript 21
3 years ago