GitHub Chinese Community
#document-parser

infiniflow / ragflow

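#Large Language Models# RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine built on deep document understanding

document-understanding, large language models, rag, table-structure-recognition, deep learning, document-parser, retrieval-augmented-generation, chatbot, agent, graphrag, ai-search, deepseek, deepseek-r1, ollama, artificial intelligence, agentic-ai, mcp, openai, agentic, deep-research
Python 61.35 k
3 hours ago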
docling-project / docling

Get your documents ready for gen AI

artificial intelligence, convert, documents, pdf, tables, document-parser, document-parsing, docx, HTML, Markdown, pdf-converter, pdf-to-json, pdf-to-text, pptx, xlsx
Python 34.96 k
4 hours ago
Unstructured-IO / unstructured

#Natural Language Processing# Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to...

deep learning, document-parsing, machine learning, natural language processing, OCR, information-retrieval, data-pipelines, preprocessing, pdf-to-text, pdf, pdf-to-json, document-image-analysis, donut, document-image-processing, document-parser, docx, langchain, large language models
HTML 12.14 k
3 days ago
Marker-Inc-Korea / AutoRAG

#Large Language Models# AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

analysis, automl, benchmarking, document-parser, embeddings, evaluation, large language models, llm-evaluation, llm-ops, Open Source, ops, optimization, pipeline, Python, qa, rag, rag-evaluation, retrieval-augmented-generation
Python 4.15 k
1 month ago
run-llama / llama_cloud_services

Knowledge Agents and Management in the Cloud

document, Parsing, pdf, pdf-document-processor, pptx, structured-data, document-parser, document-parsing, docx-to-markdown, pdf-to-excel, pdf-to-json, pdf-to-text, ppt-to-json, tables, ppt-to-markdown, pdf-to-markdown
TypeScript 4.07 k
6 hours ago
freeok / so-novel

Novel downloader | Web fiction downloader | Online novels

content-export, document-parser, ebook, offline-reader, command-line interface, tui, novel
Java 3.66 k
3 days ago
Filimoa / open-parse

Improved file parsing for LLM’s

document-structure, table-detection, document-parser, layout-parsing
Python 3.03 k
9 months ago
deepdoctection / deepdoctection

#Natural Language Processing# A Repo For Document AI

document-parser, document-image-analysis, table-recognition, OCR, document-ai, document-understanding, Python, document-layout-analysis, table-detection, PyTorch, Tensorflow, layoutlm, natural language processing
Python 2.9 k
2 days ago
liweiphys / layra

#Large Language Models# LAYRA, an enterprise-ready, out-of-the-box solution, unlocks next-generation intelligent systems powered by visual RAG and limitless visual multi-step agent workflow orchestration.

agent, document-parser, knowledge-base, gpt-4o, large language models, FastAPI, workflow
TypeScript 780
4 days ago
iamarunbrahma / vision-parse

Parse PDFs into markdown using Vision LLMs

document-parser, pdf-parser, pdf-to-markdown, text-extraction
Python 408
6 months ago
GiftMungmeeprued / document-parsers-list

A comprehensive list of document parsers, covering PDF-to-text conversion and layout extraction. Each tested for support of tables, equations, handwriting, two-column layouts, and multi-column layouts...

data-pipeline, document-image-processing, document-parser, document-parsing, langchain, OCR, pdf, pdf-to-text, preprocessing
118
17 days ago
marieai / marie-ai

Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pipelines (GenAI, LLM, VLLM) into your applications, supporting va...

OCR, optical-character-recognition, Docker, document-layout-analysis, document-parser, Python, PyTorch, table-detection, table-recognition
Python 70
8 days ago
papercast-dev / papercast

#Natural Language Processing# A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GROBID, LangChain, listen as podcast. Customize your own pipelines...

arxiv, Python, dag, natural language processing, pdf-converter, pdf-document-processor, pipeline, document-parser, document-parsing, pdf-to-text, podcast, tts
Python 51
4 months ago
JPLeoRX / opencv-text-deskew

Tutorial on how to deskew (straighten) text images

Python, OpenCV, computer vision, image processing, document-parser, opencv-python, tutorial
Python 51
3 years ago
InvoiceableAI / Invoiceable

The invoice, document, and resume parser powered by AI.

artificial intelligence, document-parser, documents, experimental, invoice, invoices, Python, resume, resume-parser, resumes
Python 36
8 months ago
decisionfacts / semantic-ai

#Large Language Models# An open source framework for Retrieval-Augmented System (RAG) that uses semantic search to retrieve the expected results and generate human-readable conversational responses with the help of LLM (Lar...

approximate-nearest-neighbor-search, deep neural networks, document-parser, docx, FastAPI, inference-api, llama2, large language models, machine learning, OCR, openai, pdf, rag, retrieval-augmented-generation, semantic-search, vector-database, openai-api
Python 21
1 year ago
urbanclap-engg / smart-docs-parser

An OCR based document parser to extract information from identity document images

document-parser, OCR, google-vision, user-onboarding, Node.js, TypeScript
TypeScript 21
3 years ago
graphlit / graphlit

#Natural Language Processing# Graphlit Platform

chatbot, copilot, data, framework, large language models, rag, vector-database, document-parser, information-retrieval, natural language processing, pdf-to-json, pdf-to-text
21
1 year ago
novaladai / novalad

#Natural Language Processing# Novalad offers a unified, centralized platform enabling organizations to extract meaningful data and perform advanced processing at high speed.

artificial intelligence, API, document, genai, layout-parser, machine learning, natural language processing, pdf, pptx, Python, document-parser
Jupyter Notebook 17
17 days ago
brazilian-code / Resume_Parsing

Resume Parsing app to extract information using AI

neural-networks, cnn-keras, OCR, document-parser, computer vision, Streamlit, mask-rcnn, easyocr
Jupyter Notebook 17
4 years ago