GitHub Chinese Community

document-parser

infiniflow / ragflow

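#Natural Language Processing# RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine built on deep document understanding

document-understanding, large language model, rag, table-structure-recognition, deep learning, document-parser, natural language processing, pdf-to-text, retrieval-augmented-generation, chatbot, agent, agents, graphrag, text2sql, ai-search, ChatGPT, deepseek, deepseek-r1, ollama, artificial intelligence
Python 55.11 k
2 days ago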
docling-project / docling

Get your documents ready for gen AI

artificial intelligence, convert, documents, pdf, tables, document-parser, document-parsing, docx, HTML, Markdown, pdf-converter, pdf-to-json, pdf-to-text, pptx, xlsx
Python 31.83 k
2 days ago
Unstructured-IO / unstructured

#Natural Language Processing# Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to...

deep learning, document-parsing, machine learning, natural language processing, OCR, information-retrieval, data-pipelines, preprocessing, pdf-to-text, pdf, pdf-to-json, document-image-analysis, donut, document-image-processing, document-parser, docx, langchain, large language model
HTML 11.49 k
2 days ago
Marker-Inc-Korea / AutoRAG

#Large Language Model# AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

analysis, automl, benchmarking, document-parser, embeddings, evaluation, large language model, llm-evaluation, llm-ops, Open Source, ops, optimization, pipeline, Python, qa, rag, rag-evaluation, retrieval-augmented-generation
Python 4.03 k
1 month ago
run-llama / llama_cloud_services

Knowledge Agents and Management in the Cloud

document, Parsing, pdf, pdf-document-processor, pptx, structured-data, document-parser, document-parsing, docx-to-markdown, pdf-to-excel, pdf-to-json, pdf-to-text, ppt-to-json, tables, ppt-to-markdown, pdf-to-markdown
Python 4.01 k
5 days ago
Filimoa / open-parse

Improved file parsing for LLMs

document-structure, table-detection, document-parser, layout-parsing
Python 2.99 k
7 months ago
deepdoctection / deepdoctection

#Natural Language Processing# A Repo For Document AI

document-parser, document-image-analysis, table-recognition, OCR, document-ai, document-understanding, Python, document-layout-analysis, table-detection, PyTorch, Tensorflow, layoutlm, natural language processing
Python 2.85 k
4 days ago
liweiphys / layra

#Large Language Model# LAYRA unlocks next-generation intelligent systems, powered by vision-driven RAG and multi-step agent orchestration, with no limits, no compromises.

agent, document-parser, knowledge-base, gpt-4o, large language model, FastAPI, workflow
TypeScript 719
4 days ago
iamarunbrahma / vision-parse

Parse PDFs into markdown using Vision LLMs

document-parser, pdf-parser, pdf-to-markdown, text-extraction
Python 386
4 months ago
marieai / marie-ai

Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pipelines (GenAI, LLM, VLLM) into your applications, supporting va...

OCR, optical-character-recognition, Docker, document-layout-analysis, document-parser, Python, PyTorch, table-detection, table-recognition
Python 70
4 days ago
JPLeoRX / opencv-text-deskew

Tutorial on how to deskew (straighten) text images

Python, OpenCV, computer vision, image processing, document-parser, opencv-python, tutorial
Python 51
3 years ago
papercast-dev / papercast

#Natural Language Processing# A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GROBID, LangChain, listen as podcast. Customize your own pipelines...

arxiv, Python, dag, natural language processing, pdf-converter, pdf-document-processor, pipeline, document-parser, document-parsing, pdf-to-text, podcast, tts
Python 50
3 months ago
InvoiceableAI / Invoiceable

The invoice, document, and resume parser powered by AI.

artificial intelligence, document-parser, documents, experimental, invoice, invoices, Python, resume, resume-parser, resumes
Python 35
7 months ago
urbanclap-engg / smart-docs-parser

An OCR-based document parser to extract information from identity document images

document-parser, OCR, google-vision, user-onboarding, Node.js, TypeScript
TypeScript 21
3 years ago
decisionfacts / semantic-ai

#Large Language Model# An open-source framework for Retrieval-Augmented Systems (RAG) that uses semantic search to retrieve the expected results and generate human-readable conversational responses with the help of LLM (Lar...

approximate-nearest-neighbor-search, deep neural networks, document-parser, docx, FastAPI, inference-api, llama2, large language model, machine learning, OCR, openai, pdf, rag, retrieval-augmented-generation, semantic-search, vector-database, openai-api
Python 20
1 year ago
graphlit / graphlit

#Natural Language Processing# Graphlit Platform

chatbot, copilot, data, framework, large language model, rag, vector-database, document-parser, information-retrieval, natural language processing, pdf-to-json, pdf-to-text
19
1 year ago
brazilian-code / Resume_Parsing

Resume Parsing app to extract information using AI

neural-networks, cnn-keras, OCR, document-parser, computer vision, Streamlit, mask-rcnn, easyocr
Jupyter Notebook 17
3 years ago
novaladai / novalad

#Natural Language Processing# Novalad offers a unified, centralized platform enabling organizations to extract meaningful data and perform advanced processing at high speed.

artificial intelligence, API, document, genai, layout-parser, machine learning, natural language processing, pdf, pptx, Python, document-parser
Jupyter Notebook 15
1 month ago
decisionfacts / df-extract

DF Extract Lib

document-parser, pptx, Python, asyncio, docx, extraction, jpeg, jpg, pdf, png
Python 14
1 year ago
graphlit / graphlit-client-python

Python client library for Graphlit Platform

API, chatbot, copilot, document-parser, pdf-to-json, rag, agents, artificial intelligence, ai-agents, large language model
Python 13
7 days ago