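#Search# PaddleNLP 2.0 is the core text-domain library of the PaddlePaddle ecosystem. It offers three main features: easy-to-use text-domain APIs, application examples for many scenarios, and high-performance distributed training. It aims to improve developers' efficiency in the text domain and provides best practices for NLP tasks built on the PaddlePaddle 2.0 core framework.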
Document intelligence framework for Python - Extract text, metadata, and structured data from PDFs, images, Office documents, and more. Built on Pandoc, PDFium, and Tesseract.
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
#Natural Language Processing# A curated list of resources on the topic of Document Understanding (DU)
#Natural Language Processing# ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
#Natural Language Processing# ContextGem: Effortless LLM extraction from documents
#Large Language Models# AI-in-a-Box leverages the expertise of Microsoft across the globe to develop and provide AI and ML solutions to the technical community. Our intent is to present a curated collection of solution acce...
#Natural Language Processing# ReadingBank: A Benchmark Dataset for Reading Order Detection
A collection of samples demonstrating techniques for processing documents with Azure AI including AI Foundry, OpenAI, Document Intelligence, etc.
The Doc Intelligence in-a-Box project leverages Azure AI Document Intelligence to extract data from PDF forms and store the data in an Azure Cosmos DB. This solution, part of the AI-in-a-Box framework ...
This sample demonstrates how to use Document Intelligence's Layout model to convert a PDF document, such as an invoice, into Markdown, then use GPT-3.5 Turbo to extract structured JSON data using the Az...
#Computer Science# BoundaryNet - A Semi-Automatic Layout Annotation Tool
A curated list of resources on Table Structure Recognition
An experiment that provides the template-training capabilities of Azure AI Document Intelligence Studio for use in a feedback loop
A curated list of resources on Document Layout Analysis
An app that extracts structured data from document photos or PDFs via custom templating and commercial LLM services (GPT and Azure Document Intelligence). Developed as a Computer Science Thesis at University o...
Using Azure Document Intelligence and Azure OpenAI services to automatically extract data from invoices.
A collection of solutions that leverage Azure AI services.