Document chatbot — multiple files, topics, chat windows and chat history. Powered by GPT.
#Natural Language Processing# Library supporting NLP and CV research on scientific papers
Text extraction from multiple and large PDF documents.
The Privacy Firewall for LLMs
#Computer Science# A boilerplate solution for processing image and PDF documents for regulated industries, with lineage and pipeline operations metadata services.
Official Python client library for Nutrient Document Web Services API - PDF processing, OCR, watermarking, and document manipulation with automatic Office format conversion
Python scripts that convert PDF files to text, split them into chunks, and store their vector representations using GPT4All embeddings in a Chroma DB. It also provides a script to query the Chroma ...
An npm package built on top of pdf-lib that provides functionality such as merge, rotate, split, download PDF to disk, and many more...
#Natural Language Processing# LangGraphRAG: A terminal-based Retrieval-Augmented Generation system using LangGraph. Features include message history caching, query transformation, and vector database retrieval. Ideal for NLP resea...
Built with the pdf-actions npm package.
An all-in-one GUI management toolkit built with PyQt6, offering a suite of tools for file synchronization, media organization, PDF merging, code formatting, and more.
AI-powered RAG-based tool for summarizing, extracting insights, and answering questions about research papers with high accuracy
Polymind is a powerful multi-modal Telegram bot built with Gemini, DeepSeek, OpenRouter, and over 50 cutting-edge AI models. It offers seamless conversational intelligence, Mermaid diagram rendering, ...
#Natural Language Processing# The Document Summarizer leverages Hugging Face’s facebook/bart-large-cnn model to transform lengthy documents into concise summaries. Built with ReactJS (Vite) for the frontend and Flask for the backe...
#Natural Language Processing# 📚 AI-Powered Book PDF Knowledge Extractor & Summarizer. Transform your PDF books into structured knowledge effortlessly! This tool leverages AI to analyze books page by page, extracting key insights, ...
A Streamlit-based app for asking questions directly from uploaded documents using Gemini embeddings and a language model. Supports PDF, TXT, and DOCX files. Fast, simple, and powerful document-based Q...
A side project for easily collecting and annotating questions and answers for the PsychometryBot project DB using computer vision and PDF parsing
Some useful mini projects that I worked on while teaching myself Python programming.
PdfSnipper is a lightweight and efficient Python package designed to simplify the management of PDF files, pages, and their conversions during various NLP, Computer Vision (CV), or other data processi...
Azure Document Intelligence Result Processor: A toolset for annotating PDFs based on Azure Document Intelligence analysis results, featuring a React web application and a standalone Python script for ...