⚡ From finding text to search and replace, from sorting to beautifying text and more 🎨
Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.
PyMuPDF is a high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF (and other) documents.
Intuitive find & replace CLI (sed alternative)
#Natural Language Processing# fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
#Natural Language Processing# 🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
Python library for creating PEG parsers
#Computer Science# Text Classification Algorithms: A Survey
#Natural Language Processing# Persian NLP Toolkit
#Natural Language Processing# The most accurate natural language detection library for Go, suitable for short text and mixed-language text
Program to convert lines of text into a tree structure.
A fast implementation of Aho-Corasick in Rust.
A fast and convenient fuzzy matcher library for Rust
#Natural Language Processing# Thai natural language processing in Python
A simple Python module for parsing human names into their individual components
#Natural Language Processing# Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashta...
#Natural Language Processing# Natural language detection library for Go
#Natural Language Processing# Open Korean Text Processor - An Open-source Korean Text Processor