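pycorrector is a toolkit for text error correction. It applies KenLM, T5, MacBERT, ChatGLM3, Qwen2.5, and other models to error-correction tasks and works out of the box. It was developed to make it easier to design, compare, and share deep text error correction models.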
#Computer Science# State-of-the-art (ranked #1 Aug 2022) German Speech Recognition in 284 lines of C++. This is a 100% private, 100% offline, 100% free CLI tool.
#Natural Language Processing# Training an n-gram based Language Model using KenLM toolkit for Deep Speech 2
CTC+Beam_Search+kenlm is a decoding system for acoustic models that use Chinese characters as the modeling unit
#Natural Language Processing# Complete instructions for training a Persian spell checker and a language model, based on SymSpell and KenLM respectively, using a Wikipedia dataset.
wav2vec 2.0 recognition pipeline
Optical Character Recognition + Instance Segmentation for Russian and English
Romanian Automatic Speech Recognition from the ROBIN project
#Natural Language Processing# 🎲 KenLM extension for spaCy 2.0.
#Natural Language Processing# A Java JNI wrapper for KenLM: Faster and Smaller Language Model Queries
Real-Time ASR with CNN-BiLSTM: End-to-End Live Streaming Using PyTorch Lightning⚡
#Computer Science# INACTIVE - http://mzl.la/ghe-archive - Generate language models from OSCAR corpora
#Computer Science# Neural Grammatical Error Correction for Romanian using Transformer
#Natural Language Processing# An AI tool that automatically generates captions and transcripts for YouTube videos in 67 languages and summarized text in 133 languages.
Automatic Speech Recognition using Conformer with Speech Sentiment Analysis & Text Summarizer
Scripts to train n-gram language models on Wikipedia articles
#Natural Language Processing# This repo shows how to fine-tune the wav2vec 2.0 model, along with its prerequisites.
Demo of domain corpus bootstrapping using language model perplexity