GitHub Chinese Community
sentencepiece

niedev / RTranslator

#Android# Open-source real-time translation app for Android that runs locally

translator, bluetooth-le, realtime-translator, Android, onnx, onnxruntime, sentencepiece, transformers, translation, nllb, Whisper, mobile-app, offline
C++ 8.14 k
3 days ago
OpenNMT / Tokenizer

#Natural Language Processing# Fast and customizable text tokenization library with BPE and SentencePiece support

Parsing, sentencepiece, Natural Language Processing, machine-translation, bpe, unicode, tokenization, icu, Python, C++
C++ 308
2 months ago
himkt / konoha

#Natural Language Processing# 🌿 An easy-to-use Japanese text processing tool that makes it possible to switch tokenizers with small code changes.

Natural Language Processing, text-processing, sentencepiece, japanese
Python 249
2 months ago
taishan1994 / sentencepiece_chinese_bpe

Trains a Chinese vocabulary with the BPE algorithm in sentencepiece and uses it in transformers.

sentencepiece, tokenization
Python 118
2 years ago
lingvanex-mt / models

#Natural Language Processing# Free and open-source pre-trained translation models, including Kurdish, Samoan, Xhosa, Lao, Corsican, Cebuano, Galician, Yiddish, Swahili, Russian, Belarusian and Yoruba.

ctranslate2, machine-translation, multilingual, neural-networks, Natural Language Processing, sentencepiece, swahili, yoruba, translate, translation, translator
66
3 months ago
dhpollack / huggingface_libtorch

#Natural Language Processing# Minimal example of using a traced huggingface transformers model with libtorch

PyTorch, libtorch, Natural Language Processing, C++, sentencepiece, albert
C++ 35
5 years ago
nguyenvulebinh / vietnamese-roberta

#Natural Language Processing# A Robustly Optimized BERT Pretraining Approach for Vietnamese

vietnamese, pretrained-models, Natural Language Processing, roberta, bert, PyTorch, fairseq, sentencepiece, vietnamese-nlp, transformer
Python 32
1 year ago
eliben / go-sentencepiece

#Large Language Models# Go implementation of the SentencePiece tokenizer

encoding, Go, language-model, Large Language Models, tokenization, sentencepiece
Go 29
9 months ago
bnosac / sentencepiece

#Natural Language Processing# R package for Byte Pair Encoding / Unigram modelling based on SentencePiece

sentencepiece, byte, word-segmentation, Natural Language Processing
C++ 25
3 years ago
Systemcluster / kitoken

#Natural Language Processing# Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and WordPiece tokenization in JavaScript, Python and Rust.

bpe, Natural Language Processing, sentencepiece, Parsing, unigram, word-segmentation, Node.js, Python, Rust, Web
Rust 25
3 months ago
danieldk / sentencepiece

Rust binding for the sentencepiece library

sentencepiece, Rust
Rust 22
2 months ago
Andras7 / gpt2-pytorch

Extremely simple and understandable GPT2 implementation with minor tweaks

PyTorch, gpt2, sentencepiece, transformers
Python 21
6 years ago
stephantul / piecelearn

Learning BPE embeddings by first learning a segmentation model and then training word2vec

bpe, sentencepiece, embeddings, word2vec
Python 19
2 years ago
sctg-development / sentencepiece-js

sentencepiece port to WebAssembly with browser compatibility

Artificial Intelligence, sentencepiece, Parsing
TypeScript 13
8 months ago
to-aoki / my-pytorch-bert

#Natural Language Processing# BERT implementation in PyTorch

Natural Language Processing, bert, PyTorch, japanese-language, sentencepiece, albert
Python 11
5 years ago
jkrukowski / swift-sentencepiece

Use SentencePiece in Swift for tokenization and detokenization.

sentencepiece, tokenization
Swift 10
4 months ago
Masao-Taketani / japanese_text_classification

#Natural Language Processing# Investigates various DNN text classifiers, including MLP, CNN, RNN and BERT approaches.

text-recognition, Natural Language Processing, text-classification, sentencepiece, Deep Learning, japanese
Jupyter Notebook 9
5 years ago
wang1ang / SentencePieceWrapper

sentencepiece C# wrapper

sentencepiece, C#, wrapper
C++ 5
6 years ago
leliuga / datrin

dataset, train, inference

dataset, flax, inference, jax, safetensors, sentencepiece
Python 4
1 year ago
smafjal / bengali_tokenizer

Bengali language tokenizer (SentencePiece)

sentencepiece, bengali, Parsing, unsupervised-learning
Python 4
6 years ago