GitHub Chinese Community

bpe

rsennrich / subword-nmt

Unsupervised Word Segmentation for Neural Machine Translation and Text Generation

neural-machine-translation, segmentation, machine-translation, nmt, subword-units, bpe
Python 2.24k
10 months ago

VKCOM / YouTokenToMe

#Natural Language Processing# Unsupervised text tokenizer focused on computational efficiency

natural language processing, word-segmentation, bpe, tokenization
C++ 968
1 year ago

niieani / gpt-tokenizer

#Computer Science# The fastest JavaScript BPE Tokenizer Encoder Decoder for OpenAI's GPT models (o1, o3, o4, gpt-4o, gpt-4, etc.). Port of OpenAI's tiktoken with additional features.

bpe, gpt-2, gpt-3, machine learning, gpt-4, Parsing, decoder, encoder, openai, gpt-4o
TypeScript 573
8 days ago

zurawiki / tiktoken-rs

Ready-made tokenizer library for working with GPT and tiktoken

bpe, openai, Rust, Parsing
Rust 311
1 month ago

OpenNMT / Tokenizer

#Natural Language Processing# Fast and customizable text tokenization library with BPE and SentencePiece support

Parsing, sentencepiece, natural language processing, machine-translation, bpe, unicode, tokenization, icu, Python, C++
C++ 308
2 months ago

Kyubyong / nlp_made_easy

#Natural Language Processing# Explains NLP building blocks in a simple manner.

natural language processing, seq2seq, beam-search, bpe
Jupyter Notebook 251
6 years ago

soaxelbrooke / python-bpe

#Natural Language Processing# Byte Pair Encoding for Python!

natural language processing, bpe, Python
Python 230
3 years ago

akretion / nfelib

nfelib - Python bindings for reading and managing XML for NF-e, national NFS-e, CT-e, MDF-e, BP-e

brasil, Python, bpe
Python 159
2 months ago

gautierdag / bpeasy

Fast bare-bones BPE for modern tokenizer training

bpe, tokenization, Parsing
Python 155
2 months ago

samber / go-gpt-3-encoder

Go BPE tokenizer (Encoder+Decoder) for GPT2 and GPT3

bpe, codex, decoder, encoder, Go, gpt-2, gpt-3, openai, token, Parsing, transformer
Go 79
6 months ago

faizann24 / phishytics-machine-learning-for-phishing

#Computer Science# Machine Learning for Phishing Website Detection

machine learning, Cybersecurity, bpe, phishing, artificial intelligence, random-forest, security, data science
HTML 55
5 years ago

jiesutd / SubwordEncoding-CWS

Subword Encoding in Lattice LSTM for Chinese Word Segmentation

segmentation, bpe
Python 53
6 years ago

zouharvi / tokenization-scorer

Simple-to-use scoring function for arbitrarily tokenized texts.

bpe, segmentation, tokenization
Python 40
4 months ago

aespinilla / GPT3-Tokenizer

#Large Language Models# GPT3 encoder & decoder tool written in Swift

openai, Apple, bpe, ChatGPT, ChatGPT API, chatgpt3, decoder, encoder, encoder-decoder, gpt3, openai-api, Swift, swift-package-manager, Parsing
Swift 32
2 years ago

aallam / ktoken

Kotlin multiplatform BPE tokenizer library for OpenAI models

bpe, gpt, Kotlin, openai, Parsing
Kotlin 32
5 months ago

OctopusMind / BBPE

Low-level implementation of BBPE

bpe, llama3, qwen, Parsing
Python 26
1 year ago

Systemcluster / kitoken

#Natural Language Processing# Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and WordPiece tokenization in JavaScript, Python and Rust.

bpe, natural language processing, sentencepiece, Parsing, unigram, word-segmentation, Node.js, Python, Rust, Web
Rust 25
3 months ago

ankane / youtokentome-ruby

High performance unsupervised text tokenization for Ruby

word-segmentation, tokenization, bpe, unsupervised-learning
Ruby 21
1 year ago

hupe1980 / go-tiktoken

✂️ OpenAI's tiktoken tokenizer written in Go

bpe, Go, gpt2, openai, Parsing
Go 19
5 months ago

stephantul / piecelearn

Learning BPE embeddings by first learning a segmentation model and then training word2vec

bpe, sentencepiece, embeddings, word2vec
Python 19
2 years ago