GitHub Chinese Community (GitHub 中文社区)
#chinese-dataset

brightmart / nlp_chinese_corpus

#Natural language processing# Large Scale Chinese Corpus for NLP

chinese-dataset, chinese-corpus, pretrain, word2vec, natural language processing, bert, language-model, Wikinews, question-answering, Chinese, corpus, chinese-nlp, dataset, text-classification
9.73 k
1 year ago
chaoswork / sft_datasets

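#Data warehouse# A curated collection of open-source SFT datasets, updated on an ongoing basis

datasets, chinese-dataset, large-language-models, large language models
519
2 years ago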
zake7749 / Gossiping-Chinese-Corpus

Chinese question-answering corpus from the PTT Gossiping board

corpus, chinese-corpus, dataset, dialog, chatbot, chinese-dataset, question-answering, chinese-nlp
Jupyter Notebook 243
8 months ago
FunnySaltyFish / Better-Ruozhiba

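#Large language models# [Item-by-item processing complete] A QA dataset of selected Ruozhiba questions, with every entry manually reviewed and revised

chinese-dataset, dataset, large language models
205
2 months ago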
sovaai / sova-dataset

#Data warehouse#

dataset, datasets, Open Data, Open Source, data, audio, chinese-dataset, corpus
128
3 years ago
secsilm / zi-dataset

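#Natural language processing# A dataset of Chinese characters (hanzi) with related information such as stroke count, radical, pinyin, and English definitions/synonyms.

natural language processing, chinese-nlp, chinese-dataset, dataset, hanzi
118
5 years ago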
CLUEbenchmark / QBQTC

QBQTC: a large-scale search matching dataset

semantic-search, semantic-similarity, search, Query (disambiguation), chinese-dataset
Python 82
4 years ago
lvyufeng / SciBERT_CN

Pretrained model for Chinese Scientific Text

bert, albert, pre-trained, pre-trained-model, Tensorflow, PyTorch, chinese-corpus, scientific-papers, chinese-dataset
45
5 years ago
hsinmin / HanSig

#Computer science# A large-scale offline Chinese handwritten signature dataset

dataset, deep learning, metric-learning, signature-recognition, signature-verification, chinese-dataset
14
1 year ago
Eurus-Holmes / CHABCNet

[CHABCNet] ABCNet on the Chinese dataset, building on Detectron2 (Facebook AI Research)

text-detection-recognition, chinese-dataset, detectron2
Python 11
2 years ago
seanpm2001 / AI2001_Category-Linguistics-SC-Chinese-Traditional

🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🔠️🔢️ The linguistic:Chinese-Traditional category for AI2001, containing Chinese (Traditional) language linguistic datasets

artificial intelligence, chinese-dataset, dataset, gpl3, GNU General Public License, Markdown, txt
R 3
2 years ago
seanpm2001 / AI2001_Category-Linguistics-SC-Chinese-Simplified

🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🔠️🔢️ The linguistic:Chinese-Simplified category for AI2001, containing Chinese (Simplified) language linguistic datasets

artificial intelligence, chinese-dataset, chinese-simplified, dataset, gpl3, GNU General Public License, Markdown, txt
R 2
2 years ago
DolbyUUU / Spring-Festival-Gala-Dataset

#Natural language processing# Text Data and Data Analysis of Chinese Spring Festival Gala Comedy Sketches Over 40 Years

chinese-dataset, natural language processing, sentiment-analysis, text-analysis, text-data
Python 1
6 months ago
DolbyUUU / Top-Economics-Journals-Publications-Dataset

#Natural language processing# Top Economics Journals Publications Dataset and Data Analysis: Top 5 English Journals and Top 3 Chinese Journals

chinese-dataset, natural language processing, text-analysis, text-data
Python 1
6 months ago
DolbyUUU / Focus-Report-Dataset

#Natural language processing# Text Data and Data Analysis of Focus Report, a Chinese Investigative TV Program, 2003-2023

chinese-dataset, natural language processing, sentiment-analysis, text-analysis, text-data
Python 1
6 months ago
pha123661 / Taiwan-ELM

#Natural language processing# Code repository for training Taiwan-ELM models, including data preprocessing, tokenizer development, and model fine-tuning.

instruction-tuning, large language models, large-language-models, traditional-chinese, transformer, llama, natural language processing, apache2, chinese-dataset, taiwan, twllm
Jupyter Notebook 0
10 months ago