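#NLP# MNBVC (Massive Never-ending BT Vast Chinese corpus) is a very large-scale Chinese corpus. It aims to match the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, down to "Martian script" (火星文) text. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wikis, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.

Chinese chinese-language chinese-nlp chinese-simplified corpus-data

3.92 k

14 days ago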

ollama

@ollama

#LLM# Set up and run Llama 2 and other large models locally

llama LLM llama2 Go

Go 148.73 k

5 hours ago

grok-1

@xai-org

Open-source release of the Grok-1 large model

Python 50.39 k

1 year ago

ChineseNlpCorpus

@SophonPlus

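Collects, organizes, and publishes Chinese NLP corpora and datasets, working with like-minded contributors to advance Chinese natural language processing.

Jupyter Notebook 6.29 k

7 years ago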

funNLP

Yang@fighting41love

Content blocked for violating policy

Python 75.25 k

1 year ago 🇨🇳

Curator

@NVIDIA-NeMo

#LLM# Scalable data preprocessing and curation toolkit for LLMs

data-curation LLM data data-prep data-preparation

Python 1.05 k

3 days ago

generative-ai-for-beginners

Microsoft@microsoft

#LLM# Microsoft's generative AI course, 12 lessons

AI ChatGPT dall-e generativeai gpt

Jupyter Notebook 93.97 k

1 day ago

ChineseNLPCorpus

AesopChow@InsaneLife

Chinese NLP datasets, material for everyday experiments. Additions and pull requests are welcome.

Python 4.47 k

2 years ago 🇨🇳

llm-action

@liguodongiot

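#LLM# This project shares the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications).

LLM llm-inference llm-serving llm-training llmops

HTML 19.9 k

2 days ago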

transformer-debugger

OpenAI@openai

Python 4.09 k

1 year ago

Chinese-Word-Vectors

embedding@Embedding

100+ Chinese Word Vectors: over a hundred sets of pre-trained Chinese word vectors

Chinese chinese-word-segmentation embeddings word-embeddings vectors-trained

Python 12.06 k

2 years ago

ComfyUI

@comfyanonymous

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

stable-diffusion PyTorch AI Python

Python 84.41 k

3 hours ago

Fay

@xszyou

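#Android# Fay is an MCP framework that helps digital humans (2.5D, 3D, mobile, PC, web) or large language models (OpenAI-compatible, DeepSeek) connect to business systems.

AI Python Android API ue5

JavaScript 11.65 k

13 hours ago 🇨🇳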

Qwen

@QwenLM • Alibaba

#NLP# Qwen-7B (Tongyi Qianwen-7B) is the 7-billion-parameter model in the Tongyi Qianwen large model series developed by Alibaba Cloud.

Chinese large-language-models NLP flash-attention LLM

Python 18.94 k

13 days ago

hello-algo

@krahets

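#Algorithm Practice# "Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; English version in translation.

algorithms data-structures data-structures-and-algorithms dsa LeetCode

Java 115.1 k

1 day ago 🇨🇳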

jieba

Sun Junyi@fxsjy

Jieba Chinese word segmentation

Python 34.3 k

1 year ago 🇨🇳

LLaMA-Factory

@hiyouga

#NLP# Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

fine-tuning llama LLM peft transformers

Python 55.53 k

3 hours ago

Awesome-Chinese-NLP

@crownpku

#NLP# A curated list of resources for Chinese NLP

NLP chinese-nlp

7.9 k

2 years ago

litellm

@BerriAI

#LLM# Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

anthropic langchain LLM llmops openai

Python 26.85 k

30 minutes ago

brightmart / nlp_chinese_corpus
