This repository has been indexed but not yet edited. For the project introduction and usage tutorial, please read the README on GitHub.
Train LLMs (BLOOM, LLaMA, Baichuan2-7B, ChatGLM3-6B) with DeepSpeed pipeline mode. Faster than ZeRO/ZeRO++/FSDP.
2023-06-24
No
2024-02-05T05:23:05Z
label-smooth, amsoftmax, partial-fc, focal-loss, triplet-loss, lovasz-softmax. Maybe useful
Add bisenetv2. My implementation of BiSeNet
mIOU=80.02 on Cityscapes. My implementation of DeepLabv3+ (also known as 'Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation'), based on the Cityscapes dataset.
A small 2048 game written in Python and C
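#Large Language Models# Fine-tuning for specific downstream tasks based on the ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B models, covering Freeze, LoRA, P-tuning, full-parameter fine-tuning, and more
#Large Language Models# Blog posts, reading reports, and code examples on AGI/LLM-related topics.
#Large Language Models# LLaVA is a large language-and-vision assistant with GPT-4V-level capabilities
#Natural Language Processing# A collection and overview of open-source models, datasets, and evaluation benchmarks for vertical domains.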
Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)
Go ahead and axolotl questions
#Large Language Models# Code examples and resources for DBRX, a large language model developed by Databricks
The personal finance app for everyone
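Open-Sora: a fully open-source, efficient reproduction of Sora-like video generation
"Stones from other hills may serve to polish jade": Fudan's Baize Intelligence team (白泽智能) releases JADE-DB, a demo dataset targeting Chinese open-source and overseas commercial large models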
RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.
Minimalistic large language model 3D-parallelism training
Window management made elegant.