This repository has been indexed but not yet edited. For the project introduction and usage guide, see the README on GitHub.


About

FlashInfer: Kernel Library for LLM Serving

Last modified: 2025-09-10T06:10:38Z


Languages

  • Cuda 36.8%
  • Python 33.4%
  • C++ 28.8%
  • Jinja 0.6%
  • Shell 0.4%
  • C 0.1%
  • Other 0.01%

You may also be interested in

Open-source release of the Grok-1 large model

Python · 50.49 k
1 year ago

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Python · 12.02 k
2 months ago

Open-Sora: a fully open-source, efficient reproduction of Sora-like video generation

Python · 27.16 k
4 months ago

#Large language models# SGLang is a fast serving framework for large language models and vision language models.

Python · 17.79 k
1 hour ago

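#Large language models# vLLM is an efficient open-source library for accelerating large language model inference, achieving high throughput and low latency through optimized memory management and distributed processing.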

Python · 57.66 k
1 hour ago

#Large language models# Code examples and resources for DBRX, a large language model developed by Databricks

Python · 2.57 k
1 year ago

xlite-dev/Awesome-LLM-Inference

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python · 4.48 k
22 days ago

Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. D...

Python · 19.4 k
1 year ago

#Natural language processing# LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python · 3.59 k
17 hours ago

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Python · 1.6 k
10 months ago

Training LLMs with QLoRA + FSDP

Jupyter Notebook · 1.53 k
10 months ago

[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Python · 704
9 months ago

Fast and memory-efficient exact attention

Python · 19.42 k
5 days ago

Flash Attention in ~100 lines of CUDA (forward pass only)

Cuda · 919
8 months ago

Python · 4.18 k1
1 year ago

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-L...

C++ · 11.55 k1
2 hours ago

Jupyter Notebook · 1.98 k
10 months ago

Python · 63.35 k
42 minutes ago