distributed-inference · GitHub Topics

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

gpu large-language-models CUDA PyTorch llm-inference jit attention Nvidia distributed-inference moe

Cuda 3.73 k

3 days ago

gpustack / gpustack

#Large Language Model# Simple, scalable AI model deployment on GPU clusters

ascend CUDA deepseek distributed-inference genai inference llama llamacpp large-language-model maas metal openai qwen rocm vllm mindie llm-inference llm-serving local-ai heterogeneous-cluster

Python 3.7 k

4 days ago

Lizonghang / prima.cpp

prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters

distributed-ai llm-inference on-device-llms llama-cpp distributed-inference

C++ 997

2 months ago

mzbac / mlx_sharding

Distributed inference for MLX LLMs

MLX distributed-inference

Python 95

1 year ago

ADT109119 / llamacpp-distributed-inference

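#Large Language Model# A distributed LLM inference program based on llama.cpp that lets multiple computers on a local network cooperate on distributed inference of large language models, with a cross-platform desktop UI built with Electron.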

distributed-inference gguf llamacpp large-language-model llm-inference rpc distributed-llm

JavaScript 60

21 days ago

ipc-lab / collaborative-inference-oac

#Computer Science# Source code of the paper "Private Collaborative Edge Inference via Over-the-Air Computation".

differential-privacy distributed-inference ensemble-learning machine-learning

Python 4

8 months ago

JiangkaiWu / Attribute_Reid

Official impl. of ACM MM paper "Identity-Aware Attribute Recognition via Real-Time Distributed Inference in Mobile Edge Clouds". A distributed inference model for pedestrian attribute recognition with...

distributed-inference edge-computing re-identification

Python 2

5 years ago