#LLM# A C#/.NET library to run LLMs (🦙 LLaMA/LLaVA) efficiently on your local device.
#Android# Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.
#LLM# Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.
prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters
#LLM# A self-evaluating interview for AI coders.
#LLM# Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss.
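The memory arithmetic behind that claim can be sketched as follows, assuming an FP16 baseline and an illustrative Llama-style cache geometry (the model dimensions below are assumptions, not KVSplit's benchmark setup; the raw 8+4-bit figure is 62.5%, and the reported 59% reflects quantization block overhead such as per-block scales):

```python
# Rough KV-cache memory estimate for differentiated key/value precision.
# All model dimensions here are illustrative assumptions.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len,
                   key_bits, value_bits):
    """Bytes needed for the KV cache at a given context length."""
    elems = n_layers * n_kv_heads * head_dim * ctx_len  # per K (or V) tensor
    return elems * (key_bits + value_bits) / 8          # K and V together

# Assumed Llama-style geometry with grouped-query attention
geom = dict(n_layers=32, n_kv_heads=8, head_dim=128, ctx_len=8192)

fp16 = kv_cache_bytes(**geom, key_bits=16, value_bits=16)
k8v4 = kv_cache_bytes(**geom, key_bits=8, value_bits=4)
print(f"FP16: {fp16/2**20:.0f} MiB, K8V4: {k8v4/2**20:.0f} MiB, "
      f"saving {1 - k8v4/fp16:.1%}")
# → FP16: 1024 MiB, K8V4: 384 MiB, saving 62.5%
```

The asymmetry (more bits for keys than values) matters because keys enter the attention dot product directly, so they are more sensitive to quantization error than values.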
This repo showcases how to run a model locally and offline, free of OpenAI dependencies.
Review/check GGUF files and estimate memory usage and maximum tokens per second.
#NLP# Local ML voice chat using high-end models.
#Android# Making offline AI models accessible to all types of edge devices.
A workbench for learning and practising AI tech in real scenarios on Android devices, powered by GGML (Georgi Gerganov Machine Learning) and FFmpeg.
LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.