High-speed Large Language Model Serving for Local Deployment
[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration
Modern desktop application (Rust + Tauri v2 + Svelte 5 + Candle (HF)) for chatting with AI models that run completely locally on your computer. No subscriptions, no data sent to the internet: just you and your personal AI assistant.
On-device AI for iOS & Android
Notolog Markdown Editor
Tool for testing different large language models without writing code.
Desktop AI tutoring app with local inference using Ollama for privacy-focused education.
Local AI music generator with smart lyrics: Gradio web UI for HeartMuLa + Ollama/OpenAI, tags, history, and high-fidelity audio.
LLM chatbot example using OpenVINO with RAG (Retrieval Augmented Generation).
An overfitted SD prompt engine with severe "aesthetic snobbery," forcibly transforming mundane ideas into professional-grade physical rendering instructions.
Lightweight 6GB VRAM Gradio web app with auto-installer for running AuraFlow locally — no cloud, no clutter.
Edge Agent Lab is an Android testing platform for evaluating small language model (SLM) agents directly on mobile devices.
Local embeddings server for Apple Silicon using MLX, providing OpenAI-compatible API endpoints
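Since the MLX embeddings server advertises OpenAI-compatible endpoints, any OpenAI-style client can talk to it. Below is a minimal sketch of such a request using only the standard library; the base URL `http://localhost:8000/v1` and the model name `my-mlx-model` are assumptions for illustration, not part of the project's documented configuration:

```python
import json
import urllib.request


def build_embeddings_request(model: str, texts: list[str]) -> dict:
    # Payload shape expected by an OpenAI-compatible /v1/embeddings endpoint.
    return {"model": model, "input": texts}


def fetch_embeddings(base_url: str, model: str, texts: list[str]) -> list[list[float]]:
    # POST the payload to the local server; no API key is needed locally.
    payload = json.dumps(build_embeddings_request(model, texts)).encode()
    req = urllib.request.Request(
        f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style responses return one {"embedding": [...]} object per input.
    return [item["embedding"] for item in body["data"]]


if __name__ == "__main__":
    # With the local server running, this would return one vector per text:
    # vectors = fetch_embeddings("http://localhost:8000/v1", "my-mlx-model", ["hello"])
    print(build_embeddings_request("my-mlx-model", ["hello", "world"]))
```

Because the request and response shapes match the OpenAI API, the same code works against any compatible local server by changing only the base URL.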
Verify claims using AI agents that debate using scraped evidence and local language models.
Privacy‑first, real‑time speech‑to‑text dictation. 100% local inference in Rust; hotkey to dictate anywhere (macOS, Linux, Windows).
An agentic, zero-shot document intelligence engine that sees, understands, and extracts from any PDF: no training, no hallucinations. Just define your fields and get trusted, structured outputs with confidence scores, deployed locally and built for the enterprise.
Local voice typing for Windows powered by SenseVoice. 15x faster than Whisper for Chinese input.
MCP server that runs local LLMs (with full access to MCP tools included). Callable by Python to chain MCP tools with local intelligence.
AI study assistant for engineering students.
A lightweight Python implementation of Microsoft's Phi-3 model running locally on CPU.