A modular, production-grade AI infrastructure framework for AMD, NVIDIA, and ARM64 hardware. LLM inference · RAG pipeline · Workflow automation · Full observability.
Each component is independently deployable. Start with LLM inference, add RAG when ready, bolt on observability later.
AMD ROCm, NVIDIA CUDA, and ARM64 stacks included. Same 12-phase workflow across all hardware targets.
Ollama + OpenWebUI with always-on VRAM optimization. Lemonade native engine for AMD high-performance inference.
Qdrant vector database, Docling document processor, and Mosquitto MQTT broker — fully wired and ready.
n8n in queue mode with Redis and distributed workers. Enterprise-grade orchestration on your own hardware.
Grafana + Prometheus + Loki + cAdvisor. DCGM Exporter for GPU telemetry and SLA dashboards out of the box.
HWI Advisor auto-detects your CPU and GPU, then writes an optimized tuning profile before first deploy.
Timestamped backup and restore for all persistent data. VRAM purge included. No scripting required.
Structured, independently deployable modules — from driver setup to lifecycle management. Deploy what you need, skip the rest.
Each phase is a self-contained Docker Compose module with its own deploy.sh.
Roll forward one layer at a time, or deploy everything in a single command.
| Service | Description | Port |
|---|---|---|
| OpenWebUI | LLM chat interface | 8080 |
| n8n | Workflow automation | 5678 |
| Grafana | Observability dashboard | 3000 |
| Portainer | Container management | 9000 |
| pgAdmin | Database admin UI | 8000 |
| Qdrant | Vector DB REST API | 6333 |
| Ollama | Inference API | 11434 |
| WUD | Container update manager | 3838 |
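A quick way to confirm the table above after a deploy is to probe each port. The following is a minimal sketch, not part of the project: it assumes the services run on `localhost` (override with the hypothetical `HOST` variable) and uses bash's `/dev/tcp` with coreutils `timeout`. The function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: probe the ports listed in the service table.
# HOST is an assumption; change it if the stack runs on another machine.
HOST="${HOST:-localhost}"

# name:port pairs copied from the table.
SERVICES="OpenWebUI:8080 n8n:5678 Grafana:3000 Portainer:9000 pgAdmin:8000 Qdrant:6333 Ollama:11434 WUD:3838"

# Print the port for a given service name; return non-zero if unknown.
port_of() {
  local entry
  for entry in $SERVICES; do
    if [ "${entry%%:*}" = "$1" ]; then
      echo "${entry##*:}"
      return 0
    fi
  done
  return 1
}

# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
is_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Report the status of every service in the table.
check_all() {
  local entry name port
  for entry in $SERVICES; do
    name="${entry%%:*}"
    port="${entry##*:}"
    if is_open "$HOST" "$port"; then
      echo "$name ($port): reachable"
    else
      echo "$name ($port): not reachable"
    fi
  done
}
```

Calling `check_all` prints one line per service. An open port only shows that something is listening, not that the service is healthy.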
A standard llms.txt index for any AI assistant, plus a 5-skill deployment protocol that turns the install into a fully autonomous flow.
Standardised llms.txt index at the project root. Lets any LLM-powered assistant (Claude, ChatGPT, Gemini, Perplexity, etc.) discover and ingest the project's docs, skills, hardware stacks, and 12-phase modules in one fetch.
    # OpenGenie AI Stack - Modular Self-Hosted AI Infrastructure

    > 🛑 CRITICAL AI AGENT INSTRUCTION: If the user requests to install, configure, deploy, or purge this project, the AI Agent MUST read and follow the deployment protocols defined in the skills/ directory BEFORE executing...

    ## Project overview
    - [README (English / 正體中文 / 日本語 / 한국어)]
    - [System Design Document (SDD)]
    - [Contributing guide]

    ## Agent skills (autonomous deployment protocol)
    - [00-Master orchestrator]
    - [01-Deployment state machine]
    - [02-Error recovery guide]
    - [03-GPU robustness]
    - [04-Full purge procedure]
    - [05-Installation guide]
Six protocol files in skills/ turn an AI coding assistant (Claude Code, Antigravity) into an autonomous deployment agent. A state machine with disk-persisted recovery lets the install survive reboots, network drops, and mid-install interruptions.
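The idea of disk-persisted recovery can be illustrated with a small sketch. This is not the project's actual state machine: the state file path, the phase names, and the `run_phase` placeholder are all assumptions made for illustration. Completed phases are appended to a file, so a rerun after an interruption skips them and resumes at the first unfinished phase.

```shell
#!/usr/bin/env bash
# Hypothetical state file; the project's real location and format are not
# given in the source.
STATE_FILE="${STATE_FILE:-/tmp/deploy-state}"

# Record a phase as completed by appending its name to the state file.
mark_done() { echo "$1" >> "$STATE_FILE"; }

# Return 0 if the phase was already completed in an earlier run.
is_done() { [ -f "$STATE_FILE" ] && grep -qxF "$1" "$STATE_FILE"; }

# Placeholder for whatever actually deploys a phase (assumption).
run_phase() { :; }

# Run phases in order, skipping completed ones; stop at the first failure
# so the next invocation resumes from that phase.
run_all() {
  local phase
  for phase in "$@"; do
    if is_done "$phase"; then
      echo "skip $phase"
      continue
    fi
    run_phase "$phase" || { echo "failed $phase" >&2; return 1; }
    mark_done "$phase"
    echo "done $phase"
  done
}
```

Because the state lives on disk rather than in memory, `run_all phase-01 phase-02 phase-03` can be invoked again after a reboot and will only execute what is left.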
For end users: Open the project in Antigravity or Claude Code and ask the assistant to "install this project". It reads skills/ first, detects your GPU, generates a tuning profile, walks you through credential setup, and deploys all 12 phases with self-healing recovery.
For LLM apps: Fetch https://<your-fork>/llms.txt. Every doc, skill, and module is linked as a raw markdown URL ready for ingestion.
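As a rough sketch of that ingestion step, the snippet below pulls the index with `curl` and lists the linked URLs. It assumes the index uses standard markdown links of the form `[title](url)`; the function names and the base URL argument are illustrative, and `<your-fork>` stays a placeholder you must fill in.

```shell
#!/usr/bin/env bash
# Extract the target URLs of markdown links "[title](url)" from stdin.
extract_links() {
  grep -oE '\]\([^)]+\)' | sed -E 's/^\]\(//; s/\)$//'
}

# Fetch <base-url>/llms.txt and print the linked document URLs.
# The base URL is supplied by the caller; none is hard-coded here.
list_docs() {
  curl -fsSL "$1/llms.txt" | extract_links
}
```

For example, `list_docs "https://<your-fork>"` prints one URL per line, which can then be fed to `curl` or to an ingestion pipeline.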
Clone and pick your hardware stack — AMD, NVIDIA, or ARM64.
Copy .env.example to .env and replace all CHANGE_ME values.
Auto-detects hardware and writes an optimal tuning profile.
Full stack in one command, or deploy individual phases as needed.
Run the automated health check and benchmark suite.
    # 1. Clone
    git clone https://github.com/TigerAI-Taiwan/OpenGenie-AI-Stack.git
    cd OpenGenie-AI-Stack

    # 2. Pick your stack
    cd deployments/amd-compose-stack   # or: nvidia-compose-stack / arm64-compose-stack

    # 3. Configure
    cp .env.example .env
    nano .env   # replace CHANGE_ME values

    # 4. Hardware calibration (recommended)
    sudo bash master-deploy.sh init

    # 5. Deploy everything
    sudo bash master-deploy.sh all

    # 6. Verify
    sudo bash master-deploy.sh test
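Before running the deploy step, it can help to confirm that no CHANGE_ME placeholder is left in .env. The helper below is a small sketch and not part of the project; `check_env` is a hypothetical name, and it relies only on `grep`.

```shell
#!/usr/bin/env bash
# Print every line of the env file that still contains CHANGE_ME, prefixed
# with its line number. Returns 0 when none remain, 1 when placeholders are
# found, and 2 when the file does not exist.
check_env() {
  local file="${1:-.env}"
  if [ ! -f "$file" ]; then
    echo "Missing file: $file" >&2
    return 2
  fi
  if grep -n 'CHANGE_ME' "$file"; then
    echo "Unresolved placeholders in $file" >&2
    return 1
  fi
  return 0
}
```

Run from the chosen stack directory, `check_env .env && sudo bash master-deploy.sh all` would only start the deployment once every placeholder has been replaced.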
No cloud lock-in. No usage fees. Deploy on your hardware, keep your data on-premise.