A modular, production-grade AI infrastructure framework for AMD, NVIDIA, and ARM64 hardware. LLM inference · RAG pipeline · Workflow automation · Full observability.
Each component is independently deployable. Start with LLM inference, add RAG when ready, bolt on observability later.
AMD ROCm, NVIDIA CUDA, and ARM64 stacks included. Same 12-phase workflow across all hardware targets.
Ollama + OpenWebUI with always-on VRAM optimization. The Lemonade native engine delivers high-performance inference on AMD.
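The VRAM behavior is driven by Ollama's documented environment knobs. A minimal sketch of the relevant settings, assuming the stack exposes them through its .env (keys are Ollama's standard variables; the values here are examples, not this stack's defaults):

# Illustrative .env excerpt — values are examples only
OLLAMA_KEEP_ALIVE=5m          # unload an idle model after 5 minutes to free VRAM
OLLAMA_MAX_LOADED_MODELS=1    # cap how many models stay resident at once
OLLAMA_NUM_PARALLEL=2         # concurrent requests served per loaded model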
Qdrant vector database, Docling document processor, and Mosquitto MQTT broker — fully wired and ready.
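Once that phase is up, you can sanity-check the wiring from the host. A quick sketch, assuming the services listen on their default ports (Qdrant 6333, Mosquitto 1883):

# List Qdrant collections over its REST API
curl -s http://localhost:6333/collections
# Publish a test message through the MQTT broker
mosquitto_pub -h localhost -t rag/test -m 'hello'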
n8n in queue mode with Redis and distributed workers. Enterprise-grade orchestration on your own hardware.
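Queue mode in n8n is configured through documented environment variables: the main instance enqueues executions in Redis and the workers pull them. A sketch of the relevant settings (values illustrative; the stack's compose files set their own):

# n8n's documented queue-mode settings — values here are examples
EXECUTIONS_MODE=queue          # main instance enqueues jobs instead of executing them
QUEUE_BULL_REDIS_HOST=redis    # Redis instance shared with the workers
QUEUE_BULL_REDIS_PORT=6379
# Scale workers with Compose, assuming a service named n8n-worker:
# docker compose up -d --scale n8n-worker=3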
Grafana + Prometheus + Loki + cAdvisor. DCGM Exporter for GPU telemetry and SLA dashboards out of the box.
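After deployment you can probe the telemetry endpoints directly. A sketch assuming the exporters sit on their default ports (DCGM Exporter 9400, cAdvisor 8080, Prometheus 9090):

curl -s http://localhost:9400/metrics | grep DCGM_FI_DEV_GPU_UTIL   # per-GPU utilization from DCGM
curl -s http://localhost:8080/metrics | head                        # container metrics from cAdvisor
curl -s http://localhost:9090/-/healthy                             # Prometheus liveness endpoint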
HWI Advisor auto-detects your CPU and GPU, then writes an optimized tuning profile before first deploy.
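Under the hood, hardware detection amounts to querying the host. Roughly what such a probe looks like (illustrative commands, not the Advisor's actual implementation):

lspci | grep -Ei 'vga|3d|display'   # enumerate GPUs on the PCI bus
nproc                               # logical CPU count
free -g                             # total system memory in GiB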
Timestamped backup and restore for all persistent data. VRAM purge included. No scripting required.
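Conceptually, each backup is an archive of a named volume stamped with the current time. A hypothetical equivalent of what the tooling does for one volume (volume name and paths are illustrative):

mkdir -p backups
STAMP=$(date +%Y%m%d-%H%M%S)
docker run --rm -v qdrant_data:/data -v "$PWD/backups:/backup" \
  alpine tar czf "/backup/qdrant_data-$STAMP.tar.gz" -C /data .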
Structured, independently deployable modules — from driver setup to lifecycle management. Deploy what you need, skip the rest.
Each phase is a self-contained Docker Compose module with its own deploy.sh. Roll forward one layer at a time, or deploy everything in a single command.
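In practice that means any phase can be brought up on its own. A sketch, assuming a hypothetical phase directory name (the real names live under deployments/ in the repo):

# Deploy one phase in isolation — directory name is illustrative
cd deployments/amd-compose-stack/phase-02-llm-inference
sudo bash deploy.sh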
Clone and pick your hardware stack — AMD, NVIDIA, or ARM64.
Copy .env.example → .env and replace all CHANGE_ME values.
Auto-detects hardware and writes an optimal tuning profile.
Full stack in one command, or deploy individual phases as needed.
Run the automated health check and benchmark suite.
# 1. Clone
git clone https://github.com/TigerAI-Taiwan/OpenGenie-AI-Stack.git
cd OpenGenie-AI-Stack

# 2. Pick your stack
cd deployments/amd-compose-stack   # or: nvidia-compose-stack / arm64-compose-stack

# 3. Configure
cp .env.example .env
nano .env                          # replace CHANGE_ME values

# 4. Hardware calibration (recommended)
sudo bash master-deploy.sh init

# 5. Deploy everything
sudo bash master-deploy.sh all

# 6. Verify
sudo bash master-deploy.sh test
No cloud lock-in. No usage fees. Deploy on your hardware, keep your data on-premise.