Inspiration
Every on-call engineer knows the 3 a.m. panic: logs are red-lining, dashboards are useless noise, and the fix lives somewhere deep in a dusty runbook. EdgeOps Guardian shrinks MTTR by giving SREs a local AI copilot that detects spikes, retrieves the right runbook, and explains the root cause—all on the same workstation.
What it does
- Detects anomalies in real-time logs/metrics (a minimal detector sketch follows this list).
- Retrieves the exact runbook snippet via a RAG pipeline.
- Explains the incident in plain English with an on-prem LLM.
- (Roadmap) Routes multimodal data (images, audio) through a Mixture-of-Experts layer and speaks a 15-second summary.
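For flavor, here is a minimal sketch of what the detection stage can look like: a tiny Transformer that predicts the next value of a metric from a sliding window, flagging large prediction error as an anomaly. The window size, dimensions, and score handling are illustrative, not the values we shipped.

```python
import torch
import torch.nn as nn

class TinyTSTransformer(nn.Module):
    """Next-step prediction over a sliding window of one metric."""

    def __init__(self, d_model: int = 32):
        super().__init__()
        self.embed = nn.Linear(1, d_model)           # scalar metric -> d_model
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)            # predict the next value

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window, 1) -> predicted next value (batch, 1)
        h = self.encoder(self.embed(x))
        return self.head(h[:, -1, :])                # last position's state

def anomaly_score(model: nn.Module, window: torch.Tensor, actual: float) -> float:
    # Absolute prediction error; callers compare it to a calibrated threshold.
    with torch.no_grad():
        pred = model(window.unsqueeze(0)).item()
    return abs(pred - actual)
```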
How we built it
| Layer | Tech / Model | Notes |
|---|---|---|
| Anomaly detection | Tiny Time-Series Transformer (PyTorch) | MLflow-logged |
| Retrieval (RAG) | LangChain + FAISS | Embeddings: text-embedding-3-small |
| LLM core | Meta Llama-2 7B-Chat NIM from NVIDIA NGC | Container: nvcr.io/nim/meta/llama-2-7b-chat → pulled & registered |
| API / UI | FastAPI (Swagger) & Streamlit | Runs with HP AI Studio |
| Dev & Ops glue | HP AI Studio, GitHub Actions CI | One-command bootstrap |
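To show how the layers above fit together, here is a minimal sketch of the API wiring. It assumes the NIM container exposes an OpenAI-compatible `/v1/chat/completions` endpoint on `localhost:8000` (check your container docs); the index path, endpoint URL, and model id are illustrative, not necessarily what your deployment uses.

```python
import requests
from fastapi import FastAPI
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

app = FastAPI(title="EdgeOps Guardian")

# Embeddings per the table above; any LangChain embedding class with the
# same interface can be swapped in.
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
index = FAISS.load_local("runbook_index", embeddings,
                         allow_dangerous_deserialization=True)

NIM_URL = "http://localhost:8000/v1/chat/completions"  # assumed NIM endpoint

@app.post("/explain")
def explain(log_excerpt: str) -> dict:
    # 1. Retrieve the most relevant runbook chunk for the offending logs.
    snippet = index.similarity_search(log_excerpt, k=1)[0].page_content
    # 2. Ask the local Llama-2 NIM to explain the incident in plain English.
    resp = requests.post(NIM_URL, json={
        "model": "meta/llama-2-7b-chat",
        "messages": [{
            "role": "user",
            "content": f"Logs:\n{log_excerpt}\n\nRunbook:\n{snippet}\n\n"
                       "Explain the likely root cause and the fix.",
        }],
    }, timeout=60)
    return {"runbook": snippet,
            "explanation": resp.json()["choices"][0]["message"]["content"]}
```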
Why this model?
Small enough (≈13 GB of VRAM, ~4 GB compressed image) to run on a laptop RTX 4090, instruction-tuned for helpful chat, and officially hosted on NGC, satisfying the hackathon's NGC-model requirement with zero license friction.
Challenges we ran into
- LangChain 0.2 module split (fixed by moving to the `langchain-community` wheels).
- `DirectoryLoader` pulled in the heavy `unstructured` stack; we swapped in `TextLoader` (see the sketch below).
- Needed a quantized model that fit local VRAM; NGC's 7B-Chat NIM solved it.
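For reference, the loader swap is only a few lines; the `runbooks/` directory is illustrative.

```python
from pathlib import Path
from langchain_community.document_loaders import TextLoader

# Load each runbook as plain text; no heavyweight parser stack gets pulled in.
docs = []
for path in Path("runbooks").glob("*.md"):
    docs.extend(TextLoader(str(path)).load())
```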
Accomplishments that we're proud of
- End-to-end on-device RAG loop working in under two hours.
- Zero cloud calls: embeddings, LLM, and FAISS all run locally.
- Clean GitHub repo + MLflow “production” tag pipeline.
What we learned
- HP AI Studio is super cool! We'd love continued access.
- NGC NIM images plug straight into MLflow with a single `mlflow models register` call (sketched below).
- Modular LangChain routers prep us for painless multimodal MoE expansion.
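A minimal sketch of the register-and-tag flow using MLflow's Python API (`mlflow.register_model`) in place of the one-line CLI call mentioned above; the run URI and registry name are placeholders for however the NIM artifact was logged in your workspace.

```python
import mlflow
from mlflow import MlflowClient

# Register the logged model artifact under a registry name...
result = mlflow.register_model("runs:/<run_id>/model", "edgeops-llama2-7b-chat")

# ...then tag the new version "production" so downstream tooling can find it.
client = MlflowClient()
client.set_model_version_tag(
    name="edgeops-llama2-7b-chat",
    version=result.version,
    key="stage",
    value="production",
)
```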
What's next for EdgeOps Guardian
- Finish MoE layer: BLIP-2 (image caption), faster-whisper or Parakeet (ASR), Coqui XTTS (TTS). A rough router sketch follows this list.
- Auto-patch generation with the same Llama-2 model.
- User testing with real SRE teams to benchmark MTTR reduction.
- Open-source release (Apache-2.0) so any edge site can adopt privacy-first AIOps.
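As a rough sketch of where the MoE router is headed, with stubbed-out experts standing in for the planned models; everything funnels into the same RAG + LLM loop the text path uses today.

```python
from pathlib import Path
from typing import Callable

def caption_image(p: Path) -> str:
    # Placeholder for the planned BLIP-2 captioning expert.
    raise NotImplementedError("BLIP-2 captioning lands with the MoE layer")

def transcribe_audio(p: Path) -> str:
    # Placeholder for the planned faster-whisper / Parakeet ASR expert.
    raise NotImplementedError("ASR lands with the MoE layer")

# Route each artifact to a per-modality "expert" that turns it into text.
EXPERTS: dict[str, Callable[[Path], str]] = {
    ".png": caption_image,
    ".jpg": caption_image,
    ".wav": transcribe_audio,
    ".log": lambda p: p.read_text(),
}

def route(path: Path) -> str:
    # Pick the expert by file type; unknown modalities fall back to raw text.
    handler = EXPERTS.get(path.suffix.lower(), lambda p: p.read_text())
    return handler(path)
```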
Built With
- hp
- hp-ai
- ngc
- python