Inspiration

Every on-call engineer knows the 3 a.m. panic: logs are red-lining, dashboards are useless noise, and the fix lives somewhere deep in a dusty runbook. EdgeOps Guardian shrinks MTTR by giving SREs a local AI copilot that detects spikes, retrieves the right runbook, and explains the root cause—all on the same workstation.

What it does

  1. Detects anomalies in real-time logs/metrics.
  2. Retrieves the exact runbook snippet via a RAG pipeline.
  3. Explains the incident in plain English with an on-prem LLM (a sketch of this loop follows the list).
  4. (Roadmap) Routes multimodal data (images, audio) through a Mixture-of-Experts layer and speaks a 15-second summary.
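
A minimal sketch of steps 1–3, with every name illustrative rather than our actual code: the z-score check stands in for the Tiny Time-Series Transformer, and `retriever`/`llm` stand in for the FAISS retriever and the local Llama-2 NIM.

```python
# Illustrative only: the detect -> retrieve -> explain loop.
import numpy as np

def detect_anomaly(window: np.ndarray, z_threshold: float = 4.0) -> bool:
    """Flag the newest point if it is a z-score outlier vs. the rest of
    the window. Stand-in for the transformer-based detector."""
    mu = window[:-1].mean()
    sigma = window[:-1].std() + 1e-9
    return abs(window[-1] - mu) / sigma > z_threshold

def handle_tick(window: np.ndarray, retriever, llm) -> str | None:
    """retriever/llm are placeholders for the FAISS retriever and local LLM."""
    if not detect_anomaly(window):                      # 1. detect
        return None
    query = f"metric spiked to {window[-1]:.1f} (baseline {window[:-1].mean():.1f})"
    docs = retriever.invoke(query)                      # 2. retrieve runbook snippet
    context = "\n\n".join(d.page_content for d in docs)
    prompt = f"Incident: {query}\n\nRunbook:\n{context}\n\nExplain the likely root cause."
    return llm.invoke(prompt).content                   # 3. explain in plain English
```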

How we built it

  • Anomaly detection: Tiny Time-Series Transformer (PyTorch), logged to MLflow.
  • Retrieval (RAG): LangChain + FAISS, with text-embedding-3-small embeddings.
  • LLM core: Meta Llama-2 7B-Chat NIM from NVIDIA NGC (container nvcr.io/nim/meta/llama-2-7b-chat, pulled and registered).
  • API / UI: FastAPI (with Swagger docs) and Streamlit, running in HP AI Studio.
  • Dev & Ops glue: HP AI Studio and GitHub Actions CI, with a one-command bootstrap.
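
A sketch of how these pieces wire together, assuming the NIM container exposes an OpenAI-compatible endpoint on localhost:8000 (as current NIMs do); the paths, `k`, and the prompt are placeholders:

```python
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Index a runbook (TextLoader keeps the dependency footprint small).
# The table above lists text-embedding-3-small; swapping in a local
# embedder would keep this step fully offline.
docs = TextLoader("runbooks/disk_pressure.md").load()
index = FAISS.from_documents(docs, OpenAIEmbeddings(model="text-embedding-3-small"))
retriever = index.as_retriever(search_kwargs={"k": 2})

# Point the stock OpenAI-compatible client at the local NIM container.
llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",   # assumed port mapping
    api_key="unused-for-local",            # client requires a value
    model="meta/llama-2-7b-chat",
)

hits = retriever.invoke("disk usage spike on node-3")
answer = llm.invoke(
    "Explain this incident using the runbook below.\n\n"
    + "\n\n".join(d.page_content for d in hits)
)
print(answer.content)
```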

Why this model?

Llama-2 7B-Chat is small enough for a laptop RTX 4090: 7B parameters in FP16 is roughly 14 GB, so the model sits at ≈13 GB of VRAM, and the compressed container image is about 4 GB. It is instruction-tuned for helpful chat and officially hosted on NGC, which satisfies the challenge's NGC-model requirement with zero license friction.

Challenges we ran into

  • LangChain 0.2 module split (fixed via langchain-community wheels; see the import fix after this list).
  • DirectoryLoader pulled the heavy unstructured stack (replaced with TextLoader).
  • Needed a quantized model that fit local VRAM—NGC’s 7B-Chat NIM solved it.
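
For anyone hitting the same 0.2 split, the fix was essentially an import move (illustrative; exact module paths depend on your installed versions):

```python
# Before the 0.2 split (now deprecated/removed from the core package):
#   from langchain.vectorstores import FAISS
#   from langchain.document_loaders import DirectoryLoader

# After: community integrations live in langchain-community, and TextLoader
# sidesteps DirectoryLoader's heavy `unstructured` dependency.
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import TextLoader
```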

Accomplishments that we're proud of

  • End-to-end on-device RAG loop working in under two hours.
  • Zero cloud calls: embeddings, LLM, and FAISS all run locally.
  • Clean GitHub repo + MLflow “production” tag pipeline.

What we learned

  • HP AI Studio is super cool! We'd love continued access.
  • NGC NIM images plug straight into MLflow with a single mlflow.register_model() call (sketched after this list).
  • Modular LangChain routers prep us for painless multimodal MoE expansion.
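
A hedged sketch of that registration step using MLflow's Python API (the tracking URI, run ID, model name, and alias are placeholders):

```python
import mlflow
from mlflow import MlflowClient

mlflow.set_tracking_uri("http://localhost:5000")  # local tracking server (assumed)

# Register the logged model as a new version...
result = mlflow.register_model(
    model_uri="runs:/<run_id>/model",   # placeholder run ID
    name="edgeops-guardian-llm",
)

# ...and mark that version as "production" via the alias API.
MlflowClient().set_registered_model_alias(
    name="edgeops-guardian-llm", alias="production", version=result.version
)
```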

What's next for EdgeOps Guardian

  • Finish the MoE layer: BLIP-2 (image captioning), faster-whisper or Parakeet (ASR), Coqui XTTS (TTS); a routing sketch follows this list.
  • Auto-patch generation with the same Llama-2 model.
  • User testing with real SRE teams to benchmark MTTR reduction.
  • Open-source release (Apache-2) so any edge site can adopt privacy-first AIOps.
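
Since the MoE layer is still roadmap, here is only a hypothetical shape for the router: dispatch each artifact to an expert by MIME type, with the expert callables standing in for BLIP-2, faster-whisper, and a plain-text pass-through.

```python
import mimetypes
from typing import Callable

# Placeholder experts; the real ones would wrap BLIP-2, faster-whisper, etc.
EXPERTS: dict[str, Callable[[str], str]] = {
    "image": lambda path: f"[BLIP-2 caption of {path}]",
    "audio": lambda path: f"[faster-whisper transcript of {path}]",
    "text": lambda path: open(path, encoding="utf-8").read(),
}

def route(path: str) -> str:
    """Send a file to the expert matching its MIME type (text as fallback)."""
    mime, _ = mimetypes.guess_type(path)
    modality = (mime or "text/plain").split("/")[0]
    return EXPERTS.get(modality, EXPERTS["text"])(path)
```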

Built With

python, pytorch, langchain, faiss, fastapi, streamlit, mlflow, hp-ai-studio, nvidia-nim