Inspiration
Every on-call engineer knows the 3 a.m. panic: logs are red-lining, dashboards are useless noise, and the fix lives somewhere deep in a dusty runbook. EdgeOps Guardian shrinks MTTR by giving SREs a local AI copilot that detects spikes, retrieves the right runbook, and explains the root cause—all on the same workstation.
What it does
- Detects anomalies in real-time logs/metrics (a minimal detector sketch follows this list).
- Retrieves the exact runbook snippet via a RAG pipeline.
- Explains the incident in plain English with an on-prem LLM.
- (Roadmap) Routes multimodal data (images, audio) through a Mixture-of-Experts layer and speaks a 15-second summary.
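For flavor, here is a minimal sketch of what the detection stage can look like: a tiny Transformer that predicts the next value of a metric from a sliding window, flagging large prediction error as an anomaly. The window size, dimensions, and score handling are illustrative, not the values we shipped.

```python
import torch
import torch.nn as nn

class TinyTSTransformer(nn.Module):
    """Next-step prediction over a sliding window of one metric."""

    def __init__(self, d_model: int = 32):
        super().__init__()
        self.embed = nn.Linear(1, d_model)           # scalar metric -> d_model
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)            # predict the next value

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window, 1) -> predicted next value (batch, 1)
        h = self.encoder(self.embed(x))
        return self.head(h[:, -1, :])                # last position's state

def anomaly_score(model: nn.Module, window: torch.Tensor, actual: float) -> float:
    # Absolute prediction error; callers compare it to a calibrated threshold.
    with torch.no_grad():
        pred = model(window.unsqueeze(0)).item()
    return abs(pred - actual)
```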
How we built it
| Layer | Tech / Model | Notes |
|---|---|---|
| Anomaly detection | Tiny Time-Series Transformer (PyTorch) | MLflow-logged |
| Retrieval (RAG) | LangChain + FAISS | Embeddings: text-embedding-3-small |
| LLM core | Meta Llama-2 7B-Chat NIM from NVIDIA NGC | Container: nvcr.io/nim/meta/llama-2-7b-chat → pulled & registered |
| API / UI | FastAPI (Swagger) & Streamlit | Runs with HP AI Studio |
| Dev & Ops glue | HP AI Studio, GitHub Actions CI | One-command bootstrap |
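To show how the layers above fit together, here is a minimal sketch of the API wiring. It assumes the NIM container exposes an OpenAI-compatible `/v1/chat/completions` endpoint on `localhost:8000` (check your container docs); the index path, endpoint URL, and model id are illustrative, not necessarily what your deployment uses.

```python
import requests
from fastapi import FastAPI
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

app = FastAPI(title="EdgeOps Guardian")

# Embeddings per the table above; any LangChain embedding class with the
# same interface can be swapped in.
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
index = FAISS.load_local("runbook_index", embeddings,
                         allow_dangerous_deserialization=True)

NIM_URL = "http://localhost:8000/v1/chat/completions"  # assumed NIM endpoint

@app.post("/explain")
def explain(log_excerpt: str) -> dict:
    # 1. Retrieve the most relevant runbook chunk for the offending logs.
    snippet = index.similarity_search(log_excerpt, k=1)[0].page_content
    # 2. Ask the local Llama-2 NIM to explain the incident in plain English.
    resp = requests.post(NIM_URL, json={
        "model": "meta/llama-2-7b-chat",
        "messages": [{
            "role": "user",
            "content": f"Logs:\n{log_excerpt}\n\nRunbook:\n{snippet}\n\n"
                       "Explain the likely root cause and the fix.",
        }],
    }, timeout=60)
    return {"runbook": snippet,
            "explanation": resp.json()["choices"][0]["message"]["content"]}
```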
Why this model?
Small enough (≈13 GB of VRAM, ~4 GB compressed image) to run on a laptop RTX 4090, instruction-tuned for helpful chat, and officially hosted on NGC, satisfying the hackathon's NGC-model requirement with zero license friction.
Challenges we ran into
- LangChain 0.2 module split (fixed by moving to the `langchain-community` wheels).
- `DirectoryLoader` pulled in the heavy `unstructured` stack; we swapped in `TextLoader` (see the sketch below).
- Needed a quantized model that fit local VRAM; NGC's 7B-Chat NIM solved it.
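For reference, the loader swap is only a few lines; the `runbooks/` directory is illustrative.

```python
from pathlib import Path
from langchain_community.document_loaders import TextLoader

# Load each runbook as plain text; no heavyweight parser stack gets pulled in.
docs = []
for path in Path("runbooks").glob("*.md"):
    docs.extend(TextLoader(str(path)).load())
```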
Accomplishments that we're proud of
- End-to-end on-device RAG loop working in under two hours.
- Zero cloud calls: embeddings, LLM, and FAISS all run locally.
- Clean GitHub repo + MLflow “production” tag pipeline.
What we learned
- HP AI Studio is super cool! We'd love continued access.
- NGC NIM images plug straight into MLflow with a single `mlflow models register` call (sketched below).
- Modular LangChain routers prep us for painless multimodal MoE expansion.
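A minimal sketch of the register-and-tag flow using MLflow's Python API (`mlflow.register_model`) in place of the one-line CLI call mentioned above; the run URI and registry name are placeholders for however the NIM artifact was logged in your workspace.

```python
import mlflow
from mlflow import MlflowClient

# Register the logged model artifact under a registry name...
result = mlflow.register_model("runs:/<run_id>/model", "edgeops-llama2-7b-chat")

# ...then tag the new version "production" so downstream tooling can find it.
client = MlflowClient()
client.set_model_version_tag(
    name="edgeops-llama2-7b-chat",
    version=result.version,
    key="stage",
    value="production",
)
```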
What's next for EdgeOps Guardian
- Finish MoE layer: BLIP-2 (image caption), faster-whisper or Parakeet (ASR), Coqui XTTS (TTS). A rough router sketch follows this list.
- Auto-patch generation with the same Llama-2 model.
- User testing with real SRE teams to benchmark MTTR reduction.
- Open-source release (Apache-2.0) so any edge site can adopt privacy-first AIOps.
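As a rough sketch of where the MoE router is headed, with stubbed-out experts standing in for the planned models; everything funnels into the same RAG + LLM loop the text path uses today.

```python
from pathlib import Path
from typing import Callable

def caption_image(p: Path) -> str:
    # Placeholder for the planned BLIP-2 captioning expert.
    raise NotImplementedError("BLIP-2 captioning lands with the MoE layer")

def transcribe_audio(p: Path) -> str:
    # Placeholder for the planned faster-whisper / Parakeet ASR expert.
    raise NotImplementedError("ASR lands with the MoE layer")

# Route each artifact to a per-modality "expert" that turns it into text.
EXPERTS: dict[str, Callable[[Path], str]] = {
    ".png": caption_image,
    ".jpg": caption_image,
    ".wav": transcribe_audio,
    ".log": lambda p: p.read_text(),
}

def route(path: Path) -> str:
    # Pick the expert by file type; unknown modalities fall back to raw text.
    handler = EXPERTS.get(path.suffix.lower(), lambda p: p.read_text())
    return handler(path)
```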
Built With
- hp
- hp-ai
- ngc
- python