Inspiration
Modern DevOps teams are inundated with alerts and logs but often lack the tools to quickly triage and remediate incidents. Mean time to resolution (MTTR) suffers as engineers manually sift through noise, perform root cause analysis, and craft fixes. We asked: can AI shoulder this burden end-to-end, from detection to patch proposal, directly within our DevSecOps platform?
What it does
SentinelFlow automatically:
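Ingests logs from manual paste or from LogQL queries (LogQL is Grafana Loki's query language; Prometheus itself is queried with PromQL and stores metrics rather than logs).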
Analyzes logs using Hugging Face Transformers to pinpoint root causes.
Creates GitLab incidents for visibility and tracking.
Generates contextual code or configuration patches based on the AI’s suggested fix.
Pushes a branch and opens a Merge Request in GitLab.
Triggers and monitors the CI pipeline for the fix.
Updates the original incident with RCA, MR link, and pipeline status—all from a Streamlit UI.
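The steps above form a linear pipeline in which each stage enriches a shared incident record. Below is a minimal sketch of that shape, not SentinelFlow's actual code: `IncidentContext`, `run_stages`, and the placeholder stage functions are hypothetical names introduced for illustration, and the stages are stubs rather than real model or GitLab calls.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class IncidentContext:
    """State handed from one stage to the next (field names are illustrative)."""
    raw_logs: str
    data: dict = field(default_factory=dict)
    completed: list = field(default_factory=list)


Stage = Callable[[IncidentContext], None]


def run_stages(ctx: IncidentContext, stages: list[tuple[str, Stage]]) -> IncidentContext:
    """Run stages in order and stop at the first one that raises."""
    for name, stage in stages:
        try:
            stage(ctx)
        except Exception as exc:
            # Record where the workflow stopped so the UI can report it.
            ctx.data["failed_stage"] = name
            ctx.data["error"] = str(exc)
            break
        ctx.completed.append(name)
    return ctx


# Placeholder stages: stand-ins for the model call and the GitLab calls.
def analyze(ctx: IncidentContext) -> None:
    ctx.data["rca"] = f"{len(ctx.raw_logs.splitlines())} log lines analyzed"


def open_incident(ctx: IncidentContext) -> None:
    ctx.data["incident"] = "created"


def generate_patch(ctx: IncidentContext) -> None:
    raise RuntimeError("no fix found in summary")


ctx = run_stages(
    IncidentContext("ERROR db timeout\nERROR retry failed"),
    [
        ("analyze", analyze),
        ("open_incident", open_incident),
        ("generate_patch", generate_patch),
        ("open_mr", lambda c: None),
    ],
)
```

Stopping at the first failed stage keeps a broken patch from reaching the Merge Request step, and the recorded `failed_stage` gives the incident update something concrete to report.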
How we built it
Streamlit for a fast, interactive frontend.
Hugging Face Transformers (DistilBART) for log summarization and RCA.
GitPython & GitLab API for branch, commit, MR, and incident management.
Prometheus API for live log retrieval via LogQL (the query language of Grafana Loki).
GitLab CI API to trigger pipelines and poll status.
Deployed as a microservice, with credentials and endpoints supplied through environment variables.
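As a rough illustration of the GitLab side of this stack, the sketch below builds the REST requests for the incident, Merge Request, and pipeline steps using only the standard library. It is an assumption-laden example, not the project's code: the helper names and the `GITLAB_URL` and `GITLAB_TOKEN` environment variables are hypothetical, while the endpoints themselves (`/issues` with `issue_type`, `/merge_requests`, `/pipeline`, `/pipelines/:id`) are GitLab REST API v4 routes.

```python
import json
import os
import urllib.parse
import urllib.request


def gitlab_request(method, path, payload=None, *, base_url=None, token=None):
    """Build (but do not send) an authenticated GitLab API v4 request."""
    # GITLAB_URL and GITLAB_TOKEN are assumed variable names, not from the project.
    base_url = base_url or os.environ.get("GITLAB_URL", "https://gitlab.com")
    token = token or os.environ.get("GITLAB_TOKEN", "")
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(f"{base_url}/api/v4{path}", data=data, method=method)
    req.add_header("PRIVATE-TOKEN", token)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def project_path(project):
    """Accept a numeric ID or a 'group/name' path (which must be URL-encoded)."""
    return "/projects/" + urllib.parse.quote(str(project), safe="")


def create_incident(project, title, description, **kw):
    body = {"title": title, "description": description, "issue_type": "incident"}
    return gitlab_request("POST", f"{project_path(project)}/issues", body, **kw)


def open_merge_request(project, source_branch, target_branch, title, **kw):
    body = {"source_branch": source_branch, "target_branch": target_branch, "title": title}
    return gitlab_request("POST", f"{project_path(project)}/merge_requests", body, **kw)


def trigger_pipeline(project, ref, **kw):
    return gitlab_request("POST", f"{project_path(project)}/pipeline", {"ref": ref}, **kw)


def pipeline_status(project, pipeline_id, **kw):
    return gitlab_request("GET", f"{project_path(project)}/pipelines/{pipeline_id}", **kw)


req = create_incident(
    "group/app", "DB timeout", "RCA text",
    base_url="https://gitlab.example.com", token="t",
)
```

Each helper returns a `urllib.request.Request`, so sending it is a matter of passing it to `urllib.request.urlopen` and decoding the JSON response. Keeping construction separate from sending makes the requests easy to inspect in tests without touching a live GitLab instance.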
Challenges we ran into
Log Variability: Logs came in many formats; summarization sometimes missed key details.
Fix Extraction: Pulling a "fix" section out of the AI summary with naive parsing was brittle and required regex tuning.
API Rate Limits: Calling the GitLab and Prometheus APIs too quickly triggered rate limiting, so we added backoff strategies.
CI Feedback Loop: Polling CI status reliably without blocking the UI required async handling.
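Two of the challenges above, fix extraction and rate limiting, lend themselves to a short sketch. The example below assumes the summary marks its fix with a line such as "Fix:" or "Suggested fix:" and that rate-limit errors surface as exceptions. The pattern, function names, and default delays are illustrative assumptions, not the tuned values from the project.

```python
import random
import re
import time

# Matches a line starting with "fix:", "suggested fix:" or "recommended fix:"
# and captures text up to the next blank line or the end of the summary.
FIX_PATTERN = re.compile(
    r"^\s*(?:suggested\s+|recommended\s+)?fix\s*[:\-]\s*(?P<fix>.+?)(?=\n\s*\n|\Z)",
    re.IGNORECASE | re.MULTILINE | re.DOTALL,
)


def extract_fix(summary):
    """Return the fix text with whitespace collapsed, or None if absent."""
    match = FIX_PATTERN.search(summary)
    return " ".join(match.group("fix").split()) if match else None


def call_with_backoff(func, *, retries=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep, retry_on=(Exception,)):
    """Call func, retrying with exponential backoff plus jitter on failure."""
    for attempt in range(retries):
        try:
            return func()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from concurrent clients.
            sleep(delay + random.uniform(0, delay / 2))


summary = (
    "Root cause: connection pool exhausted.\n"
    "Suggested fix: raise max_connections\nto 200 in db.yaml.\n\n"
    "Impact: checkout API"
)
fix = extract_fix(summary)

# Demonstrate the retry loop with a function that fails twice, then succeeds.
delays = []
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("429 Too Many Requests")
    return "ok"


result = call_with_backoff(flaky, sleep=delays.append)
```

Returning `None` when no fix section is found lets the caller fall back to human review instead of generating a patch from a guess. Injecting `sleep` keeps the retry logic testable without real waiting.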
Accomplishments that we're proud of
End-to-end automation from alert to MR without manual intervention.
CI integration that completes the loop by verifying patches before merge.
A clean, no-code UI that non-experts can use to kick off incident response.
Demonstrated a 50% reduction in MTTR in our pilot tests.
What we learned
AI can effectively triage and summarize diverse logs, but domain-specific fine-tuning further improves accuracy.
Embedding human-in-the-loop steps for sensitive fixes strikes the right balance between speed and safety.
Streamlit proves powerful for internal tools when paired with robust backend services.
What's next for SentinelFlow
Elasticsearch/ELK integration for richer log contexts and historical trend analysis.
Fine-tuned transformer models on our own incident dataset for higher RCA precision.
Human approval workflows using GitLab’s Code Owners before auto-merge.
Containerization and GitLab CI deployment for scalable, production-ready rollout.
Security policy checks built into the fix generator to ensure patches meet compliance rules.
Built With
- bolt
- gitlab