FLAKYTESTX

Inspiration

Flaky tests are one of the most frustrating problems in modern software development. They waste time, break pipelines, and erode a team's trust in its test suite. We have all seen that one test that fails at random for no apparent reason, and that pain is what inspired us to create FlakyTestX. We wanted a tool that not only detects flaky tests but also helps teams understand and fix them intelligently, efficiently, and automatically.

What it does

FlakyTestX is an AI-powered test quality tool that detects, analyzes, and recommends fixes for flaky tests in your codebase. It works in three core stages:

Detection: Runs your test suite multiple times, calculates a flakiness score for each test, and logs patterns of instability.
AI-Powered Analysis: Sends flaky test logs to an AI model that determines likely causes such as race conditions, shared state, or timing issues.
Actionable Recommendations: Returns specific insights and even code suggestions to help developers resolve test instability quickly.
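
The detection stage can be illustrated with a small sketch. The write-up does not say how the flakiness score is calculated, so this example assumes one plausible definition: the fraction of runs in which a test failed. The function name, the input shape, and the test names are all hypothetical and are not taken from the project's code.

```python
from collections import defaultdict


def flakiness_scores(runs):
    """Return {test_name: failure_rate} across repeated suite runs.

    `runs` is a list of dicts, one per suite execution, mapping a test
    name to True (passed) or False (failed). This input shape is an
    assumption made for the sketch.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for run in runs:
        for test_name, passed in run.items():
            totals[test_name] += 1
            if not passed:
                failures[test_name] += 1
    # Assumed score: failures divided by the number of times the test ran.
    return {name: failures[name] / totals[name] for name in totals}


# Hypothetical outcomes from four executions of the same suite.
runs = [
    {"test_login": True, "test_cache": True},
    {"test_login": True, "test_cache": False},
    {"test_login": True, "test_cache": True},
    {"test_login": True, "test_cache": False},
]
scores = flakiness_scores(runs)
```

Under this definition a test that always passes scores 0.0, and a test that fails in half of the runs scores 0.5.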

It also features a Streamlit dashboard that displays flakiness metrics, charts, test logs, and AI insights.

How we built it

FlakyTestX is built using Python and structured around three major modules:

flaky_detector.py: Executes pytest-based test suites multiple times and logs their stability behavior.
ai_insight_generator.py: Uses OpenAI (or mock responses) to analyze flaky test failures and provide debugging advice.
dashboard.py: Built with Streamlit to visualize results in an interactive and presentable format.
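
The "OpenAI (or mock responses)" behavior might look roughly like the sketch below. It is not the project's actual ai_insight_generator.py: the keyword table, the function names, and the `ask_model` callable (a stand-in for whatever wraps the real API call) are assumptions made for illustration. `OPENAI_API_KEY` is the environment variable the official OpenAI Python client reads by default.

```python
import os

# Hypothetical keyword-to-advice table used only by the offline fallback.
# The three causes mirror the ones named in the write-up.
MOCK_HINTS = {
    "timeout": "Likely timing issue: replace fixed sleeps with explicit waits.",
    "thread": "Possible race condition: synchronize access to shared resources.",
    "already exists": "Possible shared state: isolate fixtures between tests.",
}


def mock_insight(log_text):
    """Return canned advice based on keywords found in a failure log."""
    lowered = log_text.lower()
    for keyword, advice in MOCK_HINTS.items():
        if keyword in lowered:
            return advice
    return "No known pattern matched; inspect the log manually."


def generate_insight(log_text, ask_model=None):
    """Call `ask_model` when it is usable, otherwise use the mock fallback.

    `ask_model` is an assumed callable that takes the log text and returns
    the model's answer as a string.
    """
    if ask_model is None or not os.environ.get("OPENAI_API_KEY"):
        return mock_insight(log_text)
    return ask_model(log_text)
```

Passing the model call in as a parameter keeps the fallback path usable without network access or an API key, which is one way to get the offline behavior described here.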

We also included:

CLI support
.env configuration
JSON-based result files for CI/CD integration
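
As an illustration of how a JSON result can drive a CI/CD pipeline, here is a minimal sketch. The report fields, the `threshold` parameter, and both function names are hypothetical; the project's real result schema is not described in this write-up.

```python
import json


def build_report(scores, threshold=0.0):
    """Build a JSON-serializable report from {test_name: score}.

    Tests whose score exceeds `threshold` are listed under
    "flagged_tests". The field names are assumptions for this sketch.
    """
    flagged = sorted(name for name, score in scores.items() if score > threshold)
    return {"threshold": threshold, "scores": scores, "flagged_tests": flagged}


def exit_code(report):
    """Return 1 if any test was flagged, else 0, so a CI step can fail."""
    return 1 if report["flagged_tests"] else 0


# Hypothetical scores for two tests.
report = build_report({"test_login": 0.0, "test_cache": 0.5}, threshold=0.1)
print(json.dumps(report, indent=2, sort_keys=True))
```

A CI job could write the printed JSON to a file for later steps and pass the value of `exit_code(report)` to `sys.exit` to fail the build when tests are flagged.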

Challenges we ran into

Simulating real-world flaky test conditions in a controlled environment
Ensuring AI responses were context-aware and actionable
Optimizing test run performance
Designing a clear and interactive dashboard
Differentiating between true flaky behavior and consistent failures
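
The last challenge, separating flaky tests from tests that simply always fail, can be sketched with a simple rule: a test is flaky only if it both passed and failed across repeated runs of the same code. This rule and the labels below are an assumption for illustration, not necessarily the logic FlakyTestX uses.

```python
def classify(passes, failures):
    """Label a test from its pass and fail counts across repeated runs."""
    if passes < 0 or failures < 0:
        raise ValueError("counts must be non-negative")
    if passes + failures == 0:
        raise ValueError("the test was never run")
    if failures == 0:
        return "stable"
    if passes == 0:
        # Fails every time: a real bug or a broken test, not flakiness.
        return "consistently failing"
    # Mixed outcomes on unchanged code indicate flaky behavior.
    return "flaky"
```

For example, `classify(7, 3)` gives "flaky", while `classify(0, 10)` gives "consistently failing". With only a few runs, a rarely failing test can still be labeled "stable", so the number of repetitions matters.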

Accomplishments that we're proud of

Fully automated flaky test detection and analysis pipeline
AI-generated insights and fix suggestions
Streamlit dashboard for interactive visualization
Offline-ready fallback using mock AI responses
CI/CD-friendly CLI tools

What we learned

Flaky test behavior often hides in subtle patterns
AI can be a powerful ally in debugging and QA workflows
Combining traditional statistics with LLM insights can reduce dev time
A stable test suite is just as important as a passing one

What's next for FLAKYTESTX

Add support for JavaScript frameworks like Jest and Mocha
Integrate with GitHub to comment on PRs with flaky test info
Visualize flaky test trends over time
Add notifications for flakiness spikes in nightly CI runs
Ship a Dockerized version for enterprise use

Built With

code
dotenv
git
json
openai-api
pytest
python
streamlit
vs

Updates

Adithya N started this project — Apr 02, 2025 02:44 AM EDT
