Dashboard view - Dark Mode
Dashboard view - Light Mode + Signed In
Summarized Thread

Kernel Digest (LKML Dashboard)

Inspiration

The Linux kernel powers 96.3% of the world's top web servers, 71% of mobile devices, and 90% of cloud infrastructure. Yet contributing to it has a massive barrier: the Linux Kernel Mailing List (LKML) receives 500-600+ emails daily, making it nearly impossible for new contributors, students, and developers in underserved regions to follow development discussions.

We were inspired by the realization that this information bottleneck excludes thousands of potential contributors who could make Linux better. If Linux is truly open source, shouldn't its development process be accessible to everyone?

What it does

Kernel Digest transforms chaos into clarity.

AI-Powered Summarization: Uses Google Gemini to generate TL;DR summaries, extract key points, identify affected subsystems, and track discussion outcomes
Intelligent Organization: Automatically reconstructs email threads, sorts by activity level, and enables full-text search across all discussions
Smart Filtering: Browse by subsystem (networking, memory management, filesystems, etc.) or search by keywords
Accessibility: Reduces the time to understand LKML from 3 hours → 15 minutes daily
Auth0 Security: Secure authentication with social login for personalized experiences

The result? Anyone can now follow kernel development, regardless of their time constraints or technical background.

How we built it

Architecture

Frontend (React + Tailwind) → REST API (Flask) → SQLite Database
                                    ↓
                            Gemini AI Summarizer

Backend Stack

Email Parsing: Built custom parsers for EML, Atom feeds, and mbox formats to ingest LKML archives from lore.kernel.org
Thread Reconstruction: Implemented algorithm using In-Reply-To and References headers to build conversation trees
Database Layer: SQLite with FTS5 full-text search indexes for instant querying across 100,000+ emails
AI Integration:
- Integrated Google Gemini 2.5 Flash for cost-effective summarization
- Implemented 90% cost reduction through intelligent caching (cache hit rate: 80-90%)
- Added exponential backoff retry logic for API reliability
- Smart context truncation keeps patches, root emails, and latest replies
REST API: 10 Flask endpoints with CORS, pagination, and filtering support

Frontend Stack

React 18 with functional components and hooks
Tailwind CSS for responsive, accessible design
Auth0 SDK for secure authentication
Lucide React for crisp, modern icons
Dark mode with localStorage persistence

Key Innovation: Smart Caching

We developed a caching system that saves API costs dramatically:

$$\text{Cost Reduction} = 1 - \frac{\text{Cached Requests}}{\text{Total Requests}} \approx 90\%$$

This means processing 82 threads costs ~$0.05 instead of $0.50, making the project sustainable.

Challenges we ran into

1. Email Thread Reconstruction is Hard

Email threading isn't standardized. We had to handle:

Missing or malformed Message-ID headers
Circular references
Out-of-order email delivery
Multiple root emails claiming to be the thread start

Solution: Built a robust algorithm that walks In-Reply-To chains backward, with visited-set cycle detection and fallback to References headers.

2. Gemini API Rate Limits & Costs

Initial testing hit quota limits within minutes, and costs were projected at $9/month for daily processing.

Solution:

Implemented intelligent caching with MD5 hash keys
Added exponential backoff retry (2s, 4s, 8s delays)
Rate limiting: max 15 requests/minute
Result: 90% cost reduction and zero quota errors

3. Tailwind CSS v4 Breaking Changes

Frontend wouldn't compile due to PostCSS plugin changes in Tailwind v4.

Solution: Downgraded to stable Tailwind v3.4.1 and restructured build configuration.

4. UI Showing AI Summary Twice

A data mapping bug caused the same AI summary to appear in both the card preview and the "AI Explanation" box.

Solution: Separated excerpt (generic preview) from simplified (AI summary), ensuring clean data flow from API → state → components.

5. Dark Mode Theme Consistency

Managing theme state across deeply nested components without prop drilling.

Solution: Single darkMode state at App level with localStorage persistence, passed naturally through component tree.

Accomplishments that we're proud of

Technical Achievements

✅ 90% AI cost reduction through intelligent caching architecture
✅ Processed 101 emails into 82 threads with 100% AI coverage
✅ Sub-second full-text search across all emails using SQLite FTS5
✅ Zero-downtime pagination handling 1000+ threads efficiently
✅ Production-ready API with 10 documented endpoints

Social Impact

🌍 Democratizing kernel development for underrepresented groups
📚 Preserving tribal knowledge through AI-powered summaries
⏰ Saving developers 2+ hours daily in email triage
🎓 Lowering barrier to entry for students and new contributors

Code Quality

📖 Comprehensive documentation: 8 detailed guides (Setup, API, Troubleshooting, etc.)
🧪 Error handling: Graceful degradation with retry mechanisms
♿ Accessibility: WCAG-compliant color contrast, keyboard navigation
📱 Responsive design: Mobile-first approach with breakpoints

What we learned

Technical Skills

AI Integration at Scale: How to build production-ready LLM pipelines with caching, retry logic, and cost optimization
Email Standards: Deep dive into RFC 2822, MIME types, threading algorithms
Full-Text Search: SQLite FTS5 virtual tables and ranking algorithms
API Design: RESTful principles, pagination strategies, CORS handling
React State Management: When to lift state, prop drilling alternatives, localStorage for persistence

Soft Skills

Problem Decomposition: Breaking "summarize LKML" into parsers → threads → summaries → API → UI
Debugging Methodology: Systematic troubleshooting from error messages → root cause → fix → test
Documentation: Writing for multiple audiences (users, contributors, judges)
Time Management: Prioritizing MVP features vs. nice-to-haves under hackathon constraints

Surprising Insights

Caching is magical: 90% cost reduction from one smart design decision
Email threading is an NP-hard problem: No perfect solution exists
Users don't read docs: Good error messages > extensive documentation
Dark mode is hard: Consistent theming requires upfront architecture

What's next for Kernel Digest (LKML Dashboard)

Immediate (Post-Hackathon)

[ ] Real-time updates: Webhook integration with lore.kernel.org for instant notifications
[ ] Weekly digests: Automated email summaries of week's hottest discussions
[ ] Contributor analytics: Track who's most active in each subsystem
[ ] Email classification: ML model to categorize patch/bug/RFC/discussion

Short-term (3 months)

[ ] Sentiment analysis: Detect controversial threads using NLP
[ ] Maintainer dashboard: Personalized views for subsystem maintainers
[ ] Mobile app: React Native port for on-the-go access
[ ] Export功能: PDF reports of threads for offline reading
[ ] GitHub integration: Link patches to merged commits

Long-term (6-12 months)

[ ] Multi-list support: Expand beyond LKML to netdev, linux-mm, linux-fsdevel
[ ] Advanced search: Natural language queries ("show me all v6.8 memory leaks")
[ ] Recommendation engine: "You might be interested in these threads"
[ ] Contribution matching: Connect newbies with beginner-friendly patches
[ ] API marketplace: Let others build tools on our infrastructure

Scaling Plan

Docker deployment with docker-compose for easy self-hosting
PostgreSQL migration for production-scale data (1M+ emails)
Redis caching layer for sub-millisecond API responses
CDN integration for global low-latency access
Rate limiting to protect against abuse

Our ultimate goal: Make kernel development accessible to the next generation of contributors, regardless of their background, location, or available time. Because open source should be truly open.

Built With

Submitted to

Open Source Hackfest

Created by

I worked on the backend code in the following ways:
- Create backend that parses and stores email files containing 200+ messages
- Create Gemini API integration to create smart and context aware summaries
- Store the parsed entries + summaries in a SQLite database
- Create a Flask API for accessing the database.

Alp Efe Karalar
I worked on the Front end, and linking the backend with the front end. I helped with download the mbox files in the back end as well. I set up the front end in react, and making sure auth0 was incorporated in the frontend.

Eshaan Arakoni

Updates

Alp Efe Karalar started this project — Oct 19, 2025 09:56 AM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.