Inspiration

In the fast-paced Traditional Trade Channel (Canal Tradicional), a "tendero" dropping their order volume is a silent alarm. By the time a sales representative notices the trend, the client has already churned to a competitor. We defined the problem not just as a lost client, but as a severe, unaddressed drop in historical purchasing volume. We realized that predicting who will leave isn't enough; we need to empower the workforce with exactly what to do to save them, translating complex data into a clear narrative for a non-technical audience.

Churn Ejecter Dashboard

What it does

Churn Ejecter is a high-speed analytical pipeline and visualizer designed to predict and prevent B2B customer churn. It provides actionable business insights by linking risk to active assets (coolers) and geographic territories.

  • Independent Risk Assessment: It analyzes hundreds of thousands of clients, comparing each business strictly against its own historical baseline to extract true business insights.
  • AI Action Plans: For critical clients, the system injects the tabular data into the Google Gemini 2.5 Flash API. The AI acts as a Senior Commercial Strategist, generating 3 actionable bullets with estimated impact for the route promoter, enforcing strict ROI guardrails.
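
As a rough illustration of the AI Action Plans step, the sketch below turns one client's row into a strategist-style prompt and sends it through the google-genai SDK. The field names (`cliente_id`, `porcentaje_caida`, `enfriadores`), the prompt wording, the `GEMINI_API_KEY` environment variable, and the helpers `build_prompt` and `generate_action_plan` are all hypothetical; only the role, the three-bullet format, and the model name come from the description above.

```python
import os


def build_prompt(client_row: dict) -> str:
    """Turn one client's tabular data into a strategist-style prompt.

    The keys in client_row are illustrative assumptions, not the project's schema.
    """
    facts = "\n".join(f"- {key}: {value}" for key, value in client_row.items())
    return (
        "You are a Senior Commercial Strategist.\n"
        "Client data:\n"
        f"{facts}\n"
        "Write exactly 3 actionable bullets for the route promoter, "
        "each with an estimated impact. Only propose actions with a positive ROI."
    )


def generate_action_plan(client_row: dict, model: str = "gemini-2.5-flash") -> str:
    """Send the prompt to Gemini (needs `pip install google-genai` and an API key)."""
    from google import genai  # imported lazily so build_prompt works offline

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(
        model=model, contents=build_prompt(client_row)
    )
    return response.text


# Hypothetical critical client; only the prompt is built here, no API call is made.
example_row = {"cliente_id": "C-001", "porcentaje_caida": -62.5, "enfriadores": 1}
prompt = build_prompt(example_row)
```

Keeping prompt construction separate from the API call makes the prompt easy to inspect and test without network access.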

How we built it

Our core engine uses Python and Pandas for heavy data wrangling and connects securely to MongoDB Atlas, with the code kept clean and reproducible. To build a robust predictive engine, we engineered custom features:

  • Feature Engineering & Null Treatment: We bypassed destructive dropna() functions by conditionally filling missing values for new clients.
  • Handling Outliers & Imbalance: Instead of comparing a massive supermarket to a small corner store, we engineered an independent historical baseline for every single client. We calculated each client's exact percentage change against that baseline using the formula: $$ \Delta\% = \left( \frac{V_{actual} - \mu_{historico}}{\mu_{historico}} \right) \times 100 $$ where $V_{actual}$ is the client's current volume and $\mu_{historico}$ is the client's historical average. A negative $\Delta\%$ means a drop.
```python
import numpy as np

# Dynamic scoring system calibrated with the real historical data of each business.
# Assumes clientes_hoy is a pandas DataFrame with a 'porcentaje_caida' column.
# np.select takes the first matching condition, so thresholds go from most to least severe.
condiciones = [
    (clientes_hoy['porcentaje_caida'] <= -75),  # Critical Collapse (Level 5)
    (clientes_hoy['porcentaje_caida'] <= -50),  # Red Alert (Level 4)
    (clientes_hoy['porcentaje_caida'] <= -25)   # Significant Drop (Level 3)
]
valores_riesgo = [5, 4, 3]
clientes_hoy['calificacion_riesgo'] = np.select(condiciones, valores_riesgo, default=1)
```
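
To make the two bullets above concrete, here is a minimal end-to-end sketch on made-up data. The column names `promedio_historico` and `volumen_actual`, the `es_nuevo` flag, and the fill policy for new clients (using their current volume as their own baseline) are assumptions for illustration, not necessarily what the pipeline does.

```python
import numpy as np
import pandas as pd

# Toy data: historical mean volume and current volume per client (values are made up).
clientes_hoy = pd.DataFrame(
    {
        "cliente_id": ["A", "B", "C", "D"],
        "promedio_historico": [100.0, 2000.0, np.nan, 80.0],
        "volumen_actual": [20.0, 1400.0, 50.0, 80.0],
    }
)

# Conditional fill instead of dropna(): a client with no history (C) is flagged
# as new and its current volume becomes its baseline (an assumed policy).
clientes_hoy["es_nuevo"] = clientes_hoy["promedio_historico"].isna()
clientes_hoy["promedio_historico"] = clientes_hoy["promedio_historico"].fillna(
    clientes_hoy["volumen_actual"]
)

# Percentage change against each client's own baseline (the Delta% formula).
clientes_hoy["porcentaje_caida"] = (
    (clientes_hoy["volumen_actual"] - clientes_hoy["promedio_historico"])
    / clientes_hoy["promedio_historico"]
    * 100
)

# Same thresholds as the scoring rules: first matching condition wins.
condiciones = [
    clientes_hoy["porcentaje_caida"] <= -75,
    clientes_hoy["porcentaje_caida"] <= -50,
    clientes_hoy["porcentaje_caida"] <= -25,
]
clientes_hoy["calificacion_riesgo"] = np.select(condiciones, [5, 4, 3], default=1)
```

The small store A (100 to 20) scores higher risk than the much larger client B (2000 to 1400), because each is measured only against its own history. A baseline of zero would need separate handling, since the division is undefined.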

Challenges we ran into

  • Justifying the Modeling Approach: Initially, we trained a Random Forest classifier. However, its "black-box" nature and scale bias led us to pivot to a highly transparent, heuristic rules engine. This was a strategic choice to prioritize operational precision and business interpretability over unnecessary technical complexity.
  • Data Massacres: We struggled with data loss during the EDA phase when joining datasets, forcing us to rewrite our pipeline to conditionally rescue valid clients.
  • API Migration: We encountered errors because Google updated their Generative AI libraries, forcing us to migrate to the google-genai SDK under intense time pressure.
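
One common cause of the data loss described under "Data Massacres" is an inner join that silently drops every client missing from the secondary table. The sketch below assumes that scenario; the table and column names (`clientes`, `enfriadores`, `num_enfriadores`) are hypothetical and the actual fix in the pipeline may differ.

```python
import pandas as pd

# Hypothetical tables: every client, and the subset that has coolers assigned.
clientes = pd.DataFrame({"cliente_id": [1, 2, 3], "territorio": ["N", "S", "N"]})
enfriadores = pd.DataFrame({"cliente_id": [1, 3], "num_enfriadores": [2, 1]})

# An inner join keeps only clients present in both tables: client 2 disappears.
inner = clientes.merge(enfriadores, on="cliente_id", how="inner")

# A left join keeps every client; indicator=True adds a `_merge` column
# showing which rows had no match, so they can be filled instead of dropped.
rescatado = clientes.merge(enfriadores, on="cliente_id", how="left", indicator=True)
rescatado["num_enfriadores"] = rescatado["num_enfriadores"].fillna(0).astype(int)
```

Comparing `len(inner)` with `len(rescatado)` after each join is a cheap way to catch this kind of loss early in the EDA phase.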

Accomplishments that we're proud of

  • Dynamic Scoring System: We successfully developed a 1-to-5 risk scoring system directly tied to our engineered $\Delta\%$ metric.
  • Mastering Prompt Engineering: We successfully "tamed" the LLM. The AI now rigorously evaluates assets and demands a technical audit when a client's volume drops by 50% or more ($\Delta\% \le -50$) despite having equipment assigned.
  • Reproducibility & Speed: We managed to clean, join, evaluate, and export predictions for nearly 200,000 active clients in seconds, ensuring our codebase is fully reproducible step-by-step.
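
The asset guardrail above is enforced through the prompt. The same rule can also be written as a deterministic check, for instance to pre-compute a flag that is injected into the prompt or to validate the model's answer. This is a sketch under that assumption; `needs_technical_audit` and its parameter names are hypothetical.

```python
def needs_technical_audit(porcentaje_caida: float, num_enfriadores: int) -> bool:
    """Return True when volume fell by 50% or more despite assigned equipment."""
    return porcentaje_caida <= -50 and num_enfriadores > 0


# A client down 62.5% with one cooler assigned should trigger an audit;
# the same drop without equipment should not.
flag_with_cooler = needs_technical_audit(-62.5, 1)
flag_without_cooler = needs_technical_audit(-62.5, 0)
```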

What we learned

  • Context is everything: An 80% drop in sales for a small business is just as critical as an equal drop in a massive supermarket.
  • RAG transforms LLMs: Injecting structured data completely changes the game, turning a text generator into a hyper-focused corporate strategist.

What's next for Churn Ejecter

The next step is deploying the visualizer as a progressive web app (PWA) directly into the hands of Arca Continental's route promoters. We also plan to add a feedback loop to the interface that tracks which AI-generated tactics actually recovered the client, continuously refining the retention engine's metrics.
