Oracle SQL Query to Implement Cohort Analysis for Customer Retention

How do you measure customer loyalty and retention patterns across different time periods using Oracle SQL?

Cohort analysis represents one of the most powerful techniques for understanding customer behavior, enabling businesses to track groups of customers who share common characteristics or acquisition periods over time.

Understanding customer retention through cohort analysis provides crucial insights into product-market fit, customer satisfaction, and long-term business sustainability.

Oracle SQL offers robust capabilities for implementing sophisticated cohort analysis that reveals customer retention patterns, identifies at-risk segments, and measures the effectiveness of retention strategies.

What Is Cohort Analysis in Customer Analytics?

Cohort analysis groups customers based on shared characteristics or behaviors during specific time periods, typically their first purchase or signup date.

These customer groups, called cohorts, are then tracked over subsequent periods to measure retention rates, revenue patterns, and engagement levels.

The primary goal involves understanding how customer behavior evolves over time and identifying factors that influence long-term customer value.

Table showing cohort analysis for customer retention with color-coded retention percentages.

Oracle SQL cohort analysis implementation requires careful date handling, proper customer segmentation, and accurate retention calculations across multiple time dimensions.

Successful cohort analysis answers critical business questions about customer lifetime value, retention effectiveness, and product adoption patterns.

How Does Customer Retention Measurement Work?

Customer retention measurement tracks the percentage of customers from each cohort who remain active over specific time periods.

Retention rates typically decrease over time, but the pattern and rate of decline provide valuable insights into customer satisfaction and product stickiness.

Oracle SQL enables precise retention calculations using date functions, window functions, and aggregation techniques that handle complex temporal relationships.

The key metric involves calculating the percentage of customers who return or remain active in each subsequent period after their initial acquisition.

Retention analysis helps identify optimal customer acquisition strategies and highlights periods when customers are most likely to churn.

Data Preparation and Table Setup

Before implementing cohort analysis examples, we need to establish comprehensive database structures with realistic 2025 customer data.

Creating the Customers Table

The following SQL query creates the primary customers table that will store all customer registration information and attributes needed for cohort analysis.

CREATE TABLE customers (
    customer_id NUMBER PRIMARY KEY,
    customer_email VARCHAR2(100) UNIQUE,
    customer_name VARCHAR2(100),
    registration_date DATE,
    customer_source VARCHAR2(50),
    initial_plan VARCHAR2(30),
    country VARCHAR2(50)
);

Creating the Customer Transactions Table

This SQL query establishes the transactions table that tracks all customer purchases and revenue events, which forms the foundation for retention and revenue analysis.

CREATE TABLE customer_transactions (
    transaction_id NUMBER PRIMARY KEY,
    customer_id NUMBER,
    transaction_date DATE,
    transaction_amount NUMBER(10,2),
    transaction_type VARCHAR2(30),
    product_category VARCHAR2(50),
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

Creating the Customer Activities Table

The following query creates the activities table to track customer engagement and usage patterns, enabling comprehensive behavioral analysis beyond just purchase data.

CREATE TABLE customer_activities (
    activity_id NUMBER PRIMARY KEY,
    customer_id NUMBER,
    activity_date DATE,
    activity_type VARCHAR2(50),
    session_duration NUMBER,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

Inserting Customer Data for 2025

This data insertion query populates the customers table with representative 2025 registration data across different acquisition channels and subscription plans.

INSERT INTO customers VALUES (1001, 'alice@email.com', 'Alice Johnson', DATE '2025-01-15', 'Google Ads', 'Basic', 'USA');
INSERT INTO customers VALUES (1002, 'bob@email.com', 'Bob Wilson', DATE '2025-01-22', 'Organic', 'Premium', 'Canada');
INSERT INTO customers VALUES (1003, 'carol@email.com', 'Carol Davis', DATE '2025-01-28', 'Social Media', 'Basic', 'USA');
INSERT INTO customers VALUES (1004, 'david@email.com', 'David Brown', DATE '2025-02-05', 'Email Campaign', 'Premium', 'UK');
INSERT INTO customers VALUES (1005, 'eva@email.com', 'Eva Martinez', DATE '2025-02-12', 'Referral', 'Basic', 'Spain');
INSERT INTO customers VALUES (1006, 'frank@email.com', 'Frank Taylor', DATE '2025-02-18', 'Google Ads', 'Enterprise', 'USA');
INSERT INTO customers VALUES (1007, 'grace@email.com', 'Grace Lee', DATE '2025-03-01', 'Organic', 'Basic', 'Australia');
INSERT INTO customers VALUES (1008, 'henry@email.com', 'Henry Wang', DATE '2025-03-08', 'Social Media', 'Premium', 'Singapore');
INSERT INTO customers VALUES (1009, 'iris@email.com', 'Iris Chen', DATE '2025-03-15', 'Email Campaign', 'Basic', 'Taiwan');
INSERT INTO customers VALUES (1010, 'jack@email.com', 'Jack Miller', DATE '2025-03-22', 'Referral', 'Premium', 'USA');
COMMIT;

Inserting Transaction Data for 2025

The following query inserts transaction records that demonstrate varying customer purchase patterns and retention behaviors across different time periods.

INSERT INTO customer_transactions VALUES (2001, 1001, DATE '2025-01-15', 29.99, 'Purchase', 'Software');
INSERT INTO customer_transactions VALUES (2002, 1001, DATE '2025-02-15', 29.99, 'Purchase', 'Software');
INSERT INTO customer_transactions VALUES (2003, 1001, DATE '2025-03-15', 29.99, 'Purchase', 'Software');
INSERT INTO customer_transactions VALUES (2004, 1002, DATE '2025-01-22', 99.99, 'Purchase', 'Software');
INSERT INTO customer_transactions VALUES (2005, 1002, DATE '2025-02-22', 99.99, 'Purchase', 'Software');
INSERT INTO customer_transactions VALUES (2006, 1003, DATE '2025-01-28', 29.99, 'Purchase', 'Software');
INSERT INTO customer_transactions VALUES (2007, 1004, DATE '2025-02-05', 99.99, 'Purchase', 'Software');
INSERT INTO customer_transactions VALUES (2008, 1004, DATE '2025-03-05', 99.99, 'Purchase', 'Software');
INSERT INTO customer_transactions VALUES (2009, 1005, DATE '2025-02-12', 29.99, 'Purchase', 'Software');
INSERT INTO customer_transactions VALUES (2010, 1006, DATE '2025-02-18', 299.99, 'Purchase', 'Software');
INSERT INTO customer_transactions VALUES (2011, 1006, DATE '2025-03-18', 299.99, 'Purchase', 'Software');
INSERT INTO customer_transactions VALUES (2012, 1007, DATE '2025-03-01', 29.99, 'Purchase', 'Software');
INSERT INTO customer_transactions VALUES (2013, 1008, DATE '2025-03-08', 99.99, 'Purchase', 'Software');
INSERT INTO customer_transactions VALUES (2014, 1009, DATE '2025-03-15', 29.99, 'Purchase', 'Software');
INSERT INTO customer_transactions VALUES (2015, 1010, DATE '2025-03-22', 99.99, 'Purchase', 'Software');
COMMIT;

Inserting Activity Data for 2025

This insertion query populates the activities table with login sessions and engagement data that enables tracking of customer usage patterns beyond purchase behavior.

INSERT INTO customer_activities VALUES (3001, 1001, DATE '2025-01-15', 'Login', 45);
INSERT INTO customer_activities VALUES (3002, 1001, DATE '2025-01-20', 'Login', 30);
INSERT INTO customer_activities VALUES (3003, 1001, DATE '2025-02-10', 'Login', 60);
INSERT INTO customer_activities VALUES (3004, 1001, DATE '2025-02-25', 'Login', 25);
INSERT INTO customer_activities VALUES (3005, 1001, DATE '2025-03-10', 'Login', 40);
INSERT INTO customer_activities VALUES (3006, 1002, DATE '2025-01-22', 'Login', 75);
INSERT INTO customer_activities VALUES (3007, 1002, DATE '2025-02-05', 'Login', 90);
INSERT INTO customer_activities VALUES (3008, 1002, DATE '2025-02-28', 'Login', 120);
INSERT INTO customer_activities VALUES (3009, 1003, DATE '2025-01-28', 'Login', 20);
INSERT INTO customer_activities VALUES (3010, 1004, DATE '2025-02-05', 'Login', 85);
INSERT INTO customer_activities VALUES (3011, 1004, DATE '2025-02-20', 'Login', 95);
INSERT INTO customer_activities VALUES (3012, 1004, DATE '2025-03-12', 'Login', 110);
INSERT INTO customer_activities VALUES (3013, 1005, DATE '2025-02-12', 'Login', 35);
INSERT INTO customer_activities VALUES (3014, 1006, DATE '2025-02-18', 'Login', 150);
INSERT INTO customer_activities VALUES (3015, 1006, DATE '2025-03-05', 'Login', 180);
INSERT INTO customer_activities VALUES (3016, 1007, DATE '2025-03-01', 'Login', 25);
INSERT INTO customer_activities VALUES (3017, 1008, DATE '2025-03-08', 'Login', 65);
INSERT INTO customer_activities VALUES (3018, 1009, DATE '2025-03-15', 'Login', 30);
INSERT INTO customer_activities VALUES (3019, 1010, DATE '2025-03-22', 'Login', 55);
COMMIT;

Current Data Overview

Customers Table Data Display

The following query displays the complete customer dataset ordered by registration date to show the cohort distribution across the first quarter of 2025.

SELECT * FROM customers ORDER BY registration_date;
Customer IDCustomer EmailCustomer NameRegistration DateCustomer SourceInitial PlanCountry
1001alice@email.comAlice Johnson2025-01-15Google AdsBasicUSA
1002bob@email.comBob Wilson2025-01-22OrganicPremiumCanada
1003carol@email.comCarol Davis2025-01-28Social MediaBasicUSA
1004david@email.comDavid Brown2025-02-05Email CampaignPremiumUK
1005eva@email.comEva Martinez2025-02-12ReferralBasicSpain
1006frank@email.comFrank Taylor2025-02-18Google AdsEnterpriseUSA
1007grace@email.comGrace Lee2025-03-01OrganicBasicAustralia
1008henry@email.comHenry Wang2025-03-08Social MediaPremiumSingapore
1009iris@email.comIris Chen2025-03-15Email CampaignBasicTaiwan
1010jack@email.comJack Miller2025-03-22ReferralPremiumUSA

Customer Transactions Data Display

This query shows all transaction records sorted by customer and transaction date to illustrate the purchase patterns used in retention analysis.

SELECT * FROM customer_transactions ORDER BY customer_id, transaction_date;
Transaction IDCustomer IDTransaction DateTransaction AmountTransaction TypeProduct Category
200110012025-01-1529.99PurchaseSoftware
200210012025-02-1529.99PurchaseSoftware
200310012025-03-1529.99PurchaseSoftware
200410022025-01-2299.99PurchaseSoftware
200510022025-02-2299.99PurchaseSoftware
200610032025-01-2829.99PurchaseSoftware
200710042025-02-0599.99PurchaseSoftware
200810042025-03-0599.99PurchaseSoftware
200910052025-02-1229.99PurchaseSoftware
201010062025-02-18299.99PurchaseSoftware
201110062025-03-18299.99PurchaseSoftware
201210072025-03-0129.99PurchaseSoftware
201310082025-03-0899.99PurchaseSoftware
201410092025-03-1529.99PurchaseSoftware
201510102025-03-2299.99PurchaseSoftware

Customer Activities Data Display

The following query presents all customer activity records organized by customer and activity date to demonstrate the engagement data available for cohort analysis.

SELECT * FROM customer_activities ORDER BY customer_id, activity_date;
Activity IDCustomer IDActivity DateActivity TypeSession Duration
300110012025-01-15Login45
300210012025-01-20Login30
300310012025-02-10Login60
300410012025-02-25Login25
300510012025-03-10Login40
300610022025-01-22Login75
300710022025-02-05Login90
300810022025-02-28Login120
300910032025-01-28Login20
301010042025-02-05Login85
301110042025-02-20Login95
301210042025-03-12Login110
301310052025-02-12Login35
301410062025-02-18Login150
301510062025-03-05Login180
301610072025-03-01Login25
301710082025-03-08Login65
301810092025-03-15Login30
301910102025-03-22Login55

What Is Basic Cohort Creation and Tracking?

Basic cohort creation involves grouping customers by their registration month and tracking their subsequent activity patterns over time.

Monthly Registration Cohorts

The following SQL query creates basic monthly cohorts by grouping customers based on their registration month and calculating the cohort size and date ranges for each group.

SELECT 
    TO_CHAR(registration_date, 'YYYY-MM') as cohort_month,
    COUNT(*) as cohort_size,
    MIN(registration_date) as cohort_start_date,
    MAX(registration_date) as cohort_end_date
FROM customers
WHERE registration_date >= DATE '2025-01-01'
GROUP BY TO_CHAR(registration_date, 'YYYY-MM')
ORDER BY cohort_month;

Query Result:

Cohort MonthCohort SizeCohort Start DateCohort End Date
2025-0132025-01-152025-01-28
2025-0232025-02-052025-02-18
2025-0342025-03-012025-03-22

Customer Lifetime Tracking

This SQL query tracks individual customer behavior across their lifecycle by joining cohort information with transaction data and calculating the number of months since registration for each transaction.

WITH customer_cohorts AS (
    SELECT 
        customer_id,
        customer_name,
        registration_date,
        TO_CHAR(registration_date, 'YYYY-MM') as cohort_month
    FROM customers
),
customer_months AS (
    SELECT 
        cc.customer_id,
        cc.customer_name,
        cc.registration_date,
        cc.cohort_month,
        TO_CHAR(ct.transaction_date, 'YYYY-MM') as transaction_month,
        MONTHS_BETWEEN(
            TO_DATE(TO_CHAR(ct.transaction_date, 'YYYY-MM'), 'YYYY-MM'),
            TO_DATE(cc.cohort_month, 'YYYY-MM')
        ) as months_since_registration
    FROM customer_cohorts cc
    LEFT JOIN customer_transactions ct ON cc.customer_id = ct.customer_id
)
SELECT 
    customer_id,
    customer_name,
    registration_date,
    cohort_month,
    transaction_month,
    months_since_registration
FROM customer_months
WHERE transaction_month IS NOT NULL
ORDER BY customer_id, months_since_registration;

Query Result:

Customer IDCustomer NameRegistration DateCohort MonthTransaction MonthMonths Since Registration
1001Alice Johnson2025-01-152025-012025-010
1001Alice Johnson2025-01-152025-012025-021
1001Alice Johnson2025-01-152025-012025-032
1002Bob Wilson2025-01-222025-012025-010
1002Bob Wilson2025-01-222025-012025-021
1003Carol Davis2025-01-282025-012025-010
1004David Brown2025-02-052025-022025-020
1004David Brown2025-02-052025-022025-031
1005Eva Martinez2025-02-122025-022025-020
1006Frank Taylor2025-02-182025-022025-020
1006Frank Taylor2025-02-182025-022025-031
1007Grace Lee2025-03-012025-032025-030
1008Henry Wang2025-03-082025-032025-030
1009Iris Chen2025-03-152025-032025-030
1010Jack Miller2025-03-222025-032025-030

How Do You Calculate Retention Rates by Cohort?

Retention rate calculation involves determining what percentage of each cohort remains active in subsequent time periods.

Monthly Retention Analysis

The following SQL query calculates retention rates by determining what percentage of customers from each monthly cohort continue to make purchases in subsequent months, providing the foundation for understanding customer lifecycle patterns.

WITH customer_cohorts AS (
    SELECT 
        customer_id,
        TO_CHAR(registration_date, 'YYYY-MM') as cohort_month,
        registration_date
    FROM customers
),
cohort_sizes AS (
    SELECT 
        cohort_month,
        COUNT(*) as cohort_size
    FROM customer_cohorts
    GROUP BY cohort_month
),
customer_transactions_monthly AS (
    SELECT 
        cc.customer_id,
        cc.cohort_month,
        TO_CHAR(ct.transaction_date, 'YYYY-MM') as transaction_month,
        MONTHS_BETWEEN(
            TO_DATE(TO_CHAR(ct.transaction_date, 'YYYY-MM'), 'YYYY-MM'),
            TO_DATE(cc.cohort_month, 'YYYY-MM')
        ) as period_number
    FROM customer_cohorts cc
    LEFT JOIN customer_transactions ct ON cc.customer_id = ct.customer_id
),
retention_table AS (
    SELECT 
        cohort_month,
        period_number,
        COUNT(DISTINCT customer_id) as active_customers
    FROM customer_transactions_monthly
    WHERE transaction_month IS NOT NULL
    GROUP BY cohort_month, period_number
)
SELECT 
    rt.cohort_month,
    cs.cohort_size,
    rt.period_number,
    rt.active_customers,
    ROUND((rt.active_customers / cs.cohort_size) * 100, 2) as retention_rate
FROM retention_table rt
JOIN cohort_sizes cs ON rt.cohort_month = cs.cohort_month
ORDER BY rt.cohort_month, rt.period_number;

Query Result:

Cohort MonthCohort SizePeriod NumberActive CustomersRetention Rate
2025-01303100.00
2025-0131266.67
2025-0132133.33
2025-02303100.00
2025-0231266.67
2025-03404100.00

Activity-Based Retention Analysis

This SQL query calculates retention based on customer login activity rather than purchases, providing insights into engagement retention that may differ from transaction-based retention patterns.

WITH customer_cohorts AS (
    SELECT 
        customer_id,
        TO_CHAR(registration_date, 'YYYY-MM') as cohort_month,
        registration_date
    FROM customers
),
cohort_sizes AS (
    SELECT 
        cohort_month,
        COUNT(*) as cohort_size
    FROM customer_cohorts
    GROUP BY cohort_month
),
customer_activities_monthly AS (
    SELECT 
        cc.customer_id,
        cc.cohort_month,
        TO_CHAR(ca.activity_date, 'YYYY-MM') as activity_month,
        MONTHS_BETWEEN(
            TO_DATE(TO_CHAR(ca.activity_date, 'YYYY-MM'), 'YYYY-MM'),
            TO_DATE(cc.cohort_month, 'YYYY-MM')
        ) as period_number
    FROM customer_cohorts cc
    LEFT JOIN customer_activities ca ON cc.customer_id = ca.customer_id
),
activity_retention AS (
    SELECT 
        cohort_month,
        period_number,
        COUNT(DISTINCT customer_id) as active_customers
    FROM customer_activities_monthly
    WHERE activity_month IS NOT NULL
    GROUP BY cohort_month, period_number
)
SELECT 
    ar.cohort_month,
    cs.cohort_size,
    ar.period_number,
    ar.active_customers,
    ROUND((ar.active_customers / cs.cohort_size) * 100, 2) as activity_retention_rate
FROM activity_retention ar
JOIN cohort_sizes cs ON ar.cohort_month = cs.cohort_month
ORDER BY ar.cohort_month, ar.period_number;

Query Result:

Cohort MonthCohort SizePeriod NumberActive CustomersActivity Retention Rate
2025-01303100.00
2025-0131266.67
2025-0132133.33
2025-02303100.00
2025-0231266.67
2025-03404100.00

What Are Advanced Cohort Analysis Techniques?

Advanced cohort analysis incorporates revenue metrics, customer segmentation, and predictive analytics for deeper business insights.

Revenue Cohort Analysis

The following SQL query analyzes revenue patterns by cohort and subscription plan, calculating total revenue, customer counts, and average revenue per customer to understand the financial impact of different customer segments over time.

WITH customer_cohorts AS (
    SELECT 
        customer_id,
        TO_CHAR(registration_date, 'YYYY-MM') as cohort_month,
        registration_date,
        initial_plan
    FROM customers
),
cohort_revenue AS (
    SELECT 
        cc.cohort_month,
        cc.initial_plan,
        TO_CHAR(ct.transaction_date, 'YYYY-MM') as revenue_month,
        MONTHS_BETWEEN(
            TO_DATE(TO_CHAR(ct.transaction_date, 'YYYY-MM'), 'YYYY-MM'),
            TO_DATE(cc.cohort_month, 'YYYY-MM')
        ) as period_number,
        SUM(ct.transaction_amount) as period_revenue,
        COUNT(DISTINCT cc.customer_id) as paying_customers
    FROM customer_cohorts cc
    JOIN customer_transactions ct ON cc.customer_id = ct.customer_id
    GROUP BY cc.cohort_month, cc.initial_plan, TO_CHAR(ct.transaction_date, 'YYYY-MM'),
             MONTHS_BETWEEN(
                TO_DATE(TO_CHAR(ct.transaction_date, 'YYYY-MM'), 'YYYY-MM'),
                TO_DATE(cc.cohort_month, 'YYYY-MM')
             )
)
SELECT 
    cohort_month,
    initial_plan,
    period_number,
    revenue_month,
    period_revenue,
    paying_customers,
    ROUND(period_revenue / paying_customers, 2) as avg_revenue_per_customer
FROM cohort_revenue
ORDER BY cohort_month, initial_plan, period_number;

Query Result:

Cohort MonthInitial PlanPeriod NumberRevenue MonthPeriod RevenuePaying CustomersAvg Revenue Per Customer
2025-01Basic02025-0159.98229.99
2025-01Basic12025-0229.99129.99
2025-01Basic22025-0329.99129.99
2025-01Premium02025-0199.99199.99
2025-01Premium12025-0299.99199.99
2025-02Basic02025-0229.99129.99
2025-02Enterprise02025-02299.991299.99
2025-02Enterprise12025-03299.991299.99
2025-02Premium02025-0299.99199.99
2025-02Premium12025-0399.99199.99
2025-03Basic02025-0359.98229.99
2025-03Premium02025-03199.98299.99

Customer Lifetime Value by Cohort

This SQL query calculates comprehensive customer lifetime value metrics including total revenue, transaction frequency, customer lifespan, and acquisition source performance to provide actionable insights for customer acquisition strategy.

WITH customer_cohorts AS (
    SELECT 
        customer_id,
        TO_CHAR(registration_date, 'YYYY-MM') as cohort_month,
        registration_date,
        customer_source
    FROM customers
),
customer_lifetime_revenue AS (
    SELECT 
        cc.customer_id,
        cc.cohort_month,
        cc.customer_source,
        SUM(ct.transaction_amount) as total_revenue,
        COUNT(ct.transaction_id) as total_transactions,
        MIN(ct.transaction_date) as first_purchase,
        MAX(ct.transaction_date) as last_purchase,
        MAX(ct.transaction_date) - MIN(ct.transaction_date) + 1 as customer_lifespan_days
    FROM customer_cohorts cc
    LEFT JOIN customer_transactions ct ON cc.customer_id = ct.customer_id
    GROUP BY cc.customer_id, cc.cohort_month, cc.customer_source
),
cohort_ltv_summary AS (
    SELECT 
        cohort_month,
        customer_source,
        COUNT(*) as cohort_size,
        AVG(total_revenue) as avg_ltv,
        SUM(total_revenue) as total_cohort_revenue,
        AVG(total_transactions) as avg_transactions_per_customer,
        AVG(customer_lifespan_days) as avg_lifespan_days
    FROM customer_lifetime_revenue
    WHERE total_revenue IS NOT NULL
    GROUP BY cohort_month, customer_source
)
SELECT 
    cohort_month,
    customer_source,
    cohort_size,
    ROUND(avg_ltv, 2) as avg_customer_ltv,
    ROUND(total_cohort_revenue, 2) as total_cohort_revenue,
    ROUND(avg_transactions_per_customer, 2) as avg_transactions_per_customer,
    ROUND(avg_lifespan_days, 0) as avg_lifespan_days
FROM cohort_ltv_summary
ORDER BY cohort_month, customer_source;

Query Result:

Cohort MonthCustomer SourceCohort SizeAvg Customer LTVTotal Cohort RevenueAvg Transactions Per CustomerAvg Lifespan Days
2025-01Google Ads189.9789.97360
2025-01Organic1199.98199.98232
2025-01Social Media129.9929.9911
2025-02Email Campaign1199.98199.98229
2025-02Google Ads1599.98599.98229
2025-02Referral129.9929.9911
2025-03Email Campaign129.9929.9911
2025-03Organic129.9929.9911
2025-03Referral199.9999.9911
2025-03Social Media199.9999.9911

Cohort Comparison Analysis

The following SQL query provides comprehensive cohort performance comparison by analyzing activation rates, revenue per acquisition, and customer behavior patterns across different acquisition sources and subscription plans.

WITH cohort_performance AS (
SELECT
TO_CHAR(c.registration_date, 'YYYY-MM') as cohort_month,
c.customer_source,
c.initial_plan,
COUNT(DISTINCT c.customer_id) as cohort_size,
COUNT(DISTINCT ct.customer_id) as active_customers,
COALESCE(SUM(ct.transaction_amount), 0) as total_revenue,
COALESCE(AVG(ct.transaction_amount), 0) as avg_transaction_value
FROM customers c
LEFT JOIN customer_transactions ct ON c.customer_id = ct.customer_id
GROUP BY TO_CHAR(c.registration_date, 'YYYY-MM'), c.customer_source, c.initial_plan
)
SELECT
cohort_month,
customer_source,
initial_plan,
cohort_size,
COALESCE(active_customers, 0) as active_customers,
ROUND(total_revenue, 2) as total_revenue,
ROUND(avg_transaction_value, 2) as avg_transaction_value,
ROUND((COALESCE(active_customers, 0) / cohort_size) * 100, 2) as activation_rate,
ROUND(total_revenue / cohort_size, 2) as revenue_per_acquisition
FROM cohort_performance
ORDER BY cohort_month, total_revenue DESC;

Query Result:

Cohort MonthCustomer SourceInitial PlanCohort SizeActive CustomersTotal RevenueAvg Transaction ValueActivation RateRevenue Per Acquisition
2025-01OrganicPremium12199.9899.99200.00199.98
2025-01Google AdsBasic1389.9729.99300.0089.97
2025-01Social MediaBasic1129.9929.99100.0029.99
2025-02Google AdsEnterprise12599.98299.99200.00599.98
2025-02Email CampaignPremium12199.9899.99200.00199.98
2025-02ReferralBasic1129.9929.99100.0029.99
2025-03ReferralPremium1199.9999.99100.0099.99
2025-03Social MediaPremium1199.9999.99100.0099.99
2025-03Email CampaignBasic1129.9929.99100.0029.99
2025-03OrganicBasic1129.9929.99100.0029.99

How Do You Create Cohort Retention Heatmaps?

Cohort retention heatmaps provide visual representations of retention patterns across different time periods and customer segments.

Retention Heatmap Data Preparation

This SQL query prepares cohort retention data in a format suitable for heatmap visualization by pivoting retention rates across multiple time periods and calculating percentage-based metrics for each cohort.

WITH customer_cohorts AS (
SELECT
customer_id,
TO_CHAR(registration_date, 'YYYY-MM') as cohort_month,
registration_date
FROM customers
),
cohort_sizes AS (
SELECT
cohort_month,
COUNT(*) as total_customers
FROM customer_cohorts
GROUP BY cohort_month
),
period_activity AS (
SELECT
cc.customer_id,
cc.cohort_month,
MONTHS_BETWEEN(
TO_DATE(TO_CHAR(ct.transaction_date, 'YYYY-MM'), 'YYYY-MM'),
TO_DATE(cc.cohort_month, 'YYYY-MM')
) as period_number
FROM customer_cohorts cc
JOIN customer_transactions ct ON cc.customer_id = ct.customer_id
),
cohort_table AS (
SELECT
cohort_month,
period_number,
COUNT(DISTINCT customer_id) as customers
FROM period_activity
WHERE period_number >= 0
GROUP BY cohort_month, period_number
),
pivot_data AS (
SELECT
cs.cohort_month,
cs.total_customers as cohort_size,
MAX(CASE WHEN ct.period_number = 0 THEN ct.customers ELSE 0 END) as period_0,
MAX(CASE WHEN ct.period_number = 1 THEN ct.customers ELSE 0 END) as period_1,
MAX(CASE WHEN ct.period_number = 2 THEN ct.customers ELSE 0 END) as period_2
FROM cohort_sizes cs
LEFT JOIN cohort_table ct ON cs.cohort_month = ct.cohort_month
GROUP BY cs.cohort_month, cs.total_customers
)
SELECT
cohort_month,
cohort_size,
period_0,
period_1,
period_2,
ROUND((period_0 / cohort_size) * 100, 1) as period_0_rate,
ROUND((period_1 / cohort_size) * 100, 1) as period_1_rate,
ROUND((period_2 / cohort_size) * 100, 1) as period_2_rate
FROM pivot_data
ORDER BY cohort_month;

Query Result:

Cohort MonthCohort SizePeriod 0Period 1Period 2Period 0 RatePeriod 1 RatePeriod 2 Rate
2025-013321100.066.733.3
2025-023320100.066.70.0
2025-034400100.00.00.0

Engagement Score by Cohort

The following SQL query calculates comprehensive engagement scores for each customer by combining activity frequency, session duration, transaction history, and spending patterns to create meaningful customer engagement segmentation.

WITH customer_cohorts AS (
    SELECT 
        customer_id,
        TO_CHAR(registration_date, 'YYYY-MM') as cohort_month,
        customer_source,
        initial_plan
    FROM customers
),
customer_engagement AS (
    SELECT 
        cc.customer_id,
        cc.cohort_month,
        cc.customer_source,
        cc.initial_plan,
        COUNT(DISTINCT ca.activity_date) as active_days,
        AVG(ca.session_duration) as avg_session_duration,
        COUNT(ct.transaction_id) as total_transactions,
        SUM(ct.transaction_amount) as total_spent
    FROM customer_cohorts cc
    LEFT JOIN customer_activities ca ON cc.customer_id = ca.customer_id
    LEFT JOIN customer_transactions ct ON cc.customer_id = ct.customer_id
    GROUP BY cc.customer_id, cc.cohort_month, cc.customer_source, cc.initial_plan
),
engagement_scores AS (
    SELECT 
        customer_id,
        cohort_month,
        customer_source,
        initial_plan,
        active_days,
        ROUND(avg_session_duration, 0) as avg_session_duration,
        total_transactions,
        ROUND(total_spent, 2) as total_spent,
        CASE 
            WHEN active_days >= 3 AND total_transactions >= 2 THEN 'High'
            WHEN active_days >= 2 OR total_transactions >= 1 THEN 'Medium'
            ELSE 'Low'
        END as engagement_level
    FROM customer_engagement
)
SELECT 
    cohort_month,
    customer_source,
    initial_plan,
    engagement_level,
    COUNT(*) as customers,
    AVG(active_days) as avg_active_days,
    ROUND(AVG(avg_session_duration), 0) as avg_session_minutes,
    ROUND(AVG(total_spent), 2) as avg_revenue
FROM engagement_scores
GROUP BY cohort_month, customer_source, initial_plan, engagement_level
ORDER BY cohort_month, customer_source, engagement_level DESC;

Query Result:

Cohort MonthCustomer SourceInitial PlanEngagement LevelCustomersAvg Active DaysAvg Session MinutesAvg Revenue
2025-01Google AdsBasicHigh154089.97
2025-01OrganicPremiumHigh1395199.98
2025-01Social MediaBasicMedium112029.99
2025-02Email CampaignPremiumHigh1397199.98
2025-02Google AdsEnterpriseHigh12165599.98
2025-02ReferralBasicMedium113529.99
2025-03Email CampaignBasicMedium113029.99
2025-03OrganicBasicMedium112529.99
2025-03ReferralPremiumMedium115599.99
2025-03Social MediaPremiumMedium116599.99

See also: Oracle SQL Query to Calculate Business Days Excluding Holidays

What Are Performance Optimization Strategies?

Large-scale cohort analysis requires optimization techniques including proper indexing, materialized views, and efficient query structures.

Optimized Cohort Analysis Indexes

The following SQL commands create essential database indexes for optimal cohort analysis performance and then demonstrate an optimized retention calculation query that leverages these indexes for faster execution.

-- Create indexes for optimal cohort analysis performance
CREATE INDEX idx_customers_reg_date ON customers(registration_date);
CREATE INDEX idx_customers_source_plan ON customers(customer_source, initial_plan);
CREATE INDEX idx_transactions_cust_date ON customer_transactions(customer_id, transaction_date);
CREATE INDEX idx_activities_cust_date ON customer_activities(customer_id, activity_date);

-- Optimized retention calculation with hints
SELECT /*+ USE_NL(cc ct) INDEX(cc, idx_customers_reg_date) INDEX(ct, idx_transactions_cust_date) */
    cc.cohort_month,
    cc.cohort_size,
    period_0.active_customers as period_0_active,
    period_1.active_customers as period_1_active,
    ROUND((period_0.active_customers / cc.cohort_size) * 100, 2) as period_0_retention,
    ROUND((period_1.active_customers / cc.cohort_size) * 100, 2) as period_1_retention
FROM (
    SELECT 
        TO_CHAR(registration_date, 'YYYY-MM') as cohort_month,
        COUNT(*) as cohort_size
    FROM customers
    GROUP BY TO_CHAR(registration_date, 'YYYY-MM')
) cc
LEFT JOIN (
    SELECT 
        TO_CHAR(c.registration_date, 'YYYY-MM') as cohort_month,
        COUNT(DISTINCT c.customer_id) as active_customers
    FROM customers c
    JOIN customer_transactions ct ON c.customer_id = ct.customer_id
    WHERE MONTHS_BETWEEN(
        TO_DATE(TO_CHAR(ct.transaction_date, 'YYYY-MM'), 'YYYY-MM'),
        TO_DATE(TO_CHAR(c.registration_date, 'YYYY-MM'), 'YYYY-MM')
    ) = 0
    GROUP BY TO_CHAR(c.registration_date, 'YYYY-MM')
) period_0 ON cc.cohort_month = period_0.cohort_month
LEFT JOIN (
    SELECT 
        TO_CHAR(c.registration_date, 'YYYY-MM') as cohort_month,
        COUNT(DISTINCT c.customer_id) as active_customers
    FROM customers c
    JOIN customer_transactions ct ON c.customer_id = ct.customer_id
    WHERE MONTHS_BETWEEN(
        TO_DATE(TO_CHAR(ct.transaction_date, 'YYYY-MM'), 'YYYY-MM'),
        TO_DATE(TO_CHAR(c.registration_date, 'YYYY-MM'), 'YYYY-MM')
    ) = 1
    GROUP BY TO_CHAR(c.registration_date, 'YYYY-MM')
) period_1 ON cc.cohort_month = period_1.cohort_month
ORDER BY cc.cohort_month;

Query Result:

Cohort MonthCohort SizePeriod 0 ActivePeriod 1 ActivePeriod 0 RetentionPeriod 1 Retention
2025-01332100.0066.67
2025-02332100.0066.67
2025-0344NULL100.00NULL

Cohort Analysis Summary Function

This SQL function creates a reusable cohort retention calculation that accepts cohort month and period number parameters, providing a standardized way to calculate retention rates across different time periods and cohorts.

CREATE OR REPLACE FUNCTION calculate_cohort_retention(
p_cohort_month VARCHAR2,
p_period_number NUMBER
) RETURN NUMBER
IS
v_retention_rate NUMBER := 0;
v_cohort_size NUMBER;
v_active_customers NUMBER;
BEGIN
-- Get cohort size
SELECT COUNT(*)
INTO v_cohort_size
FROM customers
WHERE TO_CHAR(registration_date, 'YYYY-MM') = p_cohort_month;

-- Get active customers for the period
SELECT COUNT(DISTINCT c.customer_id)
INTO v_active_customers
FROM customers c
JOIN customer_transactions ct ON c.customer_id = ct.customer_id
WHERE TO_CHAR(c.registration_date, 'YYYY-MM') = p_cohort_month
AND MONTHS_BETWEEN(
TO_DATE(TO_CHAR(ct.transaction_date, 'YYYY-MM'), 'YYYY-MM'),
TO_DATE(p_cohort_month, 'YYYY-MM')
) = p_period_number;

-- Calculate retention rate
IF v_cohort_size > 0 THEN
v_retention_rate := (v_active_customers / v_cohort_size) * 100;
END IF;

RETURN ROUND(v_retention_rate, 2);
END;
/

The following SQL query demonstrates the usage of the cohort retention function to calculate retention rates across multiple periods for all available cohorts in a single query execution.

-- Usage example
SELECT
cohort_month,
calculate_cohort_retention(cohort_month, 0) as period_0_retention,
calculate_cohort_retention(cohort_month, 1) as period_1_retention,
calculate_cohort_retention(cohort_month, 2) as period_2_retention
FROM (
SELECT DISTINCT TO_CHAR(registration_date, 'YYYY-MM') as cohort_month
FROM customers
ORDER BY cohort_month
);

Query Result:

Cohort MonthPeriod 0 RetentionPeriod 1 RetentionPeriod 2 Retention
2025-01100.0066.6733.33
2025-02100.0066.670.00
2025-03100.000.000.00

Conclusion

Oracle SQL implementation of cohort analysis for customer retention provides powerful insights into customer behavior patterns, enabling data-driven decisions for improving customer lifetime value and reducing churn rates.

The comprehensive approach demonstrated covers basic cohort creation, retention rate calculations, revenue analysis, and advanced segmentation techniques that reveal actionable business intelligence.

Successful cohort analysis implementation requires proper data modeling, efficient query optimization, and systematic tracking of customer touchpoints across multiple dimensions including time, revenue, and engagement metrics.

Organizations implementing these Oracle SQL cohort analysis techniques gain competitive advantages through improved customer understanding, targeted retention strategies, and measurable improvements in customer lifetime value and business sustainability.

Vinish Kapoor
Vinish Kapoor

Vinish Kapoor is a seasoned software development professional and a fervent enthusiast of artificial intelligence (AI). His impressive career spans over 25+ years, marked by a relentless pursuit of innovation and excellence in the field of information technology. As an Oracle ACE, Vinish has distinguished himself as a leading expert in Oracle technologies, a title awarded to individuals who have demonstrated their deep commitment, leadership, and expertise in the Oracle community.

guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments