Most databases end up with “business meaning” hiding inside raw codes: status integers, plan tiers, region codes, risk flags, and one-off exceptions someone typed into a free-text column at 2 a.m. If you’ve ever shipped a report where stakeholders keep asking “what does 3 mean again?”, you’ve felt the pain.
In PostgreSQL, CASE is my go-to tool for turning those raw values into decisions and labels close to the data: in SELECT lists, ORDER BY, aggregates, and even inside PL/pgSQL functions. Done well, CASE keeps logic readable, reduces duplicated rules across apps, and makes your queries tell a clearer story. Done poorly, it becomes a hard-to-test thicket of branches that breaks indexes and surprises you with NULL behavior.
I’m going to show you how I use CASE in real work: the two flavors (simple vs searched), how SQL’s CASE differs from PL/pgSQL’s CASE statement, the patterns that scale in reporting and ETL, and the mistakes I still see in production code reviews.
CASE is an expression in SQL (and that matters)
In plain SQL, CASE is an expression: it returns a value. That single idea explains why it fits almost everywhere a value can appear.
There are two forms you’ll use:
- Simple CASE expression: compares one expression to candidate values with equality semantics.
- Searched CASE expression: evaluates boolean conditions top-to-bottom.
A mental model that stays useful: CASE is the SQL equivalent of a switchboard operator. You hand it either (a) a value to match, or (b) a list of questions to ask. It routes you to the first matching branch.
One important detail: in SQL, if no WHEN matches and you omit ELSE, the result is NULL (not an exception). That’s different from PL/pgSQL’s CASE statement (we’ll get there).
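A one-liner you can paste into psql to see it:

```sql
-- No WHEN matches and there is no ELSE: the result is NULL, not an error.
SELECT CASE 'X'
         WHEN 'A' THEN 'alpha'
         WHEN 'B' THEN 'beta'
       END AS result;
-- result is NULL
```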
How evaluation actually works (and why it saves you)
Two practical behaviors matter when you’re using CASE in real queries:
1) Top-to-bottom matching: PostgreSQL evaluates WHEN clauses in order and stops at the first match.
2) Only the chosen result expression is evaluated: This is huge for safety. You can write branches that would error if evaluated (division by zero, invalid casts) as long as those branches are never chosen.
That gives you patterns like “safe division”:
SELECT
revenue_cents,
visits,
CASE
WHEN visits = 0 OR visits IS NULL THEN NULL
ELSE revenue_cents::numeric / visits
END AS revenue_per_visit
FROM metrics;
If you tried to do this without CASE, you’d end up with a divide-by-zero risk or a lot of NULLIF gymnastics. (NULLIF is still great; I just want you to notice how CASE can encode the policy clearly: “if visits is zero, return NULL.”)
CASE vs COALESCE vs NULLIF (when I reach for which)
These tools overlap, but they’re not the same:
- COALESCE(a, b, c) is “pick first non-NULL.” It’s not conditional beyond NULL-ness.
- NULLIF(a, b) is “return NULL if equal, else return a.” It’s a tiny conditional.
- CASE is “full conditional routing.”
If you’re only dealing with NULL defaults, I prefer COALESCE for readability.
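A minimal sketch of that preference (the customers table and its nickname/full_name columns are hypothetical):

```sql
-- Pick the first non-NULL value; no branching logic needed.
SELECT COALESCE(nickname, full_name, 'Unknown') AS display_name
FROM customers;
```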
If you’re preventing specific errors like dividing by zero, NULLIF is compact:
SELECT revenue_cents::numeric / NULLIF(visits, 0) AS revenue_per_visit
FROM metrics;
But once the logic has multiple rules, overrides, or data-quality handling, CASE keeps the intent obvious.
Simple CASE: equality routing for discrete values
Simple CASE shines when you have a single thing you’re classifying and a set of exact matches.
Syntax:
CASE <expression>
WHEN <value1> THEN <result1>
WHEN <value2> THEN <result2>
ELSE <default_result>
END
Runnable example: label order states
Here’s a complete example you can paste into psql.
DROP TABLE IF EXISTS orders;
CREATE TABLE orders (
order_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
customer_id bigint NOT NULL,
state_code text NOT NULL,
total_cents integer NOT NULL,
created_at timestamptz NOT NULL DEFAULT now()
);
INSERT INTO orders (customer_id, state_code, total_cents, created_at) VALUES
(101, 'NEW', 2599, now() - interval '3 days'),
(102, 'PAID', 8999, now() - interval '2 days'),
(103, 'SHIP', 4599, now() - interval '1 day'),
(104, 'CANC', 1200, now() - interval '6 hours');
SELECT
order_id,
state_code,
CASE state_code
WHEN 'NEW' THEN 'Awaiting payment'
WHEN 'PAID' THEN 'Ready to ship'
WHEN 'SHIP' THEN 'On the way'
WHEN 'CANC' THEN 'Canceled'
ELSE 'Unknown'
END AS state_label,
total_cents / 100.0 AS total_usd
FROM orders
ORDER BY order_id;
NULL gotcha with simple CASE
Simple CASE uses equality-style matching. That means NULL never matches:
SELECT
NULL::text AS state_code,
CASE NULL::text
WHEN NULL THEN 'this will not happen'
ELSE 'falls through'
END AS result;
If you need to handle NULL, use searched CASE with IS NULL:
SELECT
NULL::text AS state_code,
CASE
WHEN NULL::text IS NULL THEN 'missing'
ELSE 'present'
END AS result;
This is really just SQL’s three-valued logic showing up: comparisons to NULL yield NULL (unknown), not true.
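You can verify this directly:

```sql
SELECT NULL = NULL AS equality;   -- NULL (unknown), not true
SELECT NULL IS NULL AS presence;  -- true
```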
Grouping aliases into buckets
SQL’s simple CASE expression doesn’t accept a comma-separated list of values per WHEN (that shorthand exists only in the PL/pgSQL CASE statement). When several codes share a label, I switch to a searched CASE with IN:
SELECT
state_code,
CASE
WHEN state_code IN ('CANC', 'REFD') THEN 'Closed'
WHEN state_code IN ('SHIP', 'PAID') THEN 'Active'
ELSE 'Other'
END AS lifecycle_bucket
FROM (VALUES ('CANC'), ('REFD'), ('SHIP'), ('PAID'), ('NEW')) AS v(state_code);
When I see people repeat the same THEN six times, I refactor to this shape.
When simple CASE is a smell
Simple CASE is great for stable, discrete mappings. But it becomes a smell when:
- You start adding many branches that really belong to a lookup table.
- You need ranges, LIKE, or compound conditions (that’s searched CASE).
- You’re matching floats/numerics that come from computation (exact equality gets fragile).
If your mapping is long and business-owned, it’s usually time to promote the rule from code to data.
Searched CASE: ranges, predicates, and priority
Searched CASE is the one I reach for most often because real systems rarely cleanly map one input value to one output. With searched CASE, each WHEN is a boolean condition evaluated in order.
Syntax:
CASE
WHEN <condition1> THEN <result1>
WHEN <condition2> THEN <result2>
ELSE <default_result>
END
Runnable example: price bands and edge cases
This example classifies orders into tiers. Note the careful treatment of NULL and negative values.
SELECT
order_id,
total_cents,
CASE
WHEN total_cents IS NULL THEN 'missing total'
WHEN total_cents < 0 THEN 'data error'
WHEN total_cents < 2500 THEN 'budget'
WHEN total_cents < 7500 THEN 'standard'
WHEN total_cents < 15000 THEN 'premium'
ELSE 'enterprise'
END AS spend_tier
FROM orders
ORDER BY order_id;
Two rules I follow:
- Put “data quality” branches first (IS NULL, out-of-range, impossible states). They should win before anything else.
- Order your thresholds so they’re unambiguous. If two conditions can be true, the first match wins.
Priority is explicit (and that’s a feature)
Because searched CASE is evaluated top-to-bottom, you can express “override rules” without changing your base logic.
SELECT
order_id,
state_code,
total_cents,
CASE
WHEN state_code = 'CANC' THEN 'do not fulfill'
WHEN total_cents >= 10000 THEN 'manual review'
WHEN state_code IN ('PAID', 'SHIP') THEN 'fulfill'
ELSE 'hold'
END AS fulfillment_action
FROM orders
ORDER BY order_id;
I treat this as a readable policy: exceptions first, general rules later.
Guard clauses: CASE as a data-quality firewall
In production reporting, I like to make “bad data” visible instead of silently categorizing it. A simple pattern is to reserve a bucket like 'data_error' or 'unknown' and push those rows into a QA dashboard.
For example, imagine orders.state_code is supposed to be one of a small set, but you’ve seen surprise values.
SELECT
state_code,
COUNT(*) AS n
FROM orders
GROUP BY state_code
ORDER BY n DESC;
Once you see the surprise values, you can decide whether to fold them into existing buckets or explicitly label them. I’d rather label them explicitly first (so they’re measurable) and then decide.
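One way to make that explicit labeling measurable, using the orders table from earlier:

```sql
-- Surface surprise codes as their own labeled rows instead of folding them in.
SELECT
  CASE
    WHEN state_code IN ('NEW', 'PAID', 'SHIP', 'CANC') THEN state_code
    ELSE 'UNEXPECTED: ' || state_code
  END AS labeled_code,
  COUNT(*) AS n
FROM orders
GROUP BY 1
ORDER BY n DESC;
```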
Avoid overlapping conditions (or make overlap intentional)
Overlaps aren’t inherently wrong; they’re sometimes exactly what you want.
- Wrong overlap: two branches that accidentally match the same records.
- Intentional overlap: an “override” branch that should win.
If overlap is accidental, the easiest fix is to reorder or tighten conditions. If overlap is intentional, I’ll often add a “commentary label” by splitting the logic into two columns in a CTE: one for “base tier” and another for “override reason.” That makes it easier to explain.
CASE in reporting: aggregates, pivoting, and selective counting
Once you’re comfortable with CASE returning values, you start seeing it as a building block for reports.
Conditional aggregates (my most-used pattern)
Want counts per category without multiple passes? Put CASE inside COUNT or SUM.
SELECT
COUNT(*) AS total_orders,
COUNT(*) FILTER (WHERE state_code = 'CANC') AS canceled_orders,
SUM(CASE WHEN state_code = 'PAID' THEN 1 ELSE 0 END) AS paid_orders,
SUM(CASE WHEN total_cents >= 10000 THEN 1 ELSE 0 END) AS high_value_orders,
SUM(total_cents) / 100.0 AS gross_usd
FROM orders;
A modern PostgreSQL note: FILTER often reads cleaner than SUM(CASE ...), but both are valid. I still use SUM(CASE ...) when I want the same expression to feed multiple aggregates or when portability to older engines matters.
Conditional sums: money, weights, durations
Counts are the gateway drug. The bigger win is conditional sums:
SELECT
SUM(total_cents) / 100.0 AS gross_usd,
SUM(CASE WHEN state_code = 'CANC' THEN total_cents ELSE 0 END) / 100.0 AS canceled_usd,
SUM(CASE WHEN state_code <> 'CANC' THEN total_cents ELSE 0 END) / 100.0 AS net_usd
FROM orders;
Two tips:
- Use ELSE 0 when the aggregate expects numeric input.
- Prefer consistent units (cents as integer is often easier than floating money).
Pivot-like outputs with CASE
You can create “columns per bucket” without a separate pivot feature.
SELECT
date_trunc('day', created_at) AS day,
SUM(CASE WHEN state_code = 'NEW' THEN 1 ELSE 0 END) AS new_orders,
SUM(CASE WHEN state_code = 'PAID' THEN 1 ELSE 0 END) AS paid_orders,
SUM(CASE WHEN state_code = 'SHIP' THEN 1 ELSE 0 END) AS shipped_orders,
SUM(CASE WHEN state_code = 'CANC' THEN 1 ELSE 0 END) AS canceled_orders
FROM orders
GROUP BY 1
ORDER BY 1;
This is a pattern you’ll see in dbt models and BI-friendly views because it produces stable schemas.
CASE inside AVG, MIN, MAX
When you want a metric “only for rows that qualify,” you can feed NULL for non-qualifying rows:
SELECT
AVG(CASE WHEN state_code <> 'CANC' THEN total_cents END) / 100.0 AS avg_non_canceled_usd,
MAX(CASE WHEN state_code = 'PAID' THEN created_at END) AS last_paid_at
FROM orders;
Why this works: aggregates like AVG, MAX, MIN ignore NULL inputs.
Mapping to human labels: join table vs CASE
When the mapping is small, stable, and truly internal, CASE is fine. When the mapping is business-owned, changes often, or needs translation, a reference table is better.
Here’s the decision rule I use:
- Use CASE when the mapping is “code-level” and rarely changes.
- Use a table when non-engineers need to change it, when you need audit history, or when multiple systems share the mapping.
If you move from CASE to a table later, you’re not failing; you’re acknowledging the rule became data.
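Here’s what that promotion can look like against the orders table from earlier (the state_labels table name is my choice):

```sql
-- The mapping becomes data that non-engineers can edit.
CREATE TABLE state_labels (
  state_code  text PRIMARY KEY,
  state_label text NOT NULL
);

INSERT INTO state_labels VALUES
  ('NEW',  'Awaiting payment'),
  ('PAID', 'Ready to ship'),
  ('SHIP', 'On the way'),
  ('CANC', 'Canceled');

-- The CASE ladder becomes a join; unknown codes still get a label.
SELECT o.order_id,
       COALESCE(l.state_label, 'Unknown') AS state_label
FROM orders o
LEFT JOIN state_labels l USING (state_code);
```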
CASE for ordering and ranking (without contortions)
People often think of CASE only as a labeling tool. I use it just as much to control sort order and ranking.
Custom sort order in ORDER BY
If you sort state_code alphabetically, you get nonsense ordering. CASE gives you explicit ordering.
SELECT order_id, state_code, created_at
FROM orders
ORDER BY
CASE state_code
WHEN 'NEW' THEN 10
WHEN 'PAID' THEN 20
WHEN 'SHIP' THEN 30
WHEN 'CANC' THEN 90
ELSE 99
END,
created_at DESC;
I prefer numeric ranks here because they’re easy to extend (leave gaps: 10, 20, 30).
DISTINCT ON with a policy-driven “best row”
PostgreSQL’s DISTINCT ON picks the first row per group according to ORDER BY. That makes it perfect with CASE.
Example: pick the “best” order per customer (prefer non-canceled, prefer newest).
SELECT DISTINCT ON (customer_id)
customer_id,
order_id,
state_code,
created_at
FROM orders
ORDER BY
customer_id,
CASE WHEN state_code = 'CANC' THEN 2 ELSE 1 END,
created_at DESC;
That CASE turns an implicit business rule into an explicit, reviewable line.
Ranking with window functions + CASE
When you need “top N per group” with business priority, I’ll combine a CASE rank key with row_number().
Example: pick one “best” order per customer, but make “paid and shipped” outrank “new,” and push cancels to the bottom.
WITH ranked AS (
SELECT
o.*,
row_number() OVER (
PARTITION BY customer_id
ORDER BY
CASE
WHEN state_code IN ('PAID', 'SHIP') THEN 1
WHEN state_code = 'NEW' THEN 2
WHEN state_code = 'CANC' THEN 9
ELSE 5
END,
created_at DESC
) AS rn
FROM orders o
)
SELECT customer_id, order_id, state_code, created_at
FROM ranked
WHERE rn = 1
ORDER BY customer_id;
I like this because the ordering policy is visible and easy to change.
SQL CASE expression vs PL/pgSQL CASE statement
This distinction trips up even experienced devs because it’s the same keyword doing two related jobs.
- In SQL queries, CASE is an expression returning a value.
- In PL/pgSQL, there is also a CASE statement that controls procedural flow.
The biggest behavior difference: in PL/pgSQL, if you write a CASE statement without ELSE and nothing matches, PostgreSQL raises a CASE_NOT_FOUND exception.
Runnable PL/pgSQL example: pricing segment function
I’ll build a fresh example (no external sample tables needed) by creating a small products table.
DROP TABLE IF EXISTS products;
CREATE TABLE products (
product_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
sku text NOT NULL UNIQUE,
price_usd numeric(10,2) NOT NULL
);
INSERT INTO products (sku, price_usd) VALUES
('coffee-beans-1lb', 14.99),
('kettle-gooseneck', 79.00),
('espresso-machine', 599.00),
('paper-filters-100', 7.50);
CREATE OR REPLACE FUNCTION get_price_segment(p_product_id bigint)
RETURNS text
LANGUAGE plpgsql
AS $$
DECLARE
v_price numeric(10,2);
v_segment text;
BEGIN
SELECT price_usd
INTO v_price
FROM products
WHERE product_id = p_product_id;
IF v_price IS NULL THEN
RETURN 'missing product';
END IF;
-- PL/pgSQL CASE statement (procedural), not the SQL CASE expression.
CASE v_price
WHEN 7.50 THEN v_segment := 'accessory';
WHEN 14.99 THEN v_segment := 'core';
WHEN 79.00 THEN v_segment := 'premium';
WHEN 599.00 THEN v_segment := 'flagship';
ELSE v_segment := 'other';
END CASE;
RETURN v_segment;
END;
$$;
SELECT product_id, sku, price_usd, get_price_segment(product_id) AS segment
FROM products
ORDER BY product_id;
This uses a simple CASE statement because I’m matching exact prices (not something I’d usually do in a real store, but it’s a clean demo).
Runnable PL/pgSQL example: service level with searched CASE
Now for a more realistic “ranges and thresholds” function.
DROP TABLE IF EXISTS customer_payments;
CREATE TABLE customer_payments (
payment_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
customer_id bigint NOT NULL,
amount_usd numeric(10,2) NOT NULL,
paid_at timestamptz NOT NULL DEFAULT now()
);
INSERT INTO customer_payments (customer_id, amount_usd, paid_at) VALUES
(201, 19.99, now() - interval '60 days'),
(201, 49.00, now() - interval '30 days'),
(202, 5.00, now() - interval '10 days'),
(203, 120.00, now() - interval '20 days'),
(203, 120.00, now() - interval '5 days');
CREATE OR REPLACE FUNCTION get_customer_service_level(p_customer_id bigint)
RETURNS text
LANGUAGE plpgsql
AS $$
DECLARE
v_total numeric(10,2);
BEGIN
SELECT COALESCE(SUM(amount_usd), 0)
INTO v_total
FROM customer_payments
WHERE customer_id = p_customer_id;
CASE
WHEN v_total >= 200 THEN
RETURN 'platinum';
WHEN v_total >= 100 THEN
RETURN 'gold';
WHEN v_total > 0 THEN
RETURN 'silver';
ELSE
RETURN 'free';
END CASE;
END;
$$;
SELECT
customer_id,
get_customer_service_level(customer_id) AS service_level
FROM (VALUES (201), (202), (203), (204)) AS v(customer_id)
ORDER BY customer_id;
Notice two things I intentionally did:
- COALESCE(SUM(...), 0) so customers with no payments get a deterministic tier.
- Thresholds go from highest to lowest so only one branch can match.
The CASE_NOT_FOUND trap in PL/pgSQL
If you omit ELSE in a PL/pgSQL CASE statement and none of the WHEN branches match, you’ll get an exception.
This snippet is meant to fail so you can see it:
DO $$
DECLARE
v text;
BEGIN
CASE 'Z'
WHEN 'A' THEN v := 'alpha';
WHEN 'B' THEN v := 'beta';
END CASE;
RAISE NOTICE 'value: %', v;
END;
$$;
In SQL queries, an equivalent CASE expression without ELSE would return NULL instead. When I’m writing PL/pgSQL, I almost always include ELSE unless I explicitly want the function to error.
A practical bridge: SQL CASE inside PL/pgSQL
One more nuance: inside PL/pgSQL you can still use SQL CASE expressions in SELECT statements, and I do that constantly. The key is whether you’re writing:
- a procedural CASE ... END CASE; block, or
- a SQL expression CASE ... END inside a query.
When a rule is “data in → value out,” I usually prefer the SQL expression inside a SELECT, because it stays set-based and composable.
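A small sketch of that bridge (the function name is hypothetical):

```sql
-- SQL CASE expression embedded in a query inside a PL/pgSQL function:
-- the classification stays set-based; PL/pgSQL just returns the rows.
CREATE OR REPLACE FUNCTION count_by_lifecycle()
RETURNS TABLE (lifecycle text, n bigint)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    CASE WHEN o.state_code = 'CANC' THEN 'closed' ELSE 'open' END,
    COUNT(*)
  FROM orders o
  GROUP BY 1;
END;
$$;
```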
Performance and correctness habits I use in 2026
CASE is often “free enough” in practice, but it can still affect plans and correctness. These are the checks I run when a query becomes important.
1) Keep indexes usable: avoid wrapping indexed columns in CASE inside WHERE
A common anti-pattern is using CASE to build a predicate value and then comparing against it. It can block index use.
Risky pattern:
SELECT *
FROM orders
WHERE CASE WHEN state_code = 'CANC' THEN 'closed' ELSE 'open' END = 'open';
Clearer and often easier for the planner:
SELECT *
FROM orders
WHERE state_code <> 'CANC';
If you need complex logic, prefer plain boolean expressions (AND, OR, IN, IS NULL) in WHERE, and reserve CASE for computed output columns.
2) Make CASE type resolution explicit when mixing types
All THEN/ELSE branches must resolve to a compatible type.
This fails or produces surprising casts:
SELECT CASE WHEN true THEN 1 ELSE '2' END;
Make it explicit:
SELECT CASE WHEN true THEN 1 ELSE 2 END AS n;
SELECT CASE WHEN true THEN '1'::text ELSE '2'::text END AS s;
In code review, I flag “mixed-type CASE” because it becomes a bug farm when someone edits a branch later.
3) Use EXPLAIN (ANALYZE, BUFFERS) on the final query shape
When a CASE-heavy query starts taking, say, hundreds of milliseconds under load, I don’t guess. I run:
- EXPLAIN (ANALYZE, BUFFERS) locally on representative data
- pg_stat_statements in staging/production to confirm frequency and total time
In modern workflows, I’ll also ask an AI assistant to propose rewrites, but I never accept them without looking at the plan. The planner is the truth serum.
4) Prefer testable building blocks: views, generated columns, or lookup tables
If a CASE expression becomes a core business rule, I try to give it a home:
- A view for reporting rules
- A generated column if you need it indexed and derived from row state
- A reference table if the mapping is data-owned
- A function when the logic needs reuse across multiple queries
This is less about syntax and more about making changes safe.
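As a sketch of the generated-column option (assumes PostgreSQL 12+; the column and index names are my choice):

```sql
-- Derive an indexable boolean from row state once, at write time.
ALTER TABLE orders
  ADD COLUMN is_open boolean
  GENERATED ALWAYS AS (state_code <> 'CANC') STORED;

-- Index only the rows reports actually query.
CREATE INDEX orders_open_created_idx ON orders (created_at) WHERE is_open;
```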
5) Consider partial indexes or expression indexes instead of CASE in filters
If you’re tempted to write WHERE CASE ... END = 'something', that’s often a sign you want one of these:
- A plain boolean predicate
- A partial index for a specific subset
- An expression index for a derived key
Example: if the business constantly queries “open orders,” define “open” as state_code <> 'CANC' and use a partial index:
CREATE INDEX ON orders (created_at)
WHERE state_code <> 'CANC';
Then your query stays readable and index-friendly:
SELECT *
FROM orders
WHERE state_code <> 'CANC'
ORDER BY created_at DESC
LIMIT 100;
I’m not saying “always index everything.” I’m saying: don’t make CASE carry an indexing problem.
Using CASE in writes: UPDATE, INSERT, and migrations
Most people learn CASE in SELECT. In real systems, it’s equally valuable during data cleanup and backfills.
UPDATE ... SET column = CASE ... for bulk corrections
Suppose you inherited messy state_code values and you want to normalize them. You can do it in one statement.
UPDATE orders
SET state_code = CASE state_code
WHEN 'CANCELLED' THEN 'CANC'
WHEN 'CANCELED' THEN 'CANC'
WHEN 'SHIPPED' THEN 'SHIP'
WHEN 'PAYED' THEN 'PAID'
ELSE state_code
END;
A few habits I like here:
- Include ELSE state_code so you don’t accidentally set unknowns to NULL.
- Run a SELECT first to preview how many rows will change.
- If it’s risky, wrap the change in a transaction and verify counts.
Backfilling derived columns with CASE
Sometimes you add a new column like lifecycle_bucket and backfill it based on existing state.
ALTER TABLE orders ADD COLUMN lifecycle_bucket text;
UPDATE orders
SET lifecycle_bucket = CASE
WHEN state_code IN ('CANC') THEN 'closed'
WHEN state_code IN ('SHIP', 'PAID') THEN 'active'
WHEN state_code = 'NEW' THEN 'pending'
ELSE 'other'
END;
After backfilling, you can add a NOT NULL constraint (if appropriate) and even an index if you query by this bucket.
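For example (only once the backfill is verified complete, and only if you actually filter on the bucket):

```sql
-- Lock in the invariant now that every row has a value.
ALTER TABLE orders ALTER COLUMN lifecycle_bucket SET NOT NULL;

-- Optional: index the bucket if dashboards filter on it.
CREATE INDEX orders_lifecycle_bucket_idx ON orders (lifecycle_bucket);
```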
INSERT ... SELECT for ETL classification
In ETL pipelines, it’s common to load raw events and then insert into a modeled table with derived fields.
INSERT INTO modeled_orders (order_id, customer_id, state_code, spend_tier)
SELECT
order_id,
customer_id,
state_code,
CASE
WHEN total_cents < 0 THEN 'data_error'
WHEN total_cents < 2500 THEN 'budget'
WHEN total_cents < 7500 THEN 'standard'
WHEN total_cents < 15000 THEN 'premium'
ELSE 'enterprise'
END AS spend_tier
FROM raw_orders;
This keeps your modeled table stable even if the raw table is messy.
CASE inside constraints (use carefully)
You can use CASE in a CHECK constraint, but I’m cautious: constraints should be obvious and hard to misunderstand.
Sometimes it’s perfectly fine, like enforcing a relationship:
ALTER TABLE orders
ADD CONSTRAINT orders_state_total_check
CHECK (
CASE
WHEN state_code = 'CANC' THEN total_cents >= 0
ELSE total_cents > 0
END
);
That said, if a rule is too complex, I’ll consider:
- normalizing the data model,
- using a trigger for complex validation (with clear error messages), or
- validating upstream in the application.
CASE with joins: choosing between a lookup table and inline logic
This is where maintainability decisions show up.
Pattern: small mapping inline, large mapping as data
If I can write the mapping in under ~10 lines and it’s truly stable, I’ll keep it in CASE. Otherwise I create a reference table.
When I do use a reference table, I’ll still sometimes keep a tiny CASE for “special overrides.” Example: a general mapping table plus one override for a legacy client.
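A sketch of that hybrid, assuming a hypothetical state_labels(state_code, state_label) mapping table and a made-up legacy customer id:

```sql
SELECT
  o.order_id,
  CASE
    -- One visible, reviewable override for a legacy client.
    WHEN o.customer_id = 42 THEN 'Legacy VIP'
    ELSE COALESCE(l.state_label, 'Unknown')
  END AS state_label
FROM orders o
LEFT JOIN state_labels l USING (state_code);
```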
Avoid duplicating the same CASE in multiple places
If you paste the same CASE block into 5 queries, you’ve created 5 future bugs. My go-to refactors:
- Put the CASE into a view.
- Put the CASE into a SQL function (marked IMMUTABLE or STABLE if appropriate).
- Use a generated column if it’s row-derived and you need indexing.
A simple view approach:
CREATE OR REPLACE VIEW orders_labeled AS
SELECT
o.*,
CASE o.state_code
WHEN 'NEW' THEN 'Awaiting payment'
WHEN 'PAID' THEN 'Ready to ship'
WHEN 'SHIP' THEN 'On the way'
WHEN 'CANC' THEN 'Canceled'
ELSE 'Unknown'
END AS state_label
FROM orders o;
Now every report can SELECT ... FROM orders_labeled and you’ve got one place to edit.
Traditional vs modern patterns (and what I actually use now)
There’s a “traditional SQL” way to do almost everything with CASE, and there are newer PostgreSQL features that sometimes read cleaner.
SUM(CASE...) vs FILTER
Traditional:
SELECT SUM(CASE WHEN state_code = 'PAID' THEN 1 ELSE 0 END) AS paid_orders
FROM orders;
Modern PostgreSQL alternative:
SELECT COUNT(*) FILTER (WHERE state_code = 'PAID') AS paid_orders
FROM orders;
My rule: if it’s a straight conditional aggregate, FILTER is usually clearer. If the conditional logic is reused in multiple aggregates, a single CASE expression (or a CTE that computes the classification once) can be better.
CASE vs boolean expressions (don’t over-CASE)
If you want a boolean, return a boolean. Don’t return text labels unless you truly need them.
Instead of:
SELECT CASE WHEN total_cents >= 10000 THEN 'true' ELSE 'false' END AS is_high_value
FROM orders;
Prefer:
SELECT (total_cents >= 10000) AS is_high_value
FROM orders;
This matters because booleans compose well: you can WHERE is_high_value without extra comparisons.
CASE vs enum types
If state_code is a closed set, consider a PostgreSQL ENUM or a foreign key to a reference table.
- ENUM gives strong constraints and readability, but changing the set of values is a schema change.
- A reference table gives flexibility and metadata (labels, translations), but requires a join.
Even if you keep a text code, CASE still has a place for derived buckets and policies.
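For reference, the ENUM route looks like this (note that extending the set is DDL, not DML):

```sql
-- A closed set of states enforced by the type system.
CREATE TYPE order_state AS ENUM ('NEW', 'PAID', 'SHIP', 'CANC');

-- Adding a value later is a schema change, not a data change.
ALTER TYPE order_state ADD VALUE 'REFD';
```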
Debugging and testing CASE logic (so it doesn’t rot)
The biggest risk with CASE isn’t syntax—it’s silent drift.
1) Make unknowns visible
If “unknown” should never happen, don’t hide it. Label it and count it.
SELECT
CASE
WHEN state_code IN ('NEW', 'PAID', 'SHIP', 'CANC') THEN 'known'
ELSE 'unknown'
END AS code_quality,
COUNT(*)
FROM orders
GROUP BY 1;
If unknown is non-zero, you have something to investigate.
2) Build a “test harness” with VALUES
When I’m reviewing a tricky CASE, I often create a tiny table of inputs and expected outputs using VALUES.
WITH cases(input_total_cents) AS (
VALUES (NULL::integer), (-5), (0), (2499), (2500), (7499), (7500), (15000)
)
SELECT
input_total_cents,
CASE
WHEN input_total_cents IS NULL THEN 'missing total'
WHEN input_total_cents < 0 THEN 'data error'
WHEN input_total_cents < 2500 THEN 'budget'
WHEN input_total_cents < 7500 THEN 'standard'
WHEN input_total_cents < 15000 THEN 'premium'
ELSE 'enterprise'
END AS spend_tier
FROM cases
ORDER BY input_total_cents NULLS FIRST;
This is fast, deterministic, and perfect for catching off-by-one mistakes at boundaries.
3) Avoid repeated computation by factoring with a CTE
If a classification is used in multiple places, compute it once.
WITH classified AS (
SELECT
o.*,
CASE
WHEN total_cents IS NULL THEN 'missing total'
WHEN total_cents < 0 THEN 'data error'
WHEN total_cents < 2500 THEN 'budget'
WHEN total_cents < 7500 THEN 'standard'
WHEN total_cents < 15000 THEN 'premium'
ELSE 'enterprise'
END AS spend_tier
FROM orders o
)
SELECT
spend_tier,
COUNT(*) AS n,
SUM(total_cents) / 100.0 AS gross_usd
FROM classified
GROUP BY spend_tier
ORDER BY n DESC;
This helps performance (sometimes) but mostly helps humans: one definition, many uses.
Advanced patterns: CASE as a policy language
Once you’re comfortable, you can write policies that read like decision tables.
Pattern: explainable decisions (action + reason)
I like returning both an action and a reason so you can debug outcomes.
SELECT
order_id,
state_code,
total_cents,
CASE
WHEN state_code = 'CANC' THEN 'block'
WHEN total_cents >= 10000 THEN 'review'
WHEN state_code IN ('PAID', 'SHIP') THEN 'approve'
ELSE 'hold'
END AS decision,
CASE
WHEN state_code = 'CANC' THEN 'canceled'
WHEN total_cents >= 10000 THEN 'high_value'
WHEN state_code IN ('PAID', 'SHIP') THEN 'paid_or_shipped'
ELSE 'not_ready'
END AS decision_reason
FROM orders;
This looks redundant, but it’s excellent for operational clarity.
Pattern: safe casting with CASE
If you have messy text input and only some rows can be cast safely, you can guard the cast.
SELECT
raw_value,
CASE
WHEN raw_value ~ '^[0-9]+$' THEN raw_value::int
ELSE NULL
END AS parsed_int
FROM raw_inputs;
That avoids runtime cast errors and makes the parsing rule explicit.
Pattern: multi-column classification
Sometimes classification depends on more than one input. Searched CASE handles it naturally.
SELECT
order_id,
CASE
WHEN state_code = 'CANC' THEN 'closed'
WHEN state_code = 'SHIP' AND total_cents >= 10000 THEN 'vip_active'
WHEN state_code IN ('PAID', 'SHIP') THEN 'active'
WHEN state_code = 'NEW' THEN 'pending'
ELSE 'other'
END AS customer_visibility
FROM orders;
The trick is ordering: put the specific multi-column rules before the general ones.
Common pitfalls (the ones I still see in code review)
Pitfall 1: forgetting ELSE in SQL when NULL is not acceptable
In a SELECT, no ELSE means NULL. If downstream tooling expects a label, you can end up with confusing blank fields.
If NULL is meaningful, fine. If not, include an explicit ELSE.
Pitfall 2: mixing types across branches
I mentioned this earlier, but it’s worth repeating. If one branch returns int and another returns text, you’re creating a fragile cast situation.
Pitfall 3: putting complex classification logic inside WHERE
This often harms index usage and readability. Keep WHERE boolean. Put CASE in the SELECT list, then filter on the original fields—or compute the classification in a CTE and filter on the derived column.
Pitfall 4: “magic numbers” without a comment or a name
If you use threshold constants like 7500, consider making them self-documenting:
- put them in a view with a descriptive name,
- put them in a function parameter,
- or at least use a CTE to name them.
Example naming via CTE:
WITH params AS (SELECT 2500 AS budget_max, 7500 AS standard_max, 15000 AS premium_max)
SELECT
o.order_id,
CASE
WHEN o.total_cents < 0 THEN 'data error'
WHEN o.total_cents < (SELECT budget_max FROM params) THEN 'budget'
WHEN o.total_cents < (SELECT standard_max FROM params) THEN 'standard'
WHEN o.total_cents < (SELECT premium_max FROM params) THEN 'premium'
ELSE 'enterprise'
END AS spend_tier
FROM orders o;
Is this always necessary? No. But for “policy thresholds,” it can be a lifesaver.
When not to use CASE
I love CASE, but I don’t use it everywhere.
Don’t use CASE as a substitute for proper modeling
If you have a long list of codes and labels that change, a reference table is usually the right tool.
Don’t use CASE to hide bad data
If you have values that are impossible, label them as errors (or fix them at ingest). Quietly mapping them to a normal category makes reporting lie.
Don’t use CASE when a boolean expression is enough
If the output is boolean, return boolean.
Don’t use CASE when constraints should enforce the rule
If a value should never exist, a constraint (or enum/table FK) may be a better defense than a CASE branch.
Quick cheat sheet
- Use simple CASE for exact value mappings.
- Use searched CASE for ranges, predicates, and override priority.
- In SQL, missing ELSE returns NULL.
- In PL/pgSQL CASE statements, missing ELSE raises CASE_NOT_FOUND.
- Keep WHERE boolean; don’t hide predicates inside CASE unless you’ve measured and you know what you’re doing.
- For reporting, combine CASE with aggregates and consider FILTER for clean conditional counts.
- If the mapping grows or becomes business-owned, graduate it to a table or a view.


