PostgreSQL LIKE Operator: Practical Patterns, Escaping, and Performance

When I’m debugging a production issue, one of the fastest ways to get context is to search text: customer names, email domains, invoice prefixes, feature flags, log tags, shipment statuses. You rarely know the full value you’re hunting for. You know fragments: it started with 'K', it contains 'er', it ends with '.io', it follows a code shape like 'INV-2026-'. That gap between partial knowledge and exact matching is where PostgreSQL’s LIKE operator earns its keep.

LIKE looks simple, but real systems quickly surface questions you can’t ignore: Why is the search case-sensitive? Why does a leading % get slow on big tables? How do you search for a literal underscore? How do you safely build patterns from user input without turning your query into a correctness or performance trap?

I’ll walk through LIKE from first principles, then move into patterns I actually use: prefix searches that stay fast, safe wildcard handling, and when I reach for ILIKE, regular expressions, or trigram indexes instead. If you write SQL that powers search boxes, admin panels, data cleanup scripts, or back-office tools, this is one of those operators you’ll use weekly.

How LIKE really evaluates

At runtime, LIKE compares a string value against a pattern:

value LIKE pattern

You almost always see it in a WHERE clause:

SELECT *
FROM customers
WHERE email LIKE '%@example.com';

A few mechanics matter more than the syntax:

  • LIKE returns TRUE if the value matches the pattern, FALSE if it doesn’t.
  • If either side is NULL, the result is NULL (which behaves like not-true in WHERE).
  • It operates on text-like types (text, varchar, char). Non-text values are usually cast to text explicitly if you want predictable behavior.

I like to keep NULL behavior visible in my mental model. If email is nullable, this query:

SELECT count(*)
FROM customers
WHERE email LIKE '%@example.com';

will not count customers with NULL email. If you want them treated as empty strings (often you don’t), you’d need:

SELECT count(*)
FROM customers
WHERE coalesce(email, '') LIKE '%@example.com';

That coalesce changes semantics and can affect index usage, so I use it only when I genuinely want NULL treated as an empty value.

To make examples runnable, here’s a small dataset you can paste into psql:

DROP TABLE IF EXISTS customers;

CREATE TABLE customers (
  customer_id bigserial PRIMARY KEY,
  first_name text NOT NULL,
  last_name text NOT NULL,
  email text,
  city text,
  signup_code text NOT NULL
);

INSERT INTO customers (first_name, last_name, email, city, signup_code) VALUES
('Kara', 'Nguyen', '[email protected]', 'Austin', 'INV-2026-0001'),
('Kiran', 'Patel', '[email protected]', 'Seattle', 'INV-2026-0002'),
('Amber', 'Dixon', '[email protected]', 'Denver', 'INV-2025-0912'),
('Albert', 'Crouse', '[email protected]', 'Boston', 'INV-2026-0104'),
('Alberto', 'Henning', '[email protected]', 'Boston', 'INV-2026-0105'),
('Xheron', 'Miles', NULL, 'Chicago', 'INV-2026-1111'),
('Sheri', 'Stone', '[email protected]', 'Chicago', 'INV-2026-2222');

Wildcards: % and _ as your pattern language

LIKE becomes powerful the moment you internalize its two wildcard characters:

  • % matches any sequence of characters (including empty)
  • _ matches exactly one character

That’s it. Everything else in the pattern is literal (with an important caveat about escaping, which we’ll handle soon).
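One way to internalize these semantics is to translate a LIKE pattern into the equivalent anchored regular expression. A minimal JavaScript sketch (ignoring ESCAPE handling, which comes later; the function name is mine):

```javascript
// Translate a (non-escaped) LIKE pattern into an anchored RegExp:
// % -> any sequence including empty, _ -> exactly one character.
// Every other character is a literal.
function likeToRegExp(pattern) {
  // Escape regex metacharacters so they stay literal, as LIKE treats them.
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const body = escaped.replace(/%/g, '[\\s\\S]*').replace(/_/g, '[\\s\\S]');
  return new RegExp(`^${body}$`);
}

console.log(likeToRegExp('K%').test('Kara'));   // true
console.log(likeToRegExp('K%').test('kara'));   // false: LIKE is case-sensitive
console.log(likeToRegExp('%er%').test('Amber')); // true
```

The anchoring (^ and $) is the part people forget: LIKE always matches the whole string, not a substring.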

Prefix matches (starts with)

If you want names that start with 'K':

SELECT first_name, last_name
FROM customers
WHERE first_name LIKE 'K%'
ORDER BY first_name;

'K%' means: a literal 'K', followed by anything.

This is the pattern that most often stays fast on large tables (with the right index strategy), because it’s a left-anchored search.

Suffix matches (ends with)

Emails ending in a domain:

SELECT first_name, last_name, email
FROM customers
WHERE email LIKE '%@example.com'
ORDER BY email;

The leading % is the important part: it means the match can begin anywhere in the string, so PostgreSQL cannot use a normal B-tree index to jump to the correct range. (I’ll show better approaches later.)

Contains matches (substring)

Names containing 'er' anywhere:

SELECT first_name, last_name
FROM customers
WHERE first_name LIKE '%er%'
ORDER BY first_name;

This is great for ad-hoc investigation and admin tooling. For user-facing search across big datasets, you usually need trigram or full-text search instead.

Fixed-shape matches with _

If you want first names where:

  • any single character
  • followed by 'her'
  • followed by anything

you can write:

SELECT first_name, last_name
FROM customers
WHERE first_name LIKE '_her%'
ORDER BY first_name;

_ is the simplest way to express shape constraints. I use it for codes too, like invoice numbers or signup tokens.

Example: invoice codes like 'INV-2026-' plus four digits:

SELECT customer_id, signup_code
FROM customers
WHERE signup_code LIKE 'INV-2026-____'
ORDER BY signup_code;

The four underscores mean exactly four characters. If you want strictly digits, LIKE alone can’t enforce that; you’d move to a regex (~) for that constraint.

Case sensitivity and the ILIKE fork in the road

By default, LIKE is case-sensitive. That surprises people because many UI searches feel case-insensitive.

If you need case-insensitive matching in PostgreSQL, the most direct tool is ILIKE:

SELECT first_name, last_name
FROM customers
WHERE first_name ILIKE 'k%'
ORDER BY first_name;

In my experience, the decision tree looks like this:

  • If you’re building a quick admin filter, ILIKE is fine.
  • If you need case-insensitive search at scale, you need to think about indexing and data normalization.

Common workaround: lower(column) LIKE lower(pattern)

You’ll see this pattern:

SELECT first_name, last_name
FROM customers
WHERE lower(first_name) LIKE lower('k%');

It works, but there’s a catch: wrapping the column in a function changes whether an index can be used. If you go down this road, you typically add a functional index:

CREATE INDEX customers_first_name_lower_idx
ON customers (lower(first_name));

Then you’d write:

SELECT first_name, last_name
FROM customers
WHERE lower(first_name) LIKE 'k%';

That keeps the function consistent between query and index.

The 2026-friendly approach: choose predictable semantics

If you’re designing a product today, I recommend being explicit about what search means:

  • If you want case-insensitive matching and you mostly do equality or prefix search, consider storing a normalized copy (for example email_canon) and indexing it.
  • If you want case-insensitive substring search, consider trigram indexing (pg_trgm) and use ILIKE or LIKE on normalized text.

Also keep collations in mind. Collation rules affect comparisons and ordering, and they can influence how case behaves depending on configuration. If your environment mixes locales, test searches in staging with realistic data.

NOT LIKE and composing real filters

LIKE becomes genuinely useful when you combine it with other conditions. NOT LIKE is the obvious complement.

Exclude a pattern

Filter out vendor emails that contain an underscore (just as an example):

SELECT customer_id, email
FROM customers
WHERE email IS NOT NULL
  AND email NOT LIKE '%\_%' ESCAPE '\'
ORDER BY email;

That query also previews an important concept: escaping. I’ll unpack it in the next section.

Multiple allowed prefixes

When people write:

WHERE first_name LIKE 'Al%'
   OR first_name LIKE 'Am%'
   OR first_name LIKE 'Ka%'

I usually refactor to ANY for readability:

SELECT first_name, last_name
FROM customers
WHERE first_name LIKE ANY (ARRAY['Al%', 'Am%', 'Ka%'])
ORDER BY first_name;

It reads like what it is: match any of these patterns.
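The same "any of these prefixes" shape shows up app-side when pre-filtering values before they ever reach a query. For plain prefix patterns it maps onto Array.prototype.some (names here are illustrative, not from the article's schema):

```javascript
// App-side analogue of: first_name LIKE ANY (ARRAY['Al%', 'Am%', 'Ka%'])
// Valid only for simple prefix patterns; general % semantics need more work.
const prefixes = ['Al', 'Am', 'Ka'];
const matchesAnyPrefix = (name) => prefixes.some((p) => name.startsWith(p));

console.log(matchesAnyPrefix('Alberto')); // true
console.log(matchesAnyPrefix('Sheri'));   // false
```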

Multiple required constraints

Imagine an internal lookup page where you want:

  • city begins with 'Bo'
  • email ends with '@example.com'
  • signup_code starts with 'INV-2026-'

SELECT customer_id, first_name, last_name, email, city, signup_code
FROM customers
WHERE city LIKE 'Bo%'
  AND email LIKE '%@example.com'
  AND signup_code LIKE 'INV-2026-%'
ORDER BY customer_id;

This is the kind of query that feels obvious, but performance differs wildly depending on where the wildcards are. Prefix patterns ('Bo%', 'INV-2026-%') can often be index-friendly; suffix patterns ('%@example.com') usually are not.

Escaping % and _ (and why ESCAPE matters)

The most common LIKE bug I see in real code is accidental wildcard expansion.

If you build a pattern from user input and you do not escape it, % and _ become active wildcard characters. That changes results.

The problem: user input that contains wildcard characters

Say your UI lets an operator search by an email local part, and someone types:

  • sheri_stone

If you run:

SELECT email
FROM customers
WHERE email LIKE '%sheri_stone%';

the underscore will match any single character, so sheriXstone would also match. That’s usually wrong.

The fix: escape wildcard characters intentionally

PostgreSQL lets you specify an escape character:

... LIKE pattern ESCAPE escape_character

A common convention is to use backslash as the escape character:

SELECT email
FROM customers
WHERE email LIKE '%sheri\_stone%' ESCAPE '\';

Now \_ means a literal underscore.

Same deal for %:

SELECT *
FROM some_table
WHERE some_column LIKE '%50\%%' ESCAPE '\';

That matches text containing the literal string 50%.

Safe pattern building in application code

Even with escaping, you still want parameterized queries. Parameterization protects you from SQL injection and avoids surprising parsing edge cases.

Here’s a Node.js example (runnable structure, not tied to a specific framework) that escapes % and _ for a contains-search:

// Example with node-postgres style parameters ($1, $2, ...)
// The key idea: escape wildcard characters and add ESCAPE '\' in SQL.

function escapeLike(input) {
  // Escape the escape character (backslash) first, then % and _.
  return input
    .replace(/\\/g, '\\\\')
    .replace(/%/g, '\\%')
    .replace(/_/g, '\\_');
}

const term = 'sheri_stone';
const pattern = `%${escapeLike(term)}%`;

// In this template literal, '\\' reaches PostgreSQL as ESCAPE '\'.
const sql = `
  SELECT customer_id, email
  FROM customers
  WHERE email IS NOT NULL
    AND email LIKE $1 ESCAPE '\\'
  ORDER BY customer_id
`;

// client.query(sql, [pattern])

Notes from the trenches:

  • Escape the escape character itself (backslash) first.
  • Always keep an explicit ESCAPE '\' in the SQL if you are injecting escape sequences into the pattern.
  • Don’t confuse escaping for LIKE with SQL string escaping. Parameterization handles SQL string escaping; your escapeLike handles wildcard semantics.
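Why does the order matter? Because escaping % and _ introduces backslashes of its own; if the backslash pass ran last, it would re-escape them. A small sketch contrasting the two orders (escapeLikeWrong exists only for this demonstration):

```javascript
// Correct order: escape the escape character first.
function escapeLike(input) {
  return input
    .replace(/\\/g, '\\\\')
    .replace(/%/g, '\\%')
    .replace(/_/g, '\\_');
}

// Wrong order: the final backslash pass doubles the escapes
// that the % and _ passes just introduced.
function escapeLikeWrong(input) {
  return input
    .replace(/%/g, '\\%')
    .replace(/_/g, '\\_')
    .replace(/\\/g, '\\\\');
}

console.log(escapeLike('100%_done'));      // 100\%\_done
console.log(escapeLikeWrong('100%_done')); // 100\\%\\_done (broken)
```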

Performance: what stays fast, what gets slow, and how I tune it

I’m going to be blunt: LIKE can be fast or painfully slow, and the difference is often just where you place %.

Why column LIKE 'prefix%' can be fast

A left-anchored pattern like 'Ka%' behaves a lot like a range search. PostgreSQL can often use a B-tree index to jump into the right region.

If you run a lot of prefix searches, consider a supporting index. On text columns, I often use text_pattern_ops so the planner can use the index for pattern comparisons:

CREATE INDEX customers_first_name_pattern_idx
ON customers (first_name text_pattern_ops);

Then:

SELECT first_name, last_name
FROM customers
WHERE first_name LIKE 'Ka%';

On large tables, this frequently turns a scan into an index range scan.

Why column LIKE '%substring%' is usually slow

The leading % means the match can start anywhere, so a standard B-tree index cannot help much. PostgreSQL typically falls back to scanning many rows.

In real systems, I see times like:

  • Prefix search on a well-indexed column: typically 10–30ms at moderate scale (plus network overhead)
  • Substring search with leading % on a big table: often 200ms–2s, sometimes worse under load

Exact numbers depend on hardware, caching, row width, and concurrency, so treat those ranges as a gut-check rather than a promise.

Trigram indexes for substring search (pg_trgm)

If you need fast substring search, trigram indexing is the workhorse.

Enable the extension:

CREATE EXTENSION IF NOT EXISTS pg_trgm;

Add a trigram index:

CREATE INDEX customers_email_trgm_idx
ON customers USING gin (email gin_trgm_ops);

Now queries like:

SELECT customer_id, email
FROM customers
WHERE email ILIKE '%northwind%'
ORDER BY customer_id;

can use the trigram index and perform well even with leading %.

When I’m tuning a system in 2026, I rarely guess. I check:

  • EXPLAIN (ANALYZE, BUFFERS) for the plan shape and I/O behavior
  • pg_stat_statements to see which queries actually cost time
  • query logs with auto_explain for slow outliers

And yes, modern AI-assisted tooling helps here: paste the EXPLAIN output into your internal assistant, ask it to translate the plan into plain language, then validate the recommendation against what PostgreSQL actually did. I treat the assistant as a fast explainer, not the final authority.

The surprising win: search direction matters

If you only need suffix searches (like '%@example.com'), trigrams can still help. Another approach is to store a reversed version of the string and do a prefix search on the reversed data. That’s extra storage and code complexity, so I only do it when the pattern is extremely common and latency budgets are tight.

Don’t sabotage indexes unintentionally

A few patterns often block index use:

  • Wrapping the column in functions without a matching functional index
  • Implicit casts (for example comparing citext to text in inconsistent ways)
  • Patterns that start with % when you expected prefix behavior

If a query should be fast and isn’t, start with EXPLAIN (ANALYZE, BUFFERS) and confirm whether you’re scanning.

LIKE vs other pattern tools: what I choose in practice

LIKE is not the only pattern tool in PostgreSQL. The fastest way to pick the right one is to decide what kind of search you’re building.

Here’s how I choose:

Goal | Best first choice | Why
--- | --- | ---
Exact match | = | Simple semantics, strong index support
Prefix match | LIKE 'prefix%' + B-tree (maybe text_pattern_ops) | Index-friendly, predictable
Case-insensitive prefix | ILIKE 'prefix%' or lower(col) LIKE 'prefix%' with functional index | Correct behavior with a scalable plan
Substring match | ILIKE '%term%' + trigram index | Fast contains search without full-text complexity
Shape match (fixed length) | LIKE with _ | Simple pattern language for codes
Strict shape match (digits, boundaries) | regex (~, ~*) | Enforces character classes and anchors
Natural-language search | Full-text search (to_tsvector, plainto_tsquery) | Tokenization, ranking, language-aware
Similarity / fuzzy | trigram similarity (%, similarity()) | "Close enough" matching, typo tolerance

This table hides a big truth: the “best” tool is often the one that matches your product behavior. If your UI says “Search by email domain”, a suffix match might be correct. If your UI says “Search by name”, users will expect case-insensitive, partial matching and typo tolerance. LIKE can implement part of that, but it isn’t the whole story.

Pattern design I actually use in production

When people say “use LIKE”, they usually mean “use LIKE plus a bunch of guardrails”. These are the patterns I reach for repeatedly.

A search box with multiple fields (and safe wildcards)

A common internal tool is a “search customers” box where the operator types one term and expects it to match name, email, or signup code.

The naive version is:

SELECT customer_id, first_name, last_name, email, signup_code
FROM customers
WHERE first_name ILIKE '%' || $1 || '%'
   OR last_name ILIKE '%' || $1 || '%'
   OR email ILIKE '%' || $1 || '%'
   OR signup_code ILIKE '%' || $1 || '%'
ORDER BY customer_id
LIMIT 50;

It’s easy, but it has two production problems:

1) If $1 contains % or _, the wildcard semantics change.

2) If the table is large, this becomes a scan unless you have trigram indexes.

The version I ship looks more like:

-- Assume $1 is already escaped for LIKE wildcards in application code.
SELECT customer_id, first_name, last_name, email, signup_code
FROM customers
WHERE first_name ILIKE '%' || $1 || '%' ESCAPE '\'
   OR last_name ILIKE '%' || $1 || '%' ESCAPE '\'
   OR email ILIKE '%' || $1 || '%' ESCAPE '\'
   OR signup_code ILIKE '%' || $1 || '%' ESCAPE '\'
ORDER BY customer_id
LIMIT 50;

Then I add supporting indexes based on what the tool is for:

  • For internal admin search across lots of fields: trigram indexes on the fields actually searched.
  • For public-facing search with more UX expectations: full-text search or a dedicated search service.

Prefix-first, contains-second (a practical compromise)

Sometimes I can improve both performance and relevance by using two tiers:

  • If the input looks like a prefix (length >= 3), try a prefix match first.
  • If that returns too few results, fall back to contains match.

In SQL, I’ll sometimes express that as two queries unioned (or two requests from the app). The app-side approach is usually clearer because it can decide when to fall back.

The key point: prefix search is often “good enough” and can stay fast with plain B-tree indexes. Contains search is powerful, but it’s also the one you have to actively engineer for.
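The app-side fallback is only a few lines. A sketch assuming a synchronous runQuery(sql, params) helper (hypothetical; with node-postgres this would be an awaited client.query, and minResults is a tuning knob I made up for the example):

```javascript
// Two-tier search: try the index-friendly prefix match first and
// fall back to a contains match only when results are too thin.
const SEARCH_SQL =
  "SELECT customer_id, first_name FROM customers " +
  "WHERE first_name ILIKE $1 ESCAPE '\\' ORDER BY first_name LIMIT 50";

function searchCustomers(runQuery, term, minResults = 5) {
  // Escape LIKE wildcards so the user's input is matched literally.
  const escaped = term.replace(/[\\%_]/g, (c) => '\\' + c);
  const prefixRows = runQuery(SEARCH_SQL, [escaped + '%']);
  if (prefixRows.length >= minResults) return prefixRows;
  return runQuery(SEARCH_SQL, ['%' + escaped + '%']);
}
```

The app decides when to fall back, which keeps the common case on the cheap left-anchored plan.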

“Starts with” but ignore whitespace and punctuation

Real data is messy. Names contain spaces, hyphens, apostrophes, and occasionally weird Unicode. If your app normalizes input (for example removing punctuation), LIKE can still be useful, but you need a consistent plan.

My rule is: if I normalize in the query, I also normalize in an index (or I add a stored generated column).

Example with a generated column (PostgreSQL 12+):

ALTER TABLE customers
ADD COLUMN first_name_norm text
GENERATED ALWAYS AS (
  regexp_replace(lower(first_name), '[^a-z0-9]+', '', 'g')
) STORED;

CREATE INDEX customers_first_name_norm_idx
ON customers (first_name_norm text_pattern_ops);

Then I can do fast-ish prefix search on the normalized representation:

SELECT customer_id, first_name
FROM customers
WHERE first_name_norm LIKE 'al%'
ORDER BY customer_id;

Is this overkill for every project? Yes. Is it worth it when you have messy identifiers and strict latency? Also yes.
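If the application also normalizes the user's input before building the prefix pattern, keep that logic equivalent to the SQL expression, or prefix searches will silently miss rows. A sketch mirroring the regexp_replace above (fine for ASCII data; locale differences between JavaScript's toLowerCase and PostgreSQL's lower() are a real caveat beyond a-z):

```javascript
// Mirror of the SQL normalization used for first_name_norm:
// regexp_replace(lower(first_name), '[^a-z0-9]+', '', 'g')
function normalizeName(s) {
  return s.toLowerCase().replace(/[^a-z0-9]+/g, '');
}

console.log(normalizeName("D'Arcy-Lee")); // darcylee
console.log(normalizeName('Al Berto'));   // alberto
```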

Edge cases that surprise people

LIKE is one of those operators where most bugs aren’t “syntax bugs” but “behavioral surprises.” Here are the ones I’ve been burned by.

1) LIKE and NULL in boolean logic

If LIKE returns NULL, then WHERE email LIKE '%@example.com' behaves as if it’s false.

But if you start composing conditions, NULL can change the logic in ways that are easy to miss.

Example:

-- You might expect this to return rows where email is NULL or not example.com.
SELECT customer_id, email
FROM customers
WHERE email NOT LIKE '%@example.com';

Rows with NULL email do not match this condition, because NULL NOT LIKE ... is also NULL.

The explicit version is:

SELECT customer_id, email
FROM customers
WHERE email IS NULL
   OR email NOT LIKE '%@example.com';

If you’re writing exclusion filters, always decide what you want to do with NULL and encode it explicitly.

2) Escaping isn’t optional when patterns come from users

If your input is user-controlled (even internal users), treat % and _ as special characters until you escape them.

I treat unescaped patterns as a correctness bug, not a “nice to have.” It’s the difference between “find the exact token they typed” and “accidentally broaden the search.”

3) Backslash confusion (SQL string vs LIKE escaping)

There are two layers of “escaping” that people mix up:

  • SQL string escaping: how you write '\' to represent a literal backslash in a string literal.
  • LIKE escaping: how you indicate \% means “literal percent” (depending on the escape character).

If you parameterize, you mostly avoid SQL string escaping problems, but you still need to do LIKE wildcard escaping. That’s why I keep an explicit ESCAPE '\' in queries that build patterns.

4) Collation and “why doesn’t my index work?”

Collation affects ordering and comparison. In many environments, pattern matching and index usage interact with collation in ways that aren’t intuitive.

If you’re seeing a query that “should use the index” but doesn’t, check:

  • the database collation and column collation
  • whether the index uses text_pattern_ops
  • whether you’re mixing collations between the column and the pattern

I don’t try to memorize every edge case here; I rely on EXPLAIN to tell me whether an index is used and then I adjust.

5) LIKE on non-text types

PostgreSQL will often cast non-text values to text if you force it, but I prefer being explicit.

Example: you have an integer code and you want prefix behavior:

SELECT *
FROM some_table
WHERE some_int::text LIKE '12%';

This works, but it’s usually a code smell. I’d rather store codes as text if I ever plan to do pattern matching, because casting on every query is expensive and makes indexing harder.

Indexing strategies that hold up under load

When LIKE becomes a hot path, you need to engineer the data access, not just the SQL.

B-tree for prefix and equality

If your queries are mostly = and LIKE 'prefix%', a B-tree index is still the simplest and fastest tool.

For text columns used with LIKE, I reach for one of these patterns:

-- Often helpful for LIKE 'prefix%'
CREATE INDEX customers_city_pattern_idx
ON customers (city text_pattern_ops);

-- For case-insensitive prefix via lower()
CREATE INDEX customers_city_lower_pattern_idx
ON customers (lower(city) text_pattern_ops);

One detail I watch closely: index bloat and write amplification. Every extra index costs you on INSERT/UPDATE, so I add these only when a query is truly important.

GIN + trigrams for contains search

If you need ILIKE '%term%' at scale, use pg_trgm and a trigram index:

CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX customers_first_name_trgm_idx
ON customers USING gin (first_name gin_trgm_ops);

GIN indexes are bigger and heavier to maintain than B-tree, so I don’t add them casually. But when you need them, they’re the difference between “feature works” and “feature times out.”

Partial indexes for common subsets

A pattern I love is partial indexes for “active” records, because many real systems mostly query the active subset.

Example: imagine customers has deleted_at and you always exclude deleted customers.

CREATE INDEX customers_email_trgm_active_idx
ON customers USING gin (email gin_trgm_ops)
WHERE deleted_at IS NULL;

This keeps the index smaller and more cache-friendly.

Functional indexes: make your query and index match

If the query is lower(email) LIKE 'abc%', index lower(email). If the query is email LIKE 'ABC%', index email.

I’m strict about consistency here. If the query and index don’t match, you’ll have a permanent “why is this slow sometimes?” mystery.

Practical EXPLAIN workflow (so I don’t guess)

When I’m troubleshooting LIKE performance, I run a tight loop:

1) Write the query I actually want.

2) Run EXPLAIN (ANALYZE, BUFFERS).

3) Ask: did it scan, did it use an index, and how many rows did it touch?

4) Add the minimal index or rewrite that changes the plan.

5) Repeat.

A tiny example:

EXPLAIN (ANALYZE, BUFFERS)
SELECT customer_id, email
FROM customers
WHERE email ILIKE '%northwind%';

If I see a sequential scan touching most rows, I know I’m paying O(N) per search. That’s fine for tiny tables and terrible for hot paths.

Then I add a trigram index and rerun. I’m not looking for “a perfect plan”; I’m looking for “a plan that touches far fewer rows” and has stable latency under concurrency.

When I don’t use LIKE

This is where LIKE becomes a design decision, not just syntax.

I avoid LIKE for natural language

If users are searching sentences, LIKE '%word%' creates weird results:

  • It matches substrings inside other words.
  • It doesn’t rank relevance.
  • It breaks on stemming and language nuance.

If I need “search tickets by description,” I reach for full-text search. If I need “search docs across many fields,” I might reach for a dedicated search engine.

I avoid LIKE for strict validation

If the goal is “this signup code must be INV-, a four-digit year, a dash, and four digits,” I’ll use a check constraint with a regex:

ALTER TABLE customers
ADD CONSTRAINT signup_code_format_chk
CHECK (signup_code ~ '^INV-[0-9]{4}-[0-9]{4}$');

That’s not a search query, that’s a data integrity rule. LIKE is great for exploration; it’s not the best validator.

I avoid LIKE for “fuzzy” matching

If users expect typo tolerance ("Jon" should match "John"), LIKE won’t do it. Trigram similarity can help, but it’s a different operator family.

Alternatives and complements to LIKE

PostgreSQL gives you a bunch of pattern tools. I don’t treat these as “better” or “worse” than LIKE; I treat them as different behaviors.

Regex (~, ~*) for strict shapes

Regex is what I use when the pattern language must be expressive.

  • ~ is case-sensitive regex match
  • ~* is case-insensitive regex match

Example: match invoice codes in 2026 with exactly four digits:

SELECT customer_id, signup_code
FROM customers
WHERE signup_code ~ '^INV-2026-[0-9]{4}$'
ORDER BY signup_code;

That’s the “digits only” enforcement LIKE 'INV-2026-____' can’t do.
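The same anchored, digit-class idea is handy app-side for validating a code before the query ever runs; in JavaScript it is one regular expression (the constant name is mine):

```javascript
// Anchored pattern with an explicit digit class, expressing what
// LIKE 'INV-2026-____' cannot: the last four characters must be digits.
const INVOICE_2026 = /^INV-2026-[0-9]{4}$/;

console.log(INVOICE_2026.test('INV-2026-0104')); // true
console.log(INVOICE_2026.test('INV-2026-01A4')); // false, though LIKE's _ would accept it
```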

SIMILAR TO (rarely)

PostgreSQL supports SIMILAR TO, which is like a hybrid of LIKE and regex. I almost never use it because it’s easy to confuse with both.

If I need LIKE, I use LIKE. If I need regex, I use regex.

citext for case-insensitive equality (and sometimes LIKE)

If you mostly want case-insensitive equality on a field like email, the citext extension can simplify your code:

CREATE EXTENSION IF NOT EXISTS citext;

ALTER TABLE customers
ALTER COLUMN email TYPE citext;

Now email = '[email protected]' matches regardless of the stored casing.

For LIKE/ILIKE, I still test behavior carefully because collation and operator class choices affect whether the index can be used. In many codebases, I still prefer explicit normalization because it’s more obvious to future readers.

Full-text search for token-aware matching

If you want a search that respects word boundaries and language rules, full-text search is the native PostgreSQL tool.

Example (very minimal):

SELECT customer_id, first_name, last_name
FROM customers
WHERE to_tsvector('simple', first_name || ' ' || last_name)
  @@ plainto_tsquery('simple', 'albert');

This isn’t a drop-in replacement for LIKE. It’s a different semantics: tokens, dictionaries, stemming, ranking.

Trigram similarity for “close enough” matching

With pg_trgm, you can go beyond contains search and do similarity.

Example:

CREATE EXTENSION IF NOT EXISTS pg_trgm;

SELECT first_name, last_name, similarity(first_name, 'Kira') AS score
FROM customers
WHERE first_name % 'Kira'
ORDER BY score DESC
LIMIT 10;

This is my go-to for typo tolerance in small-to-medium datasets where I don’t want the complexity of full search infrastructure.
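To build intuition for what pg_trgm is scoring, here is a rough sketch of trigram extraction and similarity. It approximates pg_trgm's behavior (which lowercases and pads each word with two leading spaces and one trailing space); exact scores can differ from what PostgreSQL returns:

```javascript
// Approximate pg_trgm: pad the lowercased word, slice it into 3-grams,
// then score similarity as shared / total distinct trigrams.
function trigrams(word) {
  const padded = `  ${word.toLowerCase()} `;
  const grams = new Set();
  for (let i = 0; i + 3 <= padded.length; i++) grams.add(padded.slice(i, i + 3));
  return grams;
}

function similarity(a, b) {
  const ta = trigrams(a);
  const tb = trigrams(b);
  const shared = [...ta].filter((g) => tb.has(g)).length;
  const union = new Set([...ta, ...tb]).size;
  return shared / union;
}
```

With this sketch, similarity('Kira', 'Kiran') comes out well above similarity('Kira', 'Sheri'), which is exactly the ordering the ORDER BY score DESC query above relies on.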

Common mistakes (and how I avoid them)

These are the failure modes I see repeatedly in production code reviews.

Mistake 1: Treating user input as a ready-to-run pattern

If you accept raw input and concatenate it into LIKE, you’re letting the user decide wildcard semantics.

I fix it by:

  • escaping % and _
  • parameterizing the query
  • explicitly writing ESCAPE '\'

Mistake 2: Assuming ILIKE is “just like LIKE but case-insensitive”

Semantically, yes. Operationally, it can change index usage.

If performance matters, I either:

  • use lower(col) consistently with a functional index, or
  • use trigram indexes for substring search.

Mistake 3: Using leading % in hot paths without an index strategy

If you put '%term%' on a table with tens of millions of rows, you’re paying for it.

My rule: if you need contains search in a user-facing path, plan the index from day one.

Mistake 4: Over-indexing because search “might be used”

Every index slows writes. I add indexes based on measured query patterns:

  • start with basic B-tree indexes
  • add text_pattern_ops for a real prefix workload
  • add trigram only when I actually need contains search at scale

Mistake 5: Forgetting about LIMIT and sort order

Search pages usually don’t need all matches; they need the first 20–100.

I almost always:

  • add a LIMIT
  • choose an ORDER BY that matches the product need

If ordering by a different column forces a big sort, I revisit the UI expectations. Sometimes “order by most recent” is fine; sometimes “order by similarity” is required; sometimes “order by name” is a trap because it forces sorting enormous result sets.

A mini checklist I use before shipping a LIKE-based feature

When I’m about to ship or review a feature that uses LIKE, I mentally run this checklist:

  • Semantics: Is this case-sensitive or case-insensitive, and is that what users expect?
  • Correctness: Are % and _ escaped when the pattern comes from input?
  • Nulls: What happens when the column is NULL?
  • Performance: Does this query use a left-anchored pattern or a leading %?
  • Indexing: Do we have the right index for the query shape (B-tree pattern ops, functional, trigram)?
  • Observability: Do we have pg_stat_statements and slow query logging so we’ll notice if it degrades?

Closing thoughts

I like LIKE because it sits in a sweet spot: simpler than regex, more flexible than equality, and perfect for those “I only know part of the string” moments. But the moment it escapes your local terminal and becomes a real feature, it deserves engineering.

If you remember only a few things, make them these:

  • Prefix searches ('abc%') are your friend and can be indexed with B-tree.
  • Contains searches ('%abc%') need trigram indexing if they’re on a hot path.
  • Always escape % and _ when patterns come from users.
  • Decide on case sensitivity as a product behavior, not an accident.

Once you do that, LIKE stops being a mystery operator and becomes a predictable, production-ready tool.
