Baeldung Pro – SQL – NPI EA (cat = Baeldung on SQL)
announcement - icon

Learn through the super-clean Baeldung Pro experience:

>> Membership and Baeldung Pro.

No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.

1. Introduction

Performance optimization is very crucial in SQL for database applications, especially when it involves large datasets. A key concept that impacts query speed is sargability (Search ARGument ABILITY).

Sargability refers to the ability of an SQL query to utilize database indexes to speed up data retrieval operations. It is when the conditions within a WHERE, HAVING, and ORDER BY clause scan, but also JOINs and HAVING, are leveraged by the database engine to efficiently retrieve rows via indexing. When a query is sargable, the optimizer can determine whether to use index seeks (for small subsets of rows) or index scans for accessing larger data.

In this tutorial, we’ll explain sargability, what makes a query sargable, tools for identifying sargability, how it affects performance, and how to write sargable SQL queries. This tutorial uses the table from Baeldung’s simple University database schema.

2. What Makes an SQL Statement Sargable?

The core concept of sargability revolves around search arguments that can efficiently leverage database indexes to optimize query execution.

The difference between sargable and non-sargable queries is this: sargable queries enable the optimizer to determine when to use index seek and index scans. Non-sargable queries often force the database to perform full table scans, as they can’t utilize indexes due to their search conditions.

For a query to efficiently use an index, certain conditions must be met in the WHERE, JOIN, or ORDER BY clauses:

  • Direct column comparisons using operators like =, <, >, <=, >=, or BETWEEN
  • No functions or calculations applied to the indexed column (e.g., avoid LOWER(column) or column + 10)
  • Consistent data types between the column and the comparison value to avoid implicit type conversion
  • Leading column usage in multi-column indexes (e.g., use the first column in the index)
  • Pattern matching with a trailing wildcard, such as LIKE ‘abc%’, rather than LIKE ‘%abc’
  • Use of IN with specific values (e.g., WHERE id IN (1, 2, 3)), depending on the database engine

These conditions help the database perform index seeks rather than full scans, improving performance and reducing resource usage.

3. Tools to Identify Non-Sargable Queries

Detecting a non-sargable query at the early stage helps to optimize performance. It allows database admins, analysts, or developers to refactor the queries rather than executing inefficient queries with cost and computational implications. The table below shows the tools for identifying sargability in different SQL versions:

Tool Sargability Indicator Command  Description
SQL Server Index Seek vs Scan SET SHOWPLAN_TEXT ON;

 

SET STATISTICS IO ON;

Shows the execution plan

 

Reveals read operation

PostgreSQL Index Scan vs Seq Scan EXPLAIN ANALYZE — add query; Checks index usage
MYSQL type: range vs ALL or NULL EXPLAIN FORMAT=JSON –add query;

 

EXPLAIN ANALYZE — add query;

 

EXPLAIN — add query;

Returns query analysis in JSON format

 

Shows the query’s cost and time usage

 

Checks index usage and filters

In addition, it is important to note these observations:

  • Check execution plans for table scans instead of index seeks
  • Pay attention to warnings regarding “implicit conversions” or “cardinality estimate errors”
  • Monitor and compare query read times relative to the rows returned

Examples in the next section use the EXPLAIN command to detect non-sargable queries.

4. Examples of Sargable vs. Non-Sargable Queries

Understanding the difference between sargable and non-sargable queries becomes easier with practical examples. In this section, we aim to highlight the common query patterns that prevent index usage and understand the best way to rewrite them. Rather than focusing on the query result, we use EXPLAIN to analyze if a query is sargable or not.

4.1. Function on a Column

Wrapping a column in a function like CAST(), YEAR(), or CONVERT() makes the query non-sargable, as it prevents the use of an index even if applicable to the specified column:

-- Non-Sargable
EXPLAIN SELECT * FROM Student 
WHERE YEAR(birth_date) = 2001;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | Student | NULL | ALL | NULL | NULL | NULL | NULL | 25 | 100 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+

Here, the YEAR() function is applied to birth_date, so the index on the birth_date column is ignored. Let’s look at a sargable alternative:

-- Sargable
EXPLAIN SELECT * FROM Student 
WHERE birth_date >= '2001-01-01' AND birth_date < '2002-01-01';
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+
| id | select_type | table   | partitions | type  | possible_keys | key      | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | Student | NULL       | range | year_idx      | year_idx | 4       | NULL |   10 |      100 | Using index condition |
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+

This query is sargable because it allows the optimizer to search the columns in the WHERE clause using index scans.

4.2. Leading Wildcard in LIKE

Using a preceding “%” when performing a text search prevents the use of an index on the column to be searched:

-- non-sargable
EXPLAIN SELECT * FROM Student WHERE name LIKE '%Po';
+----+-------------+---------+------------+------+---------------+------+---------+------+------+--------------------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered           | Extra       |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+--------------------+-------------+
|  1 | SIMPLE      | student | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   25 | 11.110000610351562 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+--------------------+-------------+

The better approach to make the query sargable is to use the wildcard at the end of the search, just like this:

-- Sargable
EXPLAIN SELECT * FROM Student WHERE name LIKE 'Po%';
+----+-------------+---------+------------+-------+---------------+------------+---------+------+------+----------+-----------------------+
| id | select_type | table   | partitions | type  | possible_keys | key        | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+---------+------------+-------+---------------+------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | student | NULL       | range | index_name    | index_name | 243     | NULL |    2 |      100 | Using index condition |

The above table shows that the query will utilize the “index_name” index, reducing the search to two rows.

4.3. Data Type Mismatch

If the id in the column is numerical (INT), using a quote in the search can make the query non-sargable, because the engine first converts the data type from VARCHAR to INT before executing the query:

-- Non-Sargable (if id is INT)
SELECT * FROM customers WHERE id = '1001';

A better alternative is to run the search without the quote. It shows the significant difference between the two queries:

-- Sargable
SELECT * FROM customers WHERE id = 1001;

The above query won’t need a data type conversion. Hence, it’s considered searchable.

4.4. OR Conditions Across Different Columns

The OR condition can be sargable if the columns in the conditions have appropriate indexes. However, the columns become non-sargable when neither of them has an index, the columns do not share a composite index, or the query combines indexed and non-indexed columns.

-- non-sargable 
EXPLAIN 
SELECT * FROM Student 
WHERE national_id = 123345566 OR name = 'John Liu';
+----+-------------+---------+------------+------+---------------+------+---------+------+------+--------------------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered           | Extra       |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+--------------------+-------------+
|  1 | SIMPLE      | Student | NULL       | ALL  | index_name    | NULL | NULL    | NULL |   25 | 13.600000381469727 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+--------------------+-------------+

We can fix this by using a UNION statement, which ensures that each column is independently indexed:

-- Sargable
EXPLAIN 
SELECT * FROM Student WHERE national_id = 123345566
UNION 
SELECT * FROM Student WHERE name='John Liu';
+----+--------------+------------+------------+------+---------------+------------+---------+-------+------+----------+-----------------+
| id | select_type  | table      | partitions | type | possible_keys | key        | key_len | ref   | rows | filtered | Extra           |
+----+--------------+------------+------------+------+---------------+------------+---------+-------+------+----------+-----------------+
|  1 | PRIMARY      | Student    | NULL       | ALL  | NULL          | NULL       | NULL    | NULL  |   25 |       10 | Using where     |
|  2 | UNION        | student    | NULL       | ref  | index_name    | index_name | 243     | const |    1 |      100 | NULL            |
|  3 | UNION RESULT | <union1,2> | NULL       | ALL  | NULL          | NULL       | NULL    | NULL  | NULL |     NULL | Using temporary |
+----+--------------+------------+------------+------+---------------+------------+---------+-------+------+----------+-----------------+

While the first part uses a full table scan due to the lack of an index on the national_id column, the second part uses an index and a “ref“. Since a part of the query benefits from using an index, it helps to retain sargability.

4.5. Sargability in JOINs

What makes JOINs and subqueries sargable is if predicates or conditions can use indexes. Avoiding functions when trying to match case-sensitive composite columns makes a significant difference:

-- non-sargable JOIN
EXPLAIN SELECT c.name, e.semester, e.grade 
FROM Course c 
JOIN Exam e 
       ON LOWER(c.id) = LOWER(e.course_id) 
WHERE c.credits > 4;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+-------------------+--------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered          | Extra                                      |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+-------------------+--------------------------------------------+
|  1 | SIMPLE      | c     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   81 | 33.32999801635742 | Using where                                |
|  1 | SIMPLE      | e     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |  143 |               100 | Using where; Using join buffer (hash join) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+-------------------+--------------------------------------------+

From the output, we can see why this is non-sargable. Many of the conditions listed in section 2 also apply to optimizing JOINs. It is important to ensure that filtering happens in the execution plan rather than after joining large datasets. Let’s now look at a sargable version.

-- Sargable
EXPLAIN SELECT c.name, e.semester, e.grade 
FROM Course c 
JOIN Exam e 
        ON c.id = e.course_id 
WHERE c.credits > 4;
+----+-------------+-------+------------+------+---------------------+---------------------+---------+-----------------+------+-------------------+-------------+
| id | select_type | table | partitions | type | possible_keys       | key                 | key_len | ref             | rows | filtered          | Extra       |
+----+-------------+-------+------------+------+---------------------+---------------------+---------+-----------------+------+-------------------+-------------+
|  1 | SIMPLE      | c     | NULL       | ALL  | PRIMARY,id          | NULL                | NULL    | NULL            |   81 | 33.32999801635742 | Using where |
|  1 | SIMPLE      | e     | NULL       | ref  | exam_course_id_fkey | exam_course_id_fkey | 43      | university.c.id |    1 |               100 | NULL        |
+----+-------------+-------+------------+------+---------------------+---------------------+---------+-----------------+------+-------------------+-------------+

Removing the functions and sticking to direct column comparison with the “>” operator solves the optimization issue from the first query.

5. How Non-Sargable Queries Affect Performance

Non-sargable queries force the database engine to ignore indexes and perform full table scans, thereby degrading performance. This results in higher CPU usage, increased disk I/O, and longer execution time as the dataset gets larger.

When indexes do not apply to queries, the engine examines every row to determine if it matches the computationally expensive condition. For non-sargable JOINs, or subqueries, the query optimizer may be forced to use inefficient join strategies or process larger results than necessary.

Additionally, non-sargable queries create scaling issues. A query that performs acceptably on small datasets (e.g., 10,000 rows) may become unusable on larger tables (e.g., 10 million rows) due to the non-linear growth in resource requirements.

6. Difference in Sargability Across SQL Engines

The table below shows details on how different database engines interpret and optimize queries for sargability. It is important to make this distinction in cases when we’re tuning for specific environments:

Feature SQL Server PostgreSQL MySQL Oracle
Function on Indexed Column Non-sargable unless using a computed column index Non-sargable, but can utilise function-based index Breaks sargability function-based indexes not native Supports function-based index
Implicit Type Conversion Can silently break index usage Can be detected in the query plan Common cause of non-sargable queries may result in an index skip unless there are matching data types
LIKE with leading Wildcard Non-sargable Non-sargable Non-sargable Non-sargable
OR conditions on Multiple Columns Can be non-sargable unless it has a composite index May use sequential scan depending on context Non-sargable without UNION or composite index may use the function index depending on the query type
Index on expression Supported Supported Supported for generated columns Supported

7. Conclusion

Sargability isn’t just about having indexes. It’s about ensuring that queries can use indexes more efficiently. However, this concept of SQL query optimization is often overlooked due to more focus on functionality rather than performance. Understanding this concept can improve the scalability of database systems.

In this article, we explored what sargability is and why it matters, common patterns that prevent sargability, sargable and non-sargable examples, and tools and techniques to identify non-sargable conditions.

The code backing this article is available on GitHub. Once you're logged in as a Baeldung Pro Member, start learning and coding on the project.