Sequences are a robust PostgreSQL feature for generating unique identifiers for database rows. Unlike basic auto-increments, PostgreSQL sequences have advanced functionalities that allow extensive customization and sharing across tables.
In this comprehensive guide, we will dive deep into PostgreSQL sequence capabilities, usage patterns, internals, optimizations and industry best practices.
What Sets PostgreSQL Sequences Apart?
The key advantages that set PostgreSQL sequences above auto-increments are:
1. Flexible Generation Rules
Granular control over increments, upper/lower bounds and circular ranges allows numbers to be generated as per application logic. This surpasses the simplicity of standard auto-increments.
2. Shareable Across Tables
As sequences are independent objects, they can be defined once and leveraged across multiple tables requiring identifiers.
3. Advanced Caching & Prefetching
With a CACHE setting above 1, each session preallocates a batch of values and hands them out from memory. This reduces contention on the sequence and the work done per call.
4. Independent Metadata Tables
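Each sequence is stored as its own small relation, with its parameters recorded in the pg_sequence catalog, and its state is protected by the write-ahead log, so it survives crashes. The counter does not depend on scanning the base table for its current maximum value.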
5. Block Reservation
PostgreSQL has no built-in function that reserves a block of values in a single call, but a block can be reserved by calling nextval once per row of generate_series, or by giving the sequence a larger INCREMENT.
These capabilities expand the horizons of sequential number generation and enable sequences to handle more complex database patterns.
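As a minimal sketch of the sharing point above, two tables can draw their keys from one sequence. The names shared_doc_seq, invoices and credit_notes are hypothetical and chosen only for illustration:

```sql
-- One sequence, defined once (hypothetical name).
CREATE SEQUENCE shared_doc_seq;

-- Both tables use the same sequence as their column default.
CREATE TABLE invoices (
    doc_id bigint PRIMARY KEY DEFAULT nextval('shared_doc_seq'),
    amount numeric NOT NULL
);

CREATE TABLE credit_notes (
    doc_id bigint PRIMARY KEY DEFAULT nextval('shared_doc_seq'),
    amount numeric NOT NULL
);

-- Each insert advances the same counter, so a doc_id issued to one
-- table is never issued to the other.
INSERT INTO invoices (amount) VALUES (100.00);
INSERT INTO credit_notes (amount) VALUES (25.00);
```

This is useful when identifiers must be unique across several tables, at the cost of each table seeing gaps in its own numbering.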
SQL Syntax & Options
Sequences are created using the CREATE SEQUENCE syntax:
CREATE SEQUENCE sequence_name
INCREMENT BY 1
MINVALUE 1
MAXVALUE 9223372036854775807
START WITH 1
CACHE 1;
The key sequence configuration options are:
| Option | Description | Default |
|---|---|---|
| INCREMENT BY | Increment between numbers | 1 |
| MINVALUE | Minimum value | 1 |
| MAXVALUE | Maximum value | Max of data type |
| START | Initial sequence value | Min value |
| CACHE | Preallocated numbers | 1 |
| CYCLE | Recycle on limits | Not set |
These parameters allow sequences to be tailored as per application patterns.
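To check which settings a sequence actually ended up with, the pg_sequences view (available in PostgreSQL 10 and later) can be queried. The sketch below assumes a throwaway sequence named demo_seq, which is not part of the original examples:

```sql
-- Hypothetical sequence with a few non-default options.
CREATE SEQUENCE demo_seq
    INCREMENT BY 2
    MINVALUE 10
    MAXVALUE 1000
    CACHE 20;

-- Inspect the effective settings. last_value is NULL until the
-- sequence has been used for the first time.
SELECT sequencename, data_type, start_value, min_value, max_value,
       increment_by, cycle, cache_size, last_value
FROM pg_sequences
WHERE sequencename = 'demo_seq';

-- Options can be changed later without recreating the sequence.
ALTER SEQUENCE demo_seq INCREMENT BY 1 CACHE 1;
```

Because START WITH was omitted, this ascending sequence starts at its MINVALUE of 10.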
Incrementing Sequences
The increment controls the difference between subsequent numbers in a sequence.
For sequences used in primary keys, an increment higher than 1 results in skipped numbers:
CREATE SEQUENCE id_seq INCREMENT BY 5;
CREATE TABLE users (
id INTEGER PRIMARY KEY DEFAULT nextval('id_seq')
);
INSERT INTO users DEFAULT VALUES; -- id = 1
INSERT INTO users DEFAULT VALUES; -- id = 6
This leaves gaps which may be undesirable. Hence an increment of 1 is commonly used.
However, larger increments are useful when reserving blocks of IDs beforehand for batched allocation.
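One way to read the block-reservation idea is sketched below, assuming a hypothetical sequence block_seq and a client that hands out the values inside each block itself:

```sql
-- Each nextval call reserves a block of 100 ids (hypothetical name).
CREATE SEQUENCE block_seq START WITH 1 INCREMENT BY 100;

-- The returned value marks the start of the caller's block.
SELECT nextval('block_seq') AS block_start;  -- first call returns 1
SELECT nextval('block_seq') AS block_start;  -- second call returns 101

-- The client may now assign block_start .. block_start + 99 locally
-- without contacting the database again.
```

The database only guarantees that block starts are distinct; keeping assignments inside the block is the client's responsibility.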
Upper & Lower Limits
The MINVALUE and MAXVALUE bounds define the valid range for the sequence:
CREATE SEQUENCE cyclic_seq
INCREMENT BY 1
MINVALUE 1
MAXVALUE 5
CYCLE;
Once a limit is reached, nextval raises an error unless CYCLE is set, in which case the sequence wraps around to the opposite bound.
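The behavior at the limit can be observed with a deliberately tiny sequence. This is a sketch using a hypothetical sequence named bounded_seq:

```sql
-- A sequence with only three valid values and no wrap-around.
CREATE SEQUENCE bounded_seq MINVALUE 1 MAXVALUE 3 NO CYCLE;

SELECT nextval('bounded_seq');  -- 1
SELECT nextval('bounded_seq');  -- 2
SELECT nextval('bounded_seq');  -- 3
SELECT nextval('bounded_seq');  -- raises an error: the maximum has been reached

-- Raising the bound lets the sequence continue where it stopped.
ALTER SEQUENCE bounded_seq MAXVALUE 1000;
SELECT nextval('bounded_seq');  -- 4
```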
Caching Sequence Numbers
Setting CACHE allocates numbers in memory cache for faster access:
CREATE SEQUENCE cache_seq CACHE 100;
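Each session that uses the sequence now preallocates 100 values and serves them from memory. Any values left in a session's cache are discarded when the session ends or the server crashes, so cache settings higher than 1 leave holes, and values can be handed out of order across sessions.
Caches therefore trade dense, strictly ordered numbering for performance. Higher caches suit sequences whose consumers tolerate gaps and out-of-order values.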
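The per-session nature of the cache is easiest to see with two connections. The following sketch assumes a hypothetical sequence cached_seq and two separate sessions, labelled A and B in the comments:

```sql
CREATE SEQUENCE cached_seq CACHE 10;

-- Session A
SELECT nextval('cached_seq');  -- 1  (A reserves 1..10)

-- Session B
SELECT nextval('cached_seq');  -- 11 (B reserves 11..20)

-- Session A
SELECT nextval('cached_seq');  -- 2  (served from A's cache)

-- If session B disconnects now, the values 12..20 are never issued.
```

Values stay unique, but their order of issue no longer matches their numeric order.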
Functions for Sequence Values
PostgreSQL provides special functions to operate on sequences:
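SELECT nextval('seq'); -- Advance the sequence and return the new value
SELECT currval('seq'); -- Value most recently returned by nextval('seq') in this session
SELECT setval('seq', 10); -- Set the current value; with INCREMENT BY 1 the next nextval returns 11
SELECT lastval(); -- Value most recently returned by nextval for any sequence in this session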
These allow the current state and values from a sequence to be obtained.
The nextval function plays a crucial role in extracting the subsequent identifier from the sequence.
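A common use of setval is to bring a sequence back in line after rows were loaded with explicit ids. The sketch below assumes a hypothetical table items with a sequence items_id_seq, and uses the three-argument form of setval, where false means the given value itself is returned by the next nextval:

```sql
CREATE SEQUENCE items_id_seq;
CREATE TABLE items (
    id bigint PRIMARY KEY DEFAULT nextval('items_id_seq'),
    label text
);

-- Rows loaded with explicit ids bypass the default, so the
-- sequence does not advance.
INSERT INTO items (id, label) VALUES (1, 'a'), (2, 'b'), (3, 'c');

-- Point the sequence just past the highest existing id.
SELECT setval('items_id_seq', COALESCE(max(id), 0) + 1, false) FROM items;

-- The next default-generated id no longer collides.
INSERT INTO items (label) VALUES ('d');  -- id = 4
```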
lastval() vs currval()
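lastval() returns the value most recently returned by nextval in the current session, whichever sequence it came from. currval() takes a sequence name and returns the value most recently returned by nextval for that sequence in the current session. Neither function sees values obtained by other sessions, and both raise an error if the session has not yet called nextval.
This distinction matters with pooled connections: when consecutive statements may run on different server sessions, call nextval and currval or lastval within the same transaction, or use INSERT ... RETURNING to get the generated value directly.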
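A short sketch of how the two functions behave after an insert, assuming a hypothetical orders table backed by a sequence named order_id_seq:

```sql
CREATE SEQUENCE order_id_seq;
CREATE TABLE orders (
    id bigint PRIMARY KEY DEFAULT nextval('order_id_seq'),
    customer text
);

INSERT INTO orders (customer) VALUES ('alice');

-- Both calls report the id this session just generated.
SELECT currval('order_id_seq');  -- tied to the named sequence
SELECT lastval();                -- tied to whichever sequence this session used last

-- Often simpler: have the INSERT return the generated id itself.
INSERT INTO orders (customer) VALUES ('bob') RETURNING id;
```

RETURNING avoids the extra statement, which also sidesteps the question of which session a follow-up call runs in.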
Typical Usage Patterns
Sequences lend themselves well to some classic use cases:
Auto-incrementing Keys
The foremost usage of sequences is generating auto-incrementing primary keys:
CREATE TABLE users (
id INT PRIMARY KEY DEFAULT nextval('user_id'),
name TEXT
);
This removes the need to manually handle cumbersome primary key allocation.
Note that sequence values consumed by a rolled-back transaction are not reused, so the keys are unique but not gapless. Client-side schemes such as HiLo or preallocated blocks reduce the number of nextval calls, but they do not remove gaps.
Universally Unique Identifiers (UUIDs)
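Sequences can be combined with the UUID data type to derive a deterministic, name-based UUID from each sequence value:
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE SEQUENCE order_uid;
SELECT uuid_generate_v5(uuid_ns_dns(),
nextval('order_uid')::text);
This hashes the sequence value within a UUID namespace, so each value maps to a reproducible UUID that is distinct in practice. The resulting UUIDs are not ordered by sequence value, and two databases that use the same namespace and the same sequence values produce the same UUIDs.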
Look-ahead Allocation
We can reserve blocks of future sequence numbers for batched allocation by calling nextval once per row of generate_series:
SELECT nextval('seq') FROM generate_series(1, 10);
BEGIN;
INSERT INTO my_table (id) SELECT nextval('seq') FROM generate_series(1, 1000);
COMMIT;
This reduces round trips by allocating in batches. The values are not guaranteed to be contiguous if other sessions call nextval at the same time, and a rollback leaves gaps.
PostgreSQL provides no way to hand unused values back: a sequence advance is never undone by a rollback.
Circular Numbering
The CYCLE option allows recycling sequence values once limits are reached:
CREATE SEQUENCE cyclic_seq
INCREMENT BY 1
MINVALUE 1
MAXVALUE 5
CYCLE;
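SELECT nextval('cyclic_seq'); -- Repeated calls return 1, 2, 3, 4, 5, 1, 2, 3...
This builds circular sequences suitable for numbers that are allowed to repeat, such as rotating ticket numbers. Because values recur, a cycling sequence should not feed a column that must stay unique.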
Sequence Performance & Optimization
Let's analyze some key performance factors around sequences:
1. Metadata Storage Overhead
Each sequence results in new entries across multiple system catalogs – pg_class, pg_sequence and others. Overuse of sequences can bloat the database with excess metadata.
Hence sequences should be created to match actual application needs rather than arbitrarily. Where it fits, sharing one sequence between related tables is preferable to creating a new sequence for every column.
2. Transaction Overheads
nextval() is not transactional. Each call briefly locks the sequence's single data page, and PostgreSQL logs sequence advances to the write-ahead log ahead of use, so not every call produces a WAL record. Under heavy concurrency, contention on that page is the main cost.
For this reason, obtaining sequence IDs in batches is faster than issuing a separate nextval call and round trip per row. Client-side allocation helps amortize these costs.
3. Cache Settings
The cache size controls a major performance factor: how often a session must visit the shared sequence. Each session keeps its preallocated values in memory, and the stored sequence state is advanced past the whole block as soon as it is reserved.
Higher caches reduce contention and I/O, but more numbers are lost when a session ends or the server crashes. A setting of 50-100 is a reasonable starting point if gaps are acceptable; measure it against your own workload.
4. Gaps on Rollbacks
Like cache loss, rollbacks create holes because the sequence has already advanced and is never rewound. Client-side preallocation and batched inserts reduce how often this happens but cannot prevent it, so applications should treat sequence values as unique rather than gapless.
Overall when used judiciously, sequences impose minimal overheads and deliver optimized data access.
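The rollback gap described above is easy to reproduce. This sketch assumes a hypothetical events table backed by a sequence named gap_seq:

```sql
CREATE SEQUENCE gap_seq;
CREATE TABLE events (
    id bigint PRIMARY KEY DEFAULT nextval('gap_seq'),
    note text
);

INSERT INTO events (note) VALUES ('first');      -- id = 1

BEGIN;
INSERT INTO events (note) VALUES ('discarded');  -- consumes id 2
ROLLBACK;                                        -- the row is gone, the id is not returned

INSERT INTO events (note) VALUES ('second');     -- id = 3

SELECT id, note FROM events ORDER BY id;         -- rows with ids 1 and 3
```

The missing id 2 is harmless for a surrogate key, but it shows why sequences are unsuitable where numbering must be gapless.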
Sequences vs Serial Columns
SERIAL columns are a convenience wrapper that use sequences implicitly:
CREATE TABLE users (
id SERIAL, -- Implicit sequence + default
name TEXT
);
The implicit sequence is created with default settings and is owned by the column, so it is dropped along with it. It can still be tuned afterwards with ALTER SEQUENCE.
Explicit sequences are preferable when the settings must be chosen up front or the sequence is shared across tables. Serials work best for vanilla key columns.
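For reference, a SERIAL column is roughly shorthand for the statements sketched below. The table name accounts is hypothetical, and accounts_id_seq follows the naming pattern PostgreSQL uses for implicit sequences:

```sql
-- Roughly what "id SERIAL" expands to.
CREATE SEQUENCE accounts_id_seq AS integer;
CREATE TABLE accounts (
    id integer NOT NULL DEFAULT nextval('accounts_id_seq'),
    name text
);
ALTER SEQUENCE accounts_id_seq OWNED BY accounts.id;

-- Look up the sequence attached to a column.
SELECT pg_get_serial_sequence('accounts', 'id');

-- Adjust it like any other sequence.
ALTER SEQUENCE accounts_id_seq CACHE 50;
```

Note that SERIAL does not add a PRIMARY KEY or UNIQUE constraint by itself; that still has to be declared on the column.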
Sequences in Other Databases
MySQL auto-increments serve the same purpose as sequences but are tied to a single table column and have fewer features; for example, they cannot cycle.
SQL Server has provided standalone sequences through CREATE SEQUENCE since SQL Server 2012, alongside IDENTITY columns, which closely match serial behavior and take seed and increment arguments similar to PostgreSQL parameters.
Oracle sequences are highly configurable, akin to PostgreSQL, and as standalone objects they can also be shared across tables.
PostgreSQL sequences thus offer a strong combination of power, customization and ease of use in identifier generation.
Conclusion & Best Practices
We explored how PostgreSQL sequences enable robust generation of unique IDs with versatile controls compared to standard auto-increments.
Here are some key guidelines for optimal use of sequences:
- Prefer explicit sequences over serials when you need up-front configuration or sharing across tables.
- Set cache above 1 for performance while ensuring application compatibility with gaps.
- Define MAXVALUE boundaries to plan sequence cycles and prevent errors.
- Use client-side preallocation via caches or multi-row nextval queries for efficiency.
- Share sequences judiciously instead of proliferating them.
- Use bigint columns for sequence-generated keys; sequences are bigint by default, which gives future-proof 64-bit identifiers.
By mastering these sequence capabilities, PostgreSQL developers can tackle complex numbering schemes in their applications with flair.
Sequences lend a flexible helping hand to the intricate world of managing identifiers across ever-growing databases. Their versatility makes them indispensable Swiss army knives for any PostgreSQL architect and a feature that sets PostgreSQL apart in the database landscape.


