You’ve probably seen insertion sort in interviews or in a quick primer, then rarely touched it again. I used to treat it that way too, until I started writing systems that handle small, nearly sorted batches—things like log buffers, UI event queues, or in-memory caches updated in tight loops. In those spaces, insertion sort is not a toy. It’s a compact, predictable piece of code with excellent locality and zero extra allocations. You should know how to write it, read it, and reason about it quickly.
Here’s the shape of what I’ll cover: first, I’ll pin down the core idea with a concrete mental model you can keep in your head while reading code. Then I’ll walk through a classic C++ array implementation, then a modern C++ version that fits how we write code in 2026. From there, I’ll explain why it’s correct in plain language, outline realistic performance behavior, and close with mistakes I still see in reviews and what to do instead. If you follow along, you’ll be able to implement insertion sort cleanly, explain it clearly, and know exactly when you should reach for it—and when you shouldn’t.
Insertion sort in one sentence and one picture in your head
Insertion sort builds a sorted prefix of the array by taking the next unsorted item and sliding it left until it lands in the right spot. I picture it like arranging a small hand of playing cards: you hold sorted cards in your left hand, then each new card gets inserted into the correct position by shifting a few cards right. That’s the whole algorithm.
Here is the essential behavior:
- The left side of the array is the “sorted” region.
- The right side is the “unsorted” region.
- At each step, you pull the first unsorted item (the key) and move larger items in the sorted region one position to the right until the key fits.
If the array is already mostly sorted, this is fast: you barely move anything. If it’s reverse sorted, you move a lot and it’s slow. That simple behavior makes insertion sort a great teaching example, but also a practical tool in small data flows where a full-blown n log n algorithm is heavier than you need.
The classic C++ array version, line by line
The standard C++ version is short, direct, and still the best way to learn the mechanics. I’ll keep the code runnable and add comments only where they clarify the movement of elements.
#include <iostream>

void insertionSort(int arr[], int n) {
    for (int i = 1; i < n; ++i) {
        int key = arr[i];
        int j = i - 1;
        // Shift larger elements to the right
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];
            --j;
        }
        // Insert the key into the gap we created
        arr[j + 1] = key;
    }
}

void printArray(const int arr[], int n) {
    for (int i = 0; i < n; ++i) {
        std::cout << arr[i] << " ";
    }
    std::cout << "\n";
}

int main() {
    int arr[] = {12, 11, 13, 5, 6};
    int n = static_cast<int>(sizeof(arr) / sizeof(arr[0]));
    insertionSort(arr, n);
    printArray(arr, n);
    return 0;
}
Output:
5 6 11 12 13
What to watch:
- The loop starts at i = 1 because the element at index 0 is trivially "sorted."
- The key is the value being inserted into the sorted region.
- The while loop shifts elements one step to the right until it finds a spot for the key.
- The insertion happens at j + 1, not j. That off-by-one is the most common bug.
You should be able to read this and explain the state after every iteration: indices 0..i are sorted, and indices i+1..n-1 are untouched.
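If you want to watch that state evolve, a quick way is a throwaway traced variant that prints the array after each outer-loop pass. This is a debugging aid, not production code:

```cpp
#include <iostream>

// Insertion sort that prints the array after every outer-loop pass,
// so you can watch the sorted prefix grow one index at a time.
void insertionSortTraced(int arr[], int n) {
    for (int i = 1; i < n; ++i) {
        int key = arr[i];
        int j = i - 1;
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];
            --j;
        }
        arr[j + 1] = key;
        // At this point, indices 0..i are sorted.
        for (int k = 0; k < n; ++k) std::cout << arr[k] << ' ';
        std::cout << '\n';
    }
}
```

Run it on {12, 11, 13, 5, 6} and you can line the printed rows up against your mental model, one insertion per row.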
A modern C++ take for 2026 projects
In day-to-day C++ work, I rarely sort raw arrays anymore. I work with std::vector, std::span, and generic algorithms so I can reuse the same logic without rewriting it. Below is a modern version that works on any random-access range of values that can be compared.
I’m keeping it simple and safe: no macros, no bits/stdc++.h, and minimal template complexity. This compiles cleanly on any current C++ toolchain and integrates well with testing frameworks and static analysis.
#include <iostream>
#include <span>
#include <vector>

// Sorts in-place using insertion sort
void insertionSort(std::span<int> data) {
    for (std::size_t i = 1; i < data.size(); ++i) {
        int key = data[i];
        std::size_t j = i;
        // Shift elements greater than key to the right
        while (j > 0 && data[j - 1] > key) {
            data[j] = data[j - 1];
            --j;
        }
        data[j] = key;
    }
}

int main() {
    std::vector<int> values = {42, 7, 19, 7, 3, 12};
    insertionSort(values);
    for (int v : values) {
        std::cout << v << " ";
    }
    std::cout << "\n";
    return 0;
}
Why I like this version:
- std::span lets me pass arrays, vectors, or slices without copying.
- Using std::size_t for indices avoids signed/unsigned warnings while still letting me do safe bounds checks.
- The logic is identical to the classic version but reads more clearly in modern codebases.
Here's a quick comparison of the two approaches in real projects. The raw array version hands you a bare pointer plus a length: bounds handling is manual, reuse is limited, and passing sub-ranges means pointer arithmetic. The std::span version is a non-owning view that carries its own size, so it accepts arrays, vectors, and slices alike, with lower risk and better reuse.
If you’re working in a legacy codebase with raw arrays, stick to the first version. If you’re building or refactoring in 2026, I’d use the span approach every time for clarity and safer reuse.
Correctness: the invariant that keeps it honest
When I review sorting code, I look for a clear invariant. For insertion sort, the invariant is simple:
After each iteration i, the subarray 0..i is sorted and is a permutation of the original elements in that range.
Why this matters:
- It tells you exactly what the algorithm guarantees at every step.
- It makes it easy to prove correctness without formal math.
Here’s the plain-language proof I use:
1) Initially, with i = 1, the subarray 0..0 is a single element, so it’s sorted.
2) Assume 0..i-1 is sorted. You take key = arr[i] and shift larger elements to the right until you find the correct spot. That keeps all elements in 0..i-1 in sorted order, and the key is placed between smaller and larger values.
3) That means 0..i is sorted after the insert.
4) Repeat until i = n-1, and the entire array is sorted.
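That inductive argument can also be checked mechanically in a debug build: assert the sorted-prefix invariant after every pass. A minimal sketch:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Insertion sort with the loop invariant asserted after each pass:
// after iteration i, elements [0..i] are in non-decreasing order.
void insertionSortChecked(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        int key = a[i];
        std::size_t j = i;
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            --j;
        }
        a[j] = key;
        // Invariant: the prefix [0..i] is sorted.
        for (std::size_t k = 1; k <= i; ++k) assert(a[k - 1] <= a[k]);
    }
}
```

If the invariant ever fails, the assert fires at the exact iteration where the logic broke, which is far easier to debug than a wrong final output.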
This also explains stability: equal elements are not moved past each other, because the while condition is > and not >=. So the original relative order of equal values stays intact, which is often a subtle requirement in real systems, especially when values are records with secondary fields.
Performance reality: when it shines and when it hurts
Insertion sort has the well-known time costs:
- Best case: O(n) when the data is already sorted or nearly sorted.
- Average case: O(n^2).
- Worst case: O(n^2) for reverse order.
- Extra memory: O(1) because it sorts in place.
That big-O view is helpful, but I also use a practical rule of thumb. If your dataset is very small (say dozens to a few hundred items) and often close to sorted, insertion sort is a strong choice. At those sizes, a sort typically finishes in microseconds on desktop-class hardware, so the extra overhead of heavier algorithms buys you nothing. On the other hand, for thousands of random items, you'll feel the quadratic growth, and you should choose a better algorithm.
Where I have used it in production:
- Small rolling buffers that are updated frequently.
- Maintaining a sorted list of recent events where each new event is already near the right position.
- Sorting small chunks as part of a larger hybrid algorithm.
Where I avoid it:
- Large datasets with no existing order.
- Performance-sensitive backend tasks that run on every request.
- Anything that needs stable behavior across wild input distributions.
If you need a rule you can apply quickly: use insertion sort when your expected input size is small and your input is often already close to sorted. Otherwise, reach for std::sort or a hybrid algorithm designed for real-world workloads.
A step-by-step trace you can run in your head
When I’m teaching or debugging, I like a concrete trace because it exposes the algorithm’s shape better than any diagram. Let’s trace arr = [8, 3, 5, 2]:
- Start: sorted region = [8], unsorted = [3, 5, 2]
- i = 1, key = 3
– Compare 8 > 3, shift 8 right
– Insert 3 at index 0
– Array becomes [3, 8, 5, 2]
- i = 2, key = 5
– Compare 8 > 5, shift 8 right
– Compare 3 > 5? no, stop
– Insert 5 at index 1
– Array becomes [3, 5, 8, 2]
- i = 3, key = 2
– Compare 8 > 2, shift 8 right
– Compare 5 > 2, shift 5 right
– Compare 3 > 2, shift 3 right
– Insert 2 at index 0
– Array becomes [2, 3, 5, 8]
The key insight is that the inner loop only shifts elements that are larger than the key. That’s why nearly sorted arrays are fast: the inner loop barely runs.
A generic version with custom comparator
Real projects rarely sort raw integers. You sort records by a key, you preserve order of equal keys, and you probably want to reuse the same algorithm without rewriting it. The minimal generic version looks like this:
#include <functional>
#include <span>
#include <string>
#include <vector>

struct Record {
    int priority;
    std::string name;
};

void insertionSort(std::span<Record> data,
                   std::function<bool(const Record&, const Record&)> comp) {
    for (std::size_t i = 1; i < data.size(); ++i) {
        Record key = data[i];
        std::size_t j = i;
        while (j > 0 && comp(key, data[j - 1])) {
            data[j] = data[j - 1];
            --j;
        }
        data[j] = key;
    }
}

int main() {
    std::vector<Record> tasks = {
        {2, "write"},
        {1, "test"},
        {2, "deploy"},
        {3, "review"}
    };
    insertionSort(tasks, [](const Record& a, const Record& b) {
        return a.priority < b.priority;
    });
    // Stable: "write" stays before "deploy" for equal priority 2
    return 0;
}
I deliberately shift while comp(key, data[j - 1]) is true, which stops at the first element not greater than the key. The alternative, shifting while !comp(data[j - 1], key), would also shift equal elements past the key and break stability. That's a small detail that keeps equal elements from swapping, assuming your comparator defines a strict weak ordering.
If you prefer zero heap allocation and no type erasure overhead, switch the comparator to a template parameter instead of std::function. I keep the above in explanations because it is readable, but in performance-sensitive code I’d write it as a template:
template <typename T, typename Comp>
void insertionSort(std::span<T> data, Comp comp) {
    for (std::size_t i = 1; i < data.size(); ++i) {
        T key = data[i];
        std::size_t j = i;
        while (j > 0 && comp(key, data[j - 1])) {
            data[j] = data[j - 1];
            --j;
        }
        data[j] = key;
    }
}
Making it fast for heavy objects
Insertion sort shifts elements one by one, which is perfect for trivial types but can be expensive for large structs or classes. I handle that in two ways:
1) Use move semantics for the key and shifts.
2) Sort indices or pointers instead of the objects themselves.
Here is a move-aware version that still reads cleanly:
template <typename T, typename Comp>
void insertionSort(std::span<T> data, Comp comp) {
    for (std::size_t i = 1; i < data.size(); ++i) {
        T key = std::move(data[i]);
        std::size_t j = i;
        while (j > 0 && comp(key, data[j - 1])) {
            data[j] = std::move(data[j - 1]);
            --j;
        }
        data[j] = std::move(key);
    }
}
This version avoids repeated deep copies. It’s especially useful when T owns memory like strings or vectors. When T is small, the compiler often optimizes it to registers anyway, so the move version doesn’t hurt.
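The second option from above, sorting indices instead of the objects, can look like this sketch. The std::string payload here is just a stand-in for any heavy type:

```cpp
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Sort a permutation of indices instead of moving the (potentially large)
// objects themselves; each shift moves one std::size_t, not a whole record.
std::vector<std::size_t> sortedIndices(const std::vector<std::string>& items) {
    std::vector<std::size_t> idx(items.size());
    std::iota(idx.begin(), idx.end(), 0);  // fill with 0, 1, 2, ...
    for (std::size_t i = 1; i < idx.size(); ++i) {
        std::size_t key = idx[i];
        std::size_t j = i;
        // Compare the referenced items, but shift only indices.
        while (j > 0 && items[key] < items[idx[j - 1]]) {
            idx[j] = idx[j - 1];
            --j;
        }
        idx[j] = key;
    }
    return idx;
}
```

The caller then walks `items[idx[0]], items[idx[1]], ...` in sorted order. This also leaves the original container untouched, which is occasionally a feature in its own right.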
A linked-list variant (when arrays aren’t the right model)
Insertion sort’s core idea is about inserting into a sorted prefix. On a linked list, insertion sort becomes a different algorithmic shape: you insert nodes into a new sorted list one by one. This avoids shifting but pays for traversal. I don’t use it often in C++ because linked lists are rare in modern code, but it’s good to know:
struct Node {
    int value;
    Node* next;
};

Node* insertionSortList(Node* head) {
    Node dummy{0, nullptr};
    Node* current = head;
    while (current) {
        Node* next = current->next;
        Node* prev = &dummy;
        while (prev->next && prev->next->value <= current->value) {
            prev = prev->next;
        }
        current->next = prev->next;
        prev->next = current;
        current = next;
    }
    return dummy.next;
}
This is stable because the inner loop condition advances past equal values, so each node is inserted after any existing nodes with the same value and equal values never move past each other. I keep this in my toolbox for the rare case where I'm working with lists or need to preserve node identities.
Binary insertion sort: fewer comparisons, same shifts
A subtle improvement is to use binary search to find the insertion point in the sorted prefix. This reduces comparisons from O(n^2) to O(n log n), but the shifts still take O(n^2). That means it can help when comparisons are expensive and moves are cheap.
Here’s a clean version that uses binary search on the prefix:
template <typename T, typename Comp>
void binaryInsertionSort(std::span<T> data, Comp comp) {
    for (std::size_t i = 1; i < data.size(); ++i) {
        T key = std::move(data[i]);
        std::size_t left = 0;
        std::size_t right = i;
        while (left < right) {
            std::size_t mid = left + (right - left) / 2;
            if (comp(key, data[mid])) {
                right = mid;
            } else {
                left = mid + 1;
            }
        }
        // Shift right to make space at `left`
        for (std::size_t j = i; j > left; --j) {
            data[j] = std::move(data[j - 1]);
        }
        data[left] = std::move(key);
    }
}
I use this when the comparator is heavy (think string collation or custom ranking) and the list is small enough that shifts are still fine. I don’t use it for large arrays; at that point, I want a true n log n sort.
Handling descending order and custom criteria
I often need both ascending and descending in a codebase. I keep one algorithm and change the comparator. For ascending order I use comp(a, b) = a < b. For descending order, I flip it to comp(a, b) = a > b. That’s it.
If you’re sorting by a field, be explicit about it and preserve stability by using strict comparisons:
auto byPriority = [](const Record& a, const Record& b) {
    return a.priority > b.priority; // descending
};
That yields the expected order while keeping equal priorities in their original order, which I often want when the original sequence has meaning (like timestamps or user input order).
Edge cases I test on purpose
Insertion sort is small, but it has a surprising number of edge cases. I always run these:
- Empty input: [] should remain empty.
- Single element: [7] should remain unchanged.
- Already sorted: [1, 2, 3, 4] should do minimal work.
- Reverse sorted: [4, 3, 2, 1] should still be correct but slow.
- Duplicates: [5, 1, 5, 3, 5] should keep the original order of equal 5s.
- All equal: [9, 9, 9] should produce the same sequence without extra moves.
If you sort floating-point values, I add these too:
- NaN values can break comparisons, because neither a < b nor a > b is true for NaN. You need a comparator that defines where NaN should go.
- -0.0 and 0.0 compare equal but may have different bit patterns. If that matters, document your behavior.
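For the NaN case, one policy I've used is a comparator that sinks NaNs to the end. Where NaN "should" go is your call; this is just one defensible choice:

```cpp
#include <cmath>

// Comparator that treats NaN as greater than every real number, so NaNs
// sink to the end and the non-NaN prefix stays totally ordered. This is
// a strict weak ordering: lessNaNLast(x, x) is false for every x,
// including NaN.
bool lessNaNLast(double a, double b) {
    if (std::isnan(a)) return false;  // NaN is never "less"
    if (std::isnan(b)) return true;   // any real number is less than NaN
    return a < b;
}
```

Pass it as the comparator to the generic insertion sort and the algorithm behaves sanely on data that contains NaN, instead of producing a partially sorted mess.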
Why insertion sort is stable (and why it matters)
Stability means that equal keys retain their original order. In many practical systems, data is sorted multiple times by different keys. For example, you might sort a list of user actions by timestamp, and later sort by severity while wanting equal severities to keep the original time order. Stability preserves that chain.
Insertion sort is stable when you only move elements that are strictly greater than the key. That’s why the condition is arr[j] > key rather than >=. I’m very intentional about that in my code, especially when sorting records with multiple fields.
If you accidentally write >=, then equal items swap positions. The sorted output still looks fine, but any hidden semantic ordering is lost. That’s the kind of bug that leaks into production and is hard to trace back.
The memory and cache behavior you should care about
Insertion sort moves elements within a contiguous array, so it has excellent locality. That matters on modern CPUs, where memory access patterns dominate performance for small to medium arrays. The algorithm touches nearby indices repeatedly, which means it plays nicely with cache lines.
There is also a branch prediction angle. On nearly sorted data, the inner loop condition fails quickly and branches become predictable. That’s part of why insertion sort is surprisingly fast for “almost sorted” datasets despite the quadratic worst case.
I treat insertion sort like a cache-friendly micro-tool: use it where the data is already mostly ordered, and you want predictable, small overhead.
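If you want to feel that effect on your own machine, a throwaway timing helper is enough. Absolute numbers vary wildly by hardware; the interesting part is the ratio between nearly sorted and reversed input:

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Sort a copy of the input with insertion sort and return the elapsed
// time in microseconds. Pass-by-value keeps the caller's data intact.
long long timeSortMicros(std::vector<int> data) {
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 1; i < data.size(); ++i) {
        int key = data[i];
        std::size_t j = i;
        while (j > 0 && data[j - 1] > key) {
            data[j] = data[j - 1];
            --j;
        }
        data[j] = key;
    }
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start)
        .count();
}
```

Call it once on an ascending vector with a single swapped pair, then on the same data reversed. At a few thousand elements the gap is typically orders of magnitude, which is the nearly-sorted advantage made visible.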
A hybrid strategy: where insertion sort often lives
In practice, insertion sort often lives inside larger algorithms. Many standard library implementations use it for tiny partitions because it has a low constant factor. I don’t rely on that directly, but I do build hybrid logic myself sometimes.
Here’s the pattern I use when I’m writing a custom sort for a specialized data structure:
- Use a fast n log n algorithm for large partitions.
- When the partition size drops below a threshold (like 16–64 items), switch to insertion sort.
That threshold is not magic; I find it with profiling. It’s usually in the low dozens, but it depends on the type being sorted and the comparator cost.
A clean, testable version for your codebase
If I’m adding insertion sort to a production codebase, I want it to be generic, safe, and easy to test. Here’s the version I’d drop into a utils file today:
#pragma once
#include <cstddef>
#include <span>
#include <utility>

template <typename T, typename Comp>
void insertionSort(std::span<T> data, Comp comp) {
    for (std::size_t i = 1; i < data.size(); ++i) {
        T key = std::move(data[i]);
        std::size_t j = i;
        while (j > 0 && comp(key, data[j - 1])) {
            data[j] = std::move(data[j - 1]);
            --j;
        }
        data[j] = std::move(key);
    }
}

template <typename T>
void insertionSort(std::span<T> data) {
    insertionSort(data, [](const T& a, const T& b) { return a < b; });
}
Why I like this structure:
- I can call it with or without a comparator.
- It uses move semantics by default.
- It is header-only and simple to unit test.
- It's safe for any T that supports move assignment and comparisons.
Common pitfalls I still see in reviews
I still catch these mistakes in real code reviews, even among experienced engineers. If you can avoid them, you’re ahead of the curve.
1) Off-by-one errors in the insertion point
- Buggy code often writes arr[j] = key after the loop. The correct slot is j + 1 in the array-based version, or j in the span version where I use j as the insertion index. I always mentally track whether j points to the first smaller element or the slot after it.
2) Using >= instead of >
- That breaks stability. It might not matter for integers, but it matters if you’re sorting records with stable semantics, such as sorting by timestamp while keeping original order for ties.
3) Mixing signed and unsigned indices
- A common failure mode is a loop like for (std::size_t i = 1; i < n; ++i) { std::size_t j = i - 1; while (j >= 0) ... }, which never terminates because j is unsigned: j >= 0 is always true, and decrementing past zero wraps around. The fix is to restructure the loop (count j down with j > 0, as in the span version) or use a signed type for j.
4) Copying large objects repeatedly
- If the elements are heavy objects, copying them during shifts is costly. In that case, I often store the key as a move and shift with move assignment, or I sort pointers or indices instead of full objects.
5) Forgetting to test with duplicates and empty input
- Always test with duplicates, sorted input, reverse input, and empty input. Those four tests reveal most mistakes in minutes.
6) Using a comparator that isn’t a strict weak ordering
- If your comparator is inconsistent, any sort can behave badly. I’ve seen this when developers use a comparator that checks only part of a string or does approximate numeric comparisons. For insertion sort, it can lead to weird, partially sorted output.
7) Hiding insertion sort in a hot loop without profiling
- It’s easy to drop insertion sort into a performance-critical loop and assume it’s small. The quadratic cost can bite you when the input grows. I always measure on realistic data.
How I explain insertion sort to non-experts
When I’m explaining it to someone new, I use a 5th-grade analogy that sticks:
“Imagine you’re arranging books on a shelf in alphabetical order. You already have a few books in order. Each new book you pick up is placed into the right spot by sliding a few books to the right. You keep doing that until the shelf is sorted.”
That analogy maps perfectly to the algorithm and helps people remember that insertion sort is about repeated insertion into a sorted prefix.
Realistic performance expectations (with ranges, not magic numbers)
I avoid exact timing claims because they vary wildly by machine and dataset, but I do hold on to a practical intuition:
- For tiny arrays (single digits to a few dozen items), insertion sort often beats “faster” algorithms because the constant overhead of recursion and partitioning dominates.
- For medium arrays (tens to a few hundred items), insertion sort can still be competitive if the data is already somewhat sorted.
- For large arrays (thousands and up), insertion sort is rarely the right tool unless the data is extremely nearly sorted.
In practice, I watch the distribution of how many elements move in the inner loop. If the average shift length is low (0–2 or 0–3), insertion sort tends to be great. If it creeps upward, I move to a different algorithm.
Insertion sort in embedded and real-time contexts
I also like insertion sort in embedded or real-time systems. It has two properties that matter there:
- Predictable memory usage: no extra allocations, constant space.
- Simple control flow: no recursion and a small, easy-to-audit code footprint.
If I’m on a microcontroller that keeps a tiny list of sensor readings or event timestamps, insertion sort is an easy win. I can inspect it for worst-case behavior and know exactly how it behaves under tight constraints.
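On such a target, the code can stay as plain as a fixed C array sorted in place, with no heap and no recursion. kMaxReadings is a made-up capacity for illustration:

```cpp
#include <cstdint>

constexpr int kMaxReadings = 16;  // hypothetical buffer capacity

// In-place insertion sort over a fixed-capacity reading buffer.
// No heap, no recursion: easy to audit for tightly constrained firmware.
void sortReadings(std::uint16_t readings[], int count) {
    for (int i = 1; i < count; ++i) {
        std::uint16_t key = readings[i];
        int j = i - 1;
        while (j >= 0 && readings[j] > key) {
            readings[j + 1] = readings[j];
            --j;
        }
        readings[j + 1] = key;
    }
}
```

Worst-case work is bounded by the buffer capacity, so you can reason about timing without profiling surprises.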
Instrumenting the algorithm to learn from your data
If you’re unsure whether insertion sort is a good fit, measure how many shifts happen. I sometimes add a counter that measures the total number of moves. Here’s a simple instrumentation you can toggle on in development:
#include <cstddef>
#include <span>
#include <utility>

struct SortStats {
    std::size_t moves = 0;
    std::size_t comparisons = 0;
};

template <typename T, typename Comp>
SortStats insertionSortWithStats(std::span<T> data, Comp comp) {
    SortStats stats;
    for (std::size_t i = 1; i < data.size(); ++i) {
        T key = std::move(data[i]);
        std::size_t j = i;
        while (j > 0) {
            ++stats.comparisons;
            if (!comp(key, data[j - 1])) break;
            data[j] = std::move(data[j - 1]);
            ++stats.moves;
            --j;
        }
        data[j] = std::move(key);
    }
    return stats;
}
When I see very low moves and comparisons on real data, I stick with insertion sort. When those numbers spike, I switch to a different algorithm. This gives me a data-driven way to decide without guessing.
When not to use insertion sort
I want to be clear about its limits. I do not use insertion sort when:
- Input size can grow unpredictably.
- The data is random or adversarial.
- Sorting happens in a critical code path on every request.
- I need guaranteed performance across worst-case inputs.
In those cases, I use std::sort, std::stable_sort, or a domain-specific algorithm. I keep insertion sort as a targeted tool for small, almost-sorted inputs.
Common variants and why I rarely use them
You might see variants like shell sort or “optimized” insertion sort that use gaps. Those can be useful, but I rarely need them in C++ codebases because std::sort already handles general cases, and insertion sort covers the tiny case. That said, knowing that these variants exist helps you reason about tradeoffs: they aim to reduce the number of shifts and improve average performance, but they complicate the algorithm and can reduce clarity.
I choose clarity unless I have profiling data that demands otherwise.
Practical next steps you can take right now
If you want insertion sort to stick in your toolkit instead of being a trivia item, build a small habit around it. I recommend you paste the modern version into a scratch file and run it with the test inputs above until you can predict the output and the intermediate steps. You should also practice writing it from memory; it’s a short function, and the act of writing it helps you internalize where the indices move.
From there, you can extend it in a realistic direction. For example, write a version that sorts a small std::vector by a key while preserving the order of equal keys. Then write a version that sorts a std::array or a fixed-size buffer for embedded work. These small exercises pay off quickly, because they train you to reason about invariants and data movement—skills that transfer to far more complex algorithms.
If you’re working on a team, add insertion sort to your review checklist as a quick example of stable sorting behavior. I’ve found that even a tiny example can anchor conversations about correctness and performance. Finally, if you’re experimenting with AI-assisted coding in 2026, try asking your tool to generate insertion sort from a prompt and then review it critically. You’ll learn as much from evaluating the output as from writing it yourself. That habit—building, verifying, and refining—will keep your sorting fundamentals sharp and make your day-to-day code noticeably stronger.