Topological Sort in Practice: Ordering Work When Dependencies Matter

Last year I watched a CI pipeline fail in a way that looked random: sometimes the frontend built, sometimes it didn’t, and sometimes the database migrations ran after the API started. Nothing was “broken” in isolation; the problem was the order. We had a pile of tasks with constraints: migrations must run before the API boots, the API must boot before integration tests, and the frontend build can happen in parallel—but only after shared assets are generated.

That kind of situation shows up everywhere: build systems, dependency injection graphs, course prerequisites, job schedulers, data workflows, even UI initialization order. When you have “A must happen before B” relationships, you’re holding a directed graph of dependencies. Topological sort is the tool I reach for when I need a valid ordering (or a clear error when no valid ordering exists).

By the end of this post, you’ll be able to model real dependency problems as a graph, implement topological sort two different ways, detect cycles reliably, and apply the technique to practical engineering work (including the sort of guardrails I expect in 2026 codebases: fast tests, strong invariants, and AI-assisted review workflows).

Thinking in Dependencies: From Sticky Notes to Directed Graphs

I like starting with a physical analogy: imagine each task is a sticky note, and each arrow “A -> B” means “A must be done before B.” If you can arrange the notes so every arrow points forward (left-to-right), you’ve got a valid schedule.

Formally, that’s a directed graph:

  • Each task is a vertex (node).
  • Each dependency is a directed edge A -> B.

A topological ordering is a linear list of all vertices where every directed edge A -> B appears with A earlier than B.

There’s one big requirement: the graph must be a DAG (Directed Acyclic Graph). If there’s a cycle—say A depends on B, B depends on C, and C depends on A—no ordering can satisfy those constraints. In real systems, cycles are often configuration mistakes (“module-a imports module-b and module-b imports module-a”) or dependency drift over time.

A few practical notes I keep in mind:

  • A node can have multiple prerequisites (incoming edges).
  • Multiple valid orderings can exist; topological sort returns one of them.
  • “Disconnected” components are normal (two independent sets of tasks).

When you choose data structures, you typically store:

  • An adjacency list (outgoing edges from each node).
  • Either indegree counts (number of prerequisites) or a visitation state for DFS.
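As a minimal sketch of that storage choice, here is one way to turn a list of (prereq, task) edges into an adjacency list plus indegree counts. The helper name `build_graph` and the sample node names are illustrative assumptions, not part of any library.

```python
from collections import defaultdict

def build_graph(nodes, edges):
    """Build an adjacency list and an indegree table from (prereq, task) edges."""
    graph = defaultdict(list)          # prereq -> tasks that depend on it
    indegree = {n: 0 for n in nodes}   # task -> number of prerequisites
    for prereq, task in edges:
        graph[prereq].append(task)
        indegree.setdefault(prereq, 0)  # tolerate nodes that only appear in edges
        indegree[task] = indegree.get(task, 0) + 1
    return graph, indegree

graph, indegree = build_graph(
    ["a", "b", "c", "d"],
    [("a", "c"), ("b", "c")],
)
print(dict(graph))  # {'a': ['c'], 'b': ['c']}
print(indegree)     # {'a': 0, 'b': 0, 'c': 2, 'd': 0}
```

Note that "d" has no edges at all and still gets an indegree entry, which is what keeps isolated tasks from silently dropping out of the result.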

What You Get (and What You Don’t): Guarantees and Failure Modes

Topological sort gives you one of two outcomes:

1) A valid ordering of all nodes.

2) A proof that no valid ordering exists (cycle detected).

That sounds simple, but I’ve seen subtle mistakes when engineers assume extra properties that aren’t guaranteed.

What you do get:

  • Every prerequisite appears before its dependent.
  • Time complexity is O(V + E) for standard algorithms.
  • The ability to detect cycles as a first-class result.

What you don’t get automatically:

  • A “best” order according to runtime, cost, or priority.
  • Parallel execution planning by itself (though it can help you build it).
  • A stable result across runs unless you enforce tie-breaking.

If you care about deterministic output (for reproducible builds or consistent diffs), you should tie-break consistently. For example, when multiple nodes are ready, pick the lexicographically smallest name, or the smallest numeric ID. That turns “any valid ordering” into “the same valid ordering every time.”

I also recommend thinking about the error experience. If there is a cycle, “cycle detected” is technically correct but not helpful. In production tooling, you want to identify a cycle path (A -> B -> C -> A) so the person fixing it knows where to look.

Kahn’s Algorithm: The Queue-Based Work Scheduler

Kahn’s algorithm is the approach I reach for when I want something that feels like a scheduler: repeatedly pick tasks with no remaining prerequisites.

Core idea:

  • Compute indegree for each node (number of incoming edges).
  • Initialize a queue (or priority queue) with all nodes with indegree 0.
  • Pop one node, append it to the ordering, and “remove” its outgoing edges by decrementing indegrees of its neighbors.
  • If you process all nodes, you have an ordering.
  • If the queue empties before every node has been processed, you have a cycle: each leftover node is either on a cycle or downstream of one.

This maps nicely to real work: indegree 0 means “ready to run.”

Here’s a complete Python example you can run as-is. I’m using task names (strings) because that’s common in build and workflow tools.

    from collections import defaultdict, deque

    def toposortkahn(tasks, dependencies):
        """Return a topological ordering of tasks or raise ValueError on cycle.

        tasks: iterable of task names
        dependencies: iterable of (prereq, task) edges meaning prereq must run before task
        """
        tasks = list(tasks)
        graph = defaultdict(list)
        indegree = {t: 0 for t in tasks}

        # Build the adjacency list and count prerequisites per task.
        for prereq, task in dependencies:
            graph[prereq].append(task)
            indegree[task] = indegree.get(task, 0) + 1
            if prereq not in indegree:
                indegree[prereq] = 0

        # Start with every task that has no prerequisites.
        q = deque([t for t, d in indegree.items() if d == 0])
        order = []

        while q:
            t = q.popleft()
            order.append(t)
            # "Remove" t's outgoing edges; anything that drops to 0 is ready.
            for nxt in graph[t]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    q.append(nxt)

        if len(order) != len(indegree):
            remaining = [t for t, d in indegree.items() if d > 0]
            raise ValueError(f"Cycle detected. Remaining blocked tasks: {remaining}")

        return order

    if __name__ == "__main__":
        tasks = [
            "generate-assets",
            "migrate-db",
            "start-api",
            "build-frontend",
            "integration-tests",
        ]
        deps = [
            ("generate-assets", "build-frontend"),
            ("migrate-db", "start-api"),
            ("start-api", "integration-tests"),
            ("build-frontend", "integration-tests"),
        ]
        print(toposortkahn(tasks, deps))

A few engineering choices I’d call out:

  • I treat tasks as a declared set, but I also tolerate a dependency mentioning a task not in the list by creating indegree entries. In strict systems, you might reject unknown nodes instead.
  • The cycle error includes “blocked tasks,” which is a quick hint but not a full cycle trace.
  • If you want deterministic output, replace the queue with a min-heap (priority queue) so you always pick the smallest ready node.
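To make the min-heap suggestion concrete, here is a sketch of Kahn's algorithm where the ready set is a heap from Python's standard `heapq` module. The function name `topo_sort_stable` is an illustrative assumption, and it assumes task names are mutually comparable (plain strings here).

```python
import heapq
from collections import defaultdict

def topo_sort_stable(tasks, dependencies):
    """Kahn's algorithm that always picks the smallest ready task name."""
    graph = defaultdict(list)
    indegree = {t: 0 for t in tasks}
    for prereq, task in dependencies:
        graph[prereq].append(task)
        indegree.setdefault(prereq, 0)
        indegree[task] = indegree.get(task, 0) + 1

    ready = [t for t, d in indegree.items() if d == 0]
    heapq.heapify(ready)  # min-heap: the smallest name is popped first
    order = []
    while ready:
        t = heapq.heappop(ready)
        order.append(t)
        for nxt in graph[t]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                heapq.heappush(ready, nxt)

    if len(order) != len(indegree):
        blocked = sorted(t for t, d in indegree.items() if d > 0)
        raise ValueError(f"Cycle detected. Remaining blocked tasks: {blocked}")
    return order

tasks = ["generate-assets", "migrate-db", "start-api",
         "build-frontend", "integration-tests"]
deps = [
    ("generate-assets", "build-frontend"),
    ("migrate-db", "start-api"),
    ("start-api", "integration-tests"),
    ("build-frontend", "integration-tests"),
]
print(topo_sort_stable(tasks, deps))
# ['generate-assets', 'build-frontend', 'migrate-db', 'start-api', 'integration-tests']
```

Because the choice at each step depends only on which tasks are ready, shuffling the input lists should not change the output. The heap adds a log factor per push and pop, which is rarely noticeable at pipeline scale.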

When I want to show “waves” of parallelism, Kahn’s algorithm is also the natural foundation: all indegree-0 nodes at a given step are runnable in parallel.

DFS Postorder: The Recursive View (and Cycle Detection That Explains Itself)

The depth-first search approach builds the ordering by exploring each node's outgoing edges and appending the node to a list only after everything reachable from it has been processed. Reversing that postorder list gives the topological order. Intuitively, it’s like saying: “Before I commit this task to the final list, I’m going to ensure everything that depends on it is explored.”

The key detail is cycle detection. I use a 3-color (or 3-state) visitation model:

  • 0 = unvisited
  • 1 = visiting (currently on recursion stack)
  • 2 = visited (fully processed)

If during DFS you see an edge to a “visiting” node, that’s a back edge and therefore a cycle.

Here’s a runnable JavaScript example (Node.js). I’m keeping it plain JS (not framework-specific), since you’ll often embed this inside tooling.

    function topoSortDfs(nodes, edges) {
      // nodes: array of node ids (strings)
      // edges: array of [from, to] meaning from must happen before to
      const graph = new Map();
      for (const n of nodes) graph.set(n, []);
      for (const [from, to] of edges) {
        if (!graph.has(from)) graph.set(from, []);
        if (!graph.has(to)) graph.set(to, []);
        graph.get(from).push(to);
      }

      const state = new Map(); // 0 unvisited, 1 visiting, 2 visited
      const parent = new Map();
      const order = [];

      function dfs(u) {
        state.set(u, 1);
        for (const v of graph.get(u)) {
          const s = state.get(v) ?? 0;
          if (s === 0) {
            parent.set(v, u);
            dfs(v);
          } else if (s === 1) {
            // Reconstruct a cycle path for a helpful error
            const cycle = [v];
            let cur = u;
            while (cur !== v && cur != null) {
              cycle.push(cur);
              cur = parent.get(cur);
            }
            cycle.push(v);
            cycle.reverse();
            throw new Error(`Cycle detected: ${cycle.join(" -> ")}`);
          }
        }
        state.set(u, 2);
        order.push(u);
      }

      for (const n of graph.keys()) {
        if ((state.get(n) ?? 0) === 0) dfs(n);
      }

      // Postorder needs reversing
      order.reverse();
      return order;
    }

    // Demo
    if (require.main === module) {
      const nodes = ["parse", "compile", "bundle", "test", "deploy"];
      const edges = [
        ["parse", "compile"],
        ["compile", "bundle"],
        ["bundle", "test"],
        ["test", "deploy"],
      ];
      console.log(topoSortDfs(nodes, edges));
    }

Why I still use DFS in 2026:

  • Cycle reporting is often easier to make great (as shown above).
  • DFS fits naturally when you’re already traversing a graph for other reasons.

Why I sometimes avoid DFS:

  • Deep graphs can blow the call stack in JavaScript or Python if you go fully recursive. You can rewrite it iteratively, but Kahn’s algorithm avoids recursion entirely.
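For reference, here is one way the iterative rewrite could look in Python. It is a sketch under my own naming (`topo_sort_dfs_iterative` is hypothetical): it keeps the same three visitation states but replaces the call stack with an explicit list of (node, iterator) frames, so depth is limited by memory rather than the recursion limit.

```python
def topo_sort_dfs_iterative(nodes, edges):
    """DFS topological sort without recursion; raises ValueError with a cycle path."""
    graph = {n: [] for n in nodes}
    for prereq, task in edges:
        graph.setdefault(prereq, []).append(task)
        graph.setdefault(task, [])

    UNVISITED, VISITING, VISITED = 0, 1, 2
    state = {n: UNVISITED for n in graph}
    order = []

    for root in graph:
        if state[root] != UNVISITED:
            continue
        state[root] = VISITING
        # Each frame is (node, iterator over that node's outgoing edges).
        stack = [(root, iter(graph[root]))]
        while stack:
            node, children = stack[-1]
            descended = False
            for child in children:
                if state[child] == UNVISITED:
                    state[child] = VISITING
                    stack.append((child, iter(graph[child])))
                    descended = True
                    break
                if state[child] == VISITING:
                    # Back edge: the cycle is the stack from child up to node.
                    path = [n for n, _ in stack]
                    cycle = path[path.index(child):] + [child]
                    raise ValueError(
                        "Cycle detected: " + " -> ".join(map(str, cycle))
                    )
            if not descended:
                # All outgoing edges handled: record the node in postorder.
                state[node] = VISITED
                order.append(node)
                stack.pop()

    order.reverse()  # reversed postorder is a topological order
    return order

print(topo_sort_dfs_iterative(
    ["parse", "compile", "bundle", "test", "deploy"],
    [("parse", "compile"), ("compile", "bundle"),
     ("bundle", "test"), ("test", "deploy")],
))
# ['parse', 'compile', 'bundle', 'test', 'deploy']
```

A nice side effect of the explicit stack is that the cycle path falls out of the stack contents directly, so no separate parent map is needed.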

If you’re building internal developer tooling, I recommend you prioritize “cycle explanation quality.” It saves real time during on-call and during dependency refactors.

Choosing an Approach in Real Code: Traditional vs Modern Expectations (2026)

Topological sort is old, but the environment around it has changed. In 2026, I expect more automation around correctness, reproducibility, and observability.

Here’s how I think about it in practice:

| Concern | Traditional Approach | Modern Approach (2026) |
|---|---|---|
| Determinism | “Any ordering is fine” | Enforce stable tie-breaks; deterministic builds and CI diffs |
| Cycle handling | Throw generic error | Report cycle path; attach actionable diagnostics |
| Validation | A few hand-picked test cases | Property-based tests + fuzzing for graphs; invariant checks |
| Performance | Big-O only | Measure typical latency ranges; keep it under budget in CI |
| Tooling | Manual debugging | Graph visualization exports (DOT), traces, and AI-assisted review prompts |

Two quick, concrete recommendations I give teams:

  • If you’re ordering human-readable steps (like tasks in a pipeline), use Kahn + deterministic tie-break so the output is stable.
  • If you’re building a library where bad configs are common, use DFS with explicit cycle path reporting.

On AI-assisted workflows: I’ll often ask an assistant to generate “nasty graphs” (dense graphs, near-cycles, duplicate edges, missing nodes) and then I verify with tests. I treat generated examples as adversarial input, not as proof of correctness.

Where Topological Sort Pays Off: Scenarios and Edge Cases I Actually See

Topological sort shows up any time order matters and the constraints are partial (not a single chain). A few examples from day-to-day engineering:

1) Build and packaging pipelines

  • Compile shared libraries before downstream services.
  • Generate API clients from schemas before TypeScript builds.
  • Run lint and unit tests after build artifacts exist.

2) Database migrations

  • Some migrations must precede others (table creation before index creation, enum update before data rewrite).
  • I’ve used topological sort to order migration scripts by declared dependencies to avoid “works on my machine” ordering.

3) Data workflows

  • ETL steps where later steps depend on earlier derived tables.
  • Backfills that depend on dimension tables being ready.

4) Feature flag rollout plans

  • Enable underlying infrastructure flags before user-facing flags.
  • Disable in reverse order during rollback drills.

5) UI initialization

  • Initialize auth, then fetch config, then start routing.
  • Or in mobile apps: permissions gating before starting camera flows.

Edge cases you should plan for up front:

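  • Duplicate edges: they inflate indegree counts. That is harmless if the adjacency list keeps the duplicates too (each copy gets decremented), but it breaks the sort if you store neighbors in a set while counting indegree per edge. De-duplicating up front avoids the question.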
  • Self-dependency (A -> A): it’s an immediate cycle.
  • Missing nodes: do you auto-create them, or treat it as config error?
  • Large graphs: tens of thousands of nodes can still be fine, but watch memory and recursion depth.
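One way to settle those edge cases before sorting is a small normalization pass. The sketch below is an assumption about how you might structure it: `normalize_graph` and its `strict` flag are hypothetical names, and whether unknown nodes are an error or get auto-created is a policy decision for your tool.

```python
def normalize_graph(nodes, edges, strict=True):
    """Return (nodes, edges) with duplicates removed; reject self-dependencies.

    strict=True treats an edge that mentions an undeclared node as a config
    error; strict=False auto-creates the node instead.
    """
    declared = list(dict.fromkeys(nodes))  # de-duplicate, keep input order
    known = set(declared)
    seen_edges = set()
    clean_edges = []
    for prereq, task in edges:
        if prereq == task:
            raise ValueError(f"Self-dependency: {prereq} -> {task}")
        for name in (prereq, task):
            if name not in known:
                if strict:
                    raise ValueError(
                        f"Unknown node {name!r} in edge {prereq} -> {task}"
                    )
                known.add(name)
                declared.append(name)
        if (prereq, task) not in seen_edges:
            seen_edges.add((prereq, task))
            clean_edges.append((prereq, task))
    return declared, clean_edges

print(normalize_graph(["a", "b"], [("a", "b"), ("a", "b")]))
# (['a', 'b'], [('a', 'b')])
```

Running this first means the sort itself can assume clean input, and the error messages point at the offending edge rather than at a downstream symptom.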

One pattern I really like: return not just an ordering, but “levels.” Levels group tasks you can run in parallel safely.

  • Level 0: all tasks with no prerequisites.
  • Level 1: tasks whose prerequisites are all in level 0.
  • And so on.

That’s basically Kahn’s algorithm with batching.
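Here is a sketch of that batching, assuming the same (prereq, task) edge convention used earlier. The name `topo_levels` is illustrative. Each level is sorted so the output stays stable across runs.

```python
from collections import defaultdict

def topo_levels(tasks, dependencies):
    """Group tasks into levels; every task's prerequisites sit in earlier levels."""
    graph = defaultdict(list)
    indegree = {t: 0 for t in tasks}
    for prereq, task in dependencies:
        graph[prereq].append(task)
        indegree.setdefault(prereq, 0)
        indegree[task] = indegree.get(task, 0) + 1

    current = sorted(t for t, d in indegree.items() if d == 0)
    levels = []
    placed = 0
    while current:
        levels.append(current)
        placed += len(current)
        next_level = []
        # Retire the whole batch at once, then collect what became ready.
        for t in current:
            for nxt in graph[t]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    next_level.append(nxt)
        current = sorted(next_level)

    if placed != len(indegree):
        raise ValueError("Cycle detected: some tasks could not be placed in a level")
    return levels

tasks = ["generate-assets", "migrate-db", "start-api",
         "build-frontend", "integration-tests"]
deps = [
    ("generate-assets", "build-frontend"),
    ("migrate-db", "start-api"),
    ("start-api", "integration-tests"),
    ("build-frontend", "integration-tests"),
]
for i, level in enumerate(topo_levels(tasks, deps)):
    print(i, level)
# 0 ['generate-assets', 'migrate-db']
# 1 ['build-frontend', 'start-api']
# 2 ['integration-tests']
```

A task lands in the first level after its last prerequisite, so levels describe the earliest wave in which each task could start, not necessarily the best schedule when task durations differ.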

Common Mistakes (and the Fixes I Standardize)

I’ve reviewed enough implementations to have a short list of mistakes that keep repeating.

1) Forgetting to include nodes with no edges

Symptom: output misses tasks that are isolated.

Fix: initialize indegree/state for every declared node, not only those appearing in edges.

2) Mixing up edge direction

Symptom: you get an ordering that violates the real dependency (“deploy” appears before “test”).

Fix: write edges as “prereq -> dependent” and stick to that mental model. I often name variables prereq, task to keep it honest.

3) Not detecting cycles correctly

Symptom: algorithm returns a partial order without noticing it’s incomplete.

Fix: for Kahn, compare len(order) vs V. For DFS, use visiting/visited states.

4) Non-deterministic output causing flaky tooling

Symptom: same input yields different valid orders (especially across different runtimes or map iteration behaviors).

Fix: for Kahn, choose a deterministic ready-queue (heap). For DFS, sort the adjacency lists and also visit start nodes in a fixed order.

5) Losing debuggability

Symptom: a cycle error doesn’t tell you what to change.

Fix: reconstruct a cycle path (DFS with parent pointers is a good way).

If you want one “house style” rule: always expose a debug mode that emits the graph (for example as DOT format) and the chosen ordering. That makes dependency problems visible instead of mysterious.
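A debug export along those lines can be very small. The sketch below writes Graphviz DOT text by hand; `to_dot` is a hypothetical helper, and it assumes node names contain no double quotes (you would need to escape them otherwise).

```python
def to_dot(nodes, edges, order=None, name="deps"):
    """Render the dependency graph as DOT text, labeling nodes with their position."""
    rank = {n: i for i, n in enumerate(order or [])}
    lines = [f"digraph {name} {{"]
    for n in nodes:
        # Show "position: name" when an ordering was supplied.
        label = f"{rank[n]}: {n}" if n in rank else str(n)
        lines.append(f'  "{n}" [label="{label}"];')
    for prereq, task in edges:
        lines.append(f'  "{prereq}" -> "{task}";')
    lines.append("}")
    return "\n".join(lines)

print(to_dot(
    ["migrate-db", "start-api"],
    [("migrate-db", "start-api")],
    order=["migrate-db", "start-api"],
))
```

The output can be pasted into any DOT viewer or piped to the Graphviz `dot` command, which makes a surprising ordering much easier to reason about than a flat list.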

Performance, Testing, and Production Guardrails

Topological sort is O(V + E), and that’s usually plenty. In typical internal tools, I see graphs in the hundreds to low tens of thousands of nodes; those usually run in milliseconds to a few tens of milliseconds in a scripting runtime, depending on density and environment.

Where performance goes sideways:

  • Extremely dense graphs (E close to V^2) will spend time walking huge adjacency lists.
  • Excessive string handling (giant task IDs) can dominate runtime more than graph traversal.

What I do in production-quality implementations:

  • De-duplicate edges if inputs can be messy (use a set of from|to keys).
  • Use arrays and integer IDs internally for speed, but keep a mapping to user-friendly names for errors.
  • Add a maximum node/edge budget if the input comes from untrusted sources.
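Put together, those three habits might look like the sketch below. Everything here is an assumption for illustration: the name `topo_sort_interned`, the default budgets, and the choice to let an undeclared name fail with a `KeyError`.

```python
from collections import deque

def topo_sort_interned(names, edges, max_nodes=100_000, max_edges=1_000_000):
    """Kahn's algorithm over integer IDs, with input budgets and edge de-duplication."""
    names = list(dict.fromkeys(names))
    edges = list(edges)
    if len(names) > max_nodes or len(edges) > max_edges:
        raise ValueError("Graph exceeds the configured node/edge budget")

    index = {name: i for i, name in enumerate(names)}
    # A set of ID pairs drops duplicate edges; unknown names raise KeyError.
    unique = {(index[a], index[b]) for a, b in edges}

    adj = [[] for _ in names]
    indeg = [0] * len(names)
    for a, b in sorted(unique):  # sorted so traversal order is reproducible
        adj[a].append(b)
        indeg[b] += 1

    queue = deque(i for i, d in enumerate(indeg) if d == 0)
    order = []
    while queue:
        i = queue.popleft()
        order.append(i)
        for j in adj[i]:
            indeg[j] -= 1
            if indeg[j] == 0:
                queue.append(j)

    if len(order) != len(names):
        # Map IDs back to names so the error is readable.
        blocked = [names[i] for i, d in enumerate(indeg) if d > 0]
        raise ValueError(f"Cycle detected. Remaining blocked tasks: {blocked}")
    return [names[i] for i in order]
```

The integer IDs only exist inside the function; callers and error messages still see the original names.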

Testing strategy that catches real bugs:

  • Invariant test: for every edge A -> B, ensure index(A) < index(B) in the result.
  • Cycle test: inject a known cycle and ensure you get an error.
  • Fuzz test: generate random DAGs and validate invariants.
  • Regression test: keep a few “historic” graphs from incidents (the CI pipeline that broke last quarter is a great test case).
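The invariant, cycle, and fuzz tests fit in a few dozen lines. In this sketch, `kahn` is a plain stand-in for whatever implementation you are testing, and `check_order`, `random_dag`, and `fuzz` are illustrative names. Random DAGs are generated by shuffling the nodes into a hidden order and only adding edges that point forward in it, which guarantees acyclic input.

```python
import random
from collections import defaultdict, deque

def kahn(nodes, edges):
    """Stand-in implementation under test."""
    graph = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for a, b in edges:
        graph[a].append(b)
        indegree[b] += 1
    queue = deque(n for n, d in indegree.items() if d == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in graph[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    if len(order) != len(indegree):
        raise ValueError("Cycle detected")
    return order

def check_order(order, nodes, edges):
    """Invariant: every node appears once and every edge points forward."""
    position = {n: i for i, n in enumerate(order)}
    assert len(order) == len(nodes) and set(order) == set(nodes)
    for a, b in edges:
        assert position[a] < position[b], f"{a} must come before {b}"

def random_dag(n, edge_prob, rng):
    hidden = list(range(n))
    rng.shuffle(hidden)
    edges = [(hidden[i], hidden[j])
             for i in range(n) for j in range(i + 1, n)
             if rng.random() < edge_prob]
    return list(range(n)), edges

def fuzz(trials=200, seed=0):
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(trials):
        nodes, edges = random_dag(rng.randint(0, 12), rng.random(), rng)
        check_order(kahn(nodes, edges), nodes, edges)
        if edges:
            # Reversing any existing edge creates a 2-cycle that must be caught.
            a, b = edges[0]
            try:
                kahn(nodes, edges + [(b, a)])
            except ValueError:
                pass
            else:
                raise AssertionError("cycle was not detected")

fuzz()
```

The checker never compares against a specific expected ordering, so it keeps working if you later change tie-breaking.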

If you use AI tools in your workflow (I do), here’s the way I keep it grounded:

  • I ask the assistant to produce tricky inputs and edge cases.
  • I never trust the assistant’s claim that code is correct unless tests confirm invariants.
  • I paste the cycle error output back into the assistant to refine error messaging, not algorithm correctness.

That division of labor—AI for exploration, tests for truth—fits the way modern teams move fast without treating speed as an excuse for surprises.

Key Takeaways and What I’d Do Next If This Is Your Codebase

When dependencies start to pile up, guessing an order is a trap. I’ve learned to treat ordering as a graph problem early, because the graph gives you clarity: either there’s a valid schedule, or there’s a cycle you must remove.

If you’re implementing topological sort today, I recommend you make three decisions explicitly instead of letting them happen by accident:

  • Pick Kahn’s algorithm when you want a scheduler feel (and when you might later batch tasks into parallel “levels”). Pick DFS when you want cycle errors that point directly to the problem.
  • Make the output deterministic if humans or CI will read it. A stable tie-break saves you from noisy diffs and “why did it change?” reviews.
  • Treat cycle reporting as a feature, not a footnote. A good cycle path turns a frustrating debugging session into a quick fix.

If you want a practical next step, take one dependency-heavy area you own (build scripts, migrations, workflow orchestration) and add two guardrails: a topological sort with clear cycle errors, and a test that checks the ordering invariant on real inputs. Once that’s in place, refactors get safer, incident reviews get shorter, and you stop spending time arguing about order because the graph makes it obvious.
