Difference Between Dijkstra’s Algorithm and A* Search Algorithm

Last quarter I helped a robotics team rework the routing for a fleet of warehouse pickers. The machines were fast, but the routes were not; every extra turn added minutes across hundreds of jobs. That kind of pain shows up in web services too—API gateways choose a path through a dependency graph, multiplayer games move NPCs across maps, and data pipelines pick the next best processing node. When a system needs a shortest path, the algorithm choice shapes latency, cost, and even battery life. I usually start with two classics: Dijkstra’s algorithm and A* search. They solve the same core problem, yet they feel very different in practice. Dijkstra explores from the start outward and guarantees the best distance to every node. A* focuses on a single target and uses a heuristic guess to head in the right direction. In the rest of this post I’ll show how each algorithm behaves, where the guarantees come from, and how I decide which one to ship.

Why shortest path still matters in 2026

Shortest path algorithms are still at the heart of modern systems. I see them in routing layers that pick the cheapest microservice chain, in robotics where a planner weighs energy and time, and in city-scale mapping where a path must avoid road closures. In 2026, many teams also store graph data in streaming stores and compute routes inside low-latency APIs. A one-millisecond change in a per-request route plan can mean a lot when you serve millions of requests per day.

Both Dijkstra and A* operate on weighted graphs, but the meaning of the weight has widened. It might be latency, fuel use, risk, or even a blend produced by a policy engine. I’ve noticed that teams now build small simulation harnesses and use AI assistants to generate synthetic graphs for testing. That makes it easier to measure how search behaves under load. The result is a pragmatic question: do you need shortest paths to every node, or do you need the best path to one goal right now? Your answer points you to Dijkstra or A* before you even write code.

Dijkstra’s algorithm as a greedy baseline

Dijkstra’s algorithm is the baseline I keep in my pocket. It is greedy in a very specific way: it always finalizes the node with the smallest known distance from the start. Once a node is finalized, that distance can never be improved because all edge weights are non-negative. That property gives Dijkstra a clear mental model. You are growing a wavefront of confirmed shortest distances, one node at a time, until you cover the entire graph.

In practice I represent the graph with adjacency lists and use a priority queue to pop the nearest node. That moves the time cost from O(V^2) in the textbook array version to O((V + E) log V) in the heap-based version. The algorithm gives me a distance table for every node plus a predecessor pointer to rebuild the actual path. That is perfect for routing tables, multi-target dispatch, or any system that needs all shortest paths from a source.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Graph = Dict[str, List[Tuple[str, float]]]

def dijkstra(graph: Graph, start: str, goal: Optional[str] = None):
    distances = {start: 0.0}
    previous: Dict[str, Optional[str]] = {start: None}
    queue = [(0.0, start)]
    visited = set()

    while queue:
        current_distance, node = heapq.heappop(queue)
        if node in visited:
            continue  # stale queue entry; this node was already finalized
        visited.add(node)
        if goal is not None and node == goal:
            break  # early exit: we only needed this one destination
        for neighbor, weight in graph.get(node, []):
            if weight < 0:
                raise ValueError("Dijkstra requires non-negative weights")
            new_distance = current_distance + weight
            if new_distance < distances.get(neighbor, float("inf")):
                distances[neighbor] = new_distance
                previous[neighbor] = node
                heapq.heappush(queue, (new_distance, neighbor))

    return distances, previous

def reconstruct_path(previous: Dict[str, Optional[str]], start: str, goal: str):
    path = []
    current: Optional[str] = goal
    while current is not None:
        path.append(current)
        current = previous.get(current)
    path.reverse()
    if not path or path[0] != start:
        return []  # goal was unreachable from start
    return path
```

I usually add that optional goal parameter because it lets me stop early if I only need a single destination. That early exit makes Dijkstra behave more like a target search, which is a small but real performance win on large graphs. The cost is that I no longer have distances for every node, so I only enable it when I know I will not need those extra answers.
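To see the whole flow on one concrete input, here is a compact, self-contained demo. It restates a minimal version of the loop so the snippet runs on its own, and the toy graph is made up for illustration:

```python
import heapq

def dijkstra(graph, start):
    # Minimal restatement of the loop above, for a quick demo.
    dist, prev = {start: 0.0}, {start: None}
    pq, seen = [(0.0, start)], set()
    while pq:
        d, node = heapq.heappop(pq)
        if node in seen:
            continue
        seen.add(node)
        for nbr, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr], prev[nbr] = nd, node
                heapq.heappush(pq, (nd, nbr))
    return dist, prev

graph = {
    "A": [("B", 1.0), ("C", 4.0)],
    "B": [("C", 2.0), ("D", 5.0)],
    "C": [("D", 1.0)],
    "D": [],
}
dist, prev = dijkstra(graph, "A")

# Rebuild the A -> D path by walking predecessor pointers backward.
path, cur = [], "D"
while cur is not None:
    path.append(cur)
    cur = prev[cur]
path.reverse()
print(path, dist["D"])  # ['A', 'B', 'C', 'D'] 4.0
```

Note that the greedy direct hop A → C (cost 4.0) loses to the relaxed route through B, which is exactly the behavior the wavefront model predicts.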

The correctness guarantee in one sentence

The key idea I explain to teammates is this: because every edge weight is non-negative, the first time Dijkstra removes a node from the priority queue, we have already found the shortest path to it. That single sentence is what makes the algorithm easy to trust. If you ever introduce negative weights, that guarantee breaks and you need a different algorithm.

Dijkstra in weighted grids and turn-aware routing

Real systems often convert a grid or roadmap into a graph where moving straight has one cost and turning has another. I add turn penalties by encoding orientation into the node state, effectively expanding each grid cell into multiple nodes (north, east, south, west). Dijkstra’s algorithm remains valid on this expanded graph, just slower, because there are up to four times as many nodes. The benefit is that it captures the true cost of motion, which matters a lot for robots and vehicles. Dijkstra is still reliable here because the expanded graph still has non-negative weights.
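A minimal sketch of that orientation-expanded state space, with illustrative (made-up) straight and turn costs, and a plain Dijkstra run over the expanded states:

```python
import heapq

HEADINGS = {"N": (-1, 0), "E": (0, 1), "S": (1, 0), "W": (0, -1)}
STRAIGHT_COST = 1.0
TURN_COST = 0.5  # illustrative penalty; tune per vehicle

def neighbors(state, rows, cols):
    """Yield (next_state, cost) for a (row, col, heading) state.

    Moving forward costs STRAIGHT_COST; rotating in place costs TURN_COST.
    """
    r, c, h = state
    dr, dc = HEADINGS[h]
    nr, nc = r + dr, c + dc
    if 0 <= nr < rows and 0 <= nc < cols:
        yield (nr, nc, h), STRAIGHT_COST
    for other in HEADINGS:
        if other != h:
            yield (r, c, other), TURN_COST

def grid_dijkstra(start, rows, cols):
    # Standard Dijkstra; only the node type changed to (row, col, heading).
    dist = {start: 0.0}
    pq, seen = [(0.0, start)], set()
    while pq:
        d, s = heapq.heappop(pq)
        if s in seen:
            continue
        seen.add(s)
        for ns, cost in neighbors(s, rows, cols):
            nd = d + cost
            if nd < dist.get(ns, float("inf")):
                dist[ns] = nd
                heapq.heappush(pq, (nd, ns))
    return dist

dist = grid_dijkstra((0, 0, "E"), rows=2, cols=2)
best = min(dist[(1, 1, h)] for h in HEADINGS)
print(best)  # 2.5 with these costs: move east, turn south, move south
```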

A* search as goal-directed guidance

A* is my choice when I only care about one target and I have a decent heuristic. It still explores the graph using a priority queue, but instead of prioritizing nodes by the distance from the start alone, it prioritizes by f(n) = g(n) + h(n), where g(n) is the known cost from the start and h(n) is a heuristic estimate of the remaining cost to the goal. That heuristic is the steering wheel that nudges the search toward the destination.

If the heuristic never overestimates the true remaining cost, then A* is optimal: it finds the shortest path just like Dijkstra does. If the heuristic is also consistent (also called monotonic), then A* has the same nice property that once a node is finalized, its shortest cost is fixed. In practice, I design heuristics to be admissible first, then I check consistency as a correctness guard.

```python
import heapq
from typing import Callable, Dict, List, Optional, Tuple

Graph = Dict[str, List[Tuple[str, float]]]
Heuristic = Callable[[str, str], float]

def a_star(graph: Graph, start: str, goal: str, heuristic: Heuristic):
    open_queue = [(heuristic(start, goal), 0.0, start)]
    g_score = {start: 0.0}
    previous: Dict[str, Optional[str]] = {start: None}
    closed = set()

    while open_queue:
        _, current_g, node = heapq.heappop(open_queue)
        if node in closed:
            continue  # stale queue entry; skip it
        if node == goal:
            return g_score, previous
        closed.add(node)
        for neighbor, weight in graph.get(node, []):
            if weight < 0:
                raise ValueError("A* requires non-negative weights")
            tentative_g = current_g + weight
            if tentative_g < g_score.get(neighbor, float("inf")):
                g_score[neighbor] = tentative_g
                previous[neighbor] = node
                f_score = tentative_g + heuristic(neighbor, goal)
                heapq.heappush(open_queue, (f_score, tentative_g, neighbor))

    return g_score, previous
```

When I teach this, I emphasize that A* is Dijkstra plus a hint. If the hint is zero everywhere, A* and Dijkstra are identical. If the hint is good, A* explores far fewer nodes and feels dramatically faster. If the hint is weak but still admissible, the result is still correct, just slower. If the hint ever overestimates, I lose the optimality guarantee and might return a suboptimal path.

A tiny heuristic example

On a square grid where you can move only up, down, left, or right, I use Manhattan distance as the heuristic because each move changes the sum of row and column distance by at most one. On a grid with diagonal moves, I use octile or Euclidean distance. For a road network, I use straight-line distance between coordinates multiplied by the fastest possible speed. The common theme is that I pick a heuristic that never promises a cheaper path than actually exists.
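Those grid heuristics are short enough to write out. A sketch assuming (row, col) coordinate tuples; the admissibility conditions in the comments are the assumptions each one relies on:

```python
import math

def manhattan(a, b):
    # Admissible on 4-connected grids where each step costs 1.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def euclidean(a, b):
    # Admissible whenever no path can be shorter than the straight line.
    return math.hypot(a[0] - b[0], a[1] - b[1])

def octile(a, b):
    # Admissible on 8-connected grids: straight moves cost 1, diagonals sqrt(2).
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return max(dx, dy) + (math.sqrt(2) - 1) * min(dx, dy)

print(manhattan((0, 0), (3, 4)))  # 7
print(euclidean((0, 0), (3, 4)))  # 5.0
```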

Dijkstra vs A*: side-by-side view

Here is the mental table I keep on a sticky note. It is not about which algorithm is better, but about which question I am asking.

| Dimension | Dijkstra | A* |
| --- | --- | --- |
| Primary goal | Shortest paths to all nodes | Shortest path to one goal |
| Priority key | Distance from start (g) | g + heuristic (f) |
| Optimality guarantee | Yes, with non-negative weights | Yes, if heuristic admissible |
| Typical expansions | Large wavefront | Narrow cone toward goal |
| Best use case | Routing tables, multi-target dispatch | Navigation to a known goal |
| Sensitivity to heuristic | None | High |
| Early exit | Optional | Built in |

The table hides a subtle point: Dijkstra’s algorithm makes no assumptions about geometry or coordinates, while A* usually benefits from them. That is why in abstract graphs without coordinates I lean toward Dijkstra or I invest time to build a heuristic from graph properties like landmarks or precomputed potentials.

How to design a heuristic that helps (not hurts)

Heuristics are both a gift and a trap. A good heuristic cuts the search space dramatically; a bad one makes A* behave like Dijkstra with extra overhead. I design heuristics by thinking about lower bounds on the remaining cost. If I can prove a lower bound, I can use it confidently.

For grids and maps, the standard heuristics are easy: Manhattan, Euclidean, and octile distance. For weighted road networks, I usually take straight-line distance and divide by the maximum speed limit, which yields a lower bound on time. For energy-aware routing, I might take distance times a minimum energy-per-meter coefficient. If I cannot justify a global bound, I switch to data-driven heuristics based on landmarks.

Landmark heuristics (often called ALT: A*, landmarks, triangle inequality) precompute distances from a set of reference nodes. For any node n and goal g, I can derive a lower bound on distance using triangle inequality. This is expensive upfront, but if I have to answer many path queries, it is worth it. I used this in a fleet routing system where thousands of queries hit the same network each hour.
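A minimal sketch of the ALT lower bound, assuming an undirected graph (directed graphs need separate forward and backward distance tables). The toy line graph is made up for illustration:

```python
import heapq

def dijkstra_all(graph, source):
    # One full Dijkstra pass per landmark, computed offline.
    dist = {source: 0.0}
    pq = [(0.0, source)]
    while pq:
        d, node = heapq.heappop(pq)
        if d > dist.get(node, float("inf")):
            continue  # stale entry
        for nbr, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(pq, (nd, nbr))
    return dist

def alt_heuristic(landmark_dists, node, goal):
    # Triangle inequality: |d(L, goal) - d(L, node)| <= d(node, goal),
    # so the max over landmarks is an admissible lower bound.
    best = 0.0
    for dist in landmark_dists:
        if node in dist and goal in dist:
            best = max(best, abs(dist[goal] - dist[node]))
    return best

# Undirected line graph A -1- B -2- C -3- D, with landmark A.
graph = {
    "A": [("B", 1.0)],
    "B": [("A", 1.0), ("C", 2.0)],
    "C": [("B", 2.0), ("D", 3.0)],
    "D": [("C", 3.0)],
}
landmarks = [dijkstra_all(graph, "A")]
print(alt_heuristic(landmarks, "B", "D"))  # 5.0, the exact distance here
```

On a line graph the bound is tight; on richer graphs it is merely a lower bound, which is all admissibility requires.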

I also tune heuristics with scaling. If I multiply an admissible heuristic by a factor greater than 1, I get weighted A*. This is no longer optimal but can be much faster. I deploy this only when a slightly longer path is acceptable and I have explicit product buy-in. It is a powerful knob: more speed, less optimality.
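That scaling knob is a one-line wrapper around any admissible heuristic. A sketch; the helper name and the factor value here are made up:

```python
def weighted(h, epsilon):
    # Inflating an admissible heuristic by epsilon >= 1 trades optimality
    # for speed: the returned path costs at most epsilon times the optimum.
    def inflated(node, goal):
        return epsilon * h(node, goal)
    return inflated

base = lambda a, b: abs(b - a)   # any admissible heuristic works here
wh = weighted(base, 1.5)         # accept paths up to 1.5x the optimum
print(wh(0, 10))  # 15.0
```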

Edge cases and correctness traps

This is the section I wish every engineer read before deploying.

1) Negative weights: Dijkstra and A* both break if any edge weight is negative. If you are modeling rebates or credits, reframe the model or use Bellman–Ford. I always validate weights at runtime in development and in tests.

2) Zero-weight edges: These are allowed, but they can create large plateaus where the priority queue has many equal keys. If tie-breaking is sloppy, your search becomes noisy. I tend to break ties by preferring larger g values in A* so that the search hugs the heuristic direction rather than expanding outward.
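One way to implement that tie-break is a negated g as a secondary heap key, so equal-f entries pop deepest-first. A toy sketch with made-up labels:

```python
import heapq

# heapq pops the smallest tuple, so storing -g as the second element
# makes the entry with the LARGER g win ties on f.
open_queue = []
heapq.heappush(open_queue, (10.0, -4.0, "closer_to_goal"))   # f=10, g=4
heapq.heappush(open_queue, (10.0, -1.0, "closer_to_start"))  # f=10, g=1
f, neg_g, node = heapq.heappop(open_queue)
print(node)  # closer_to_goal
```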

3) Disconnected graphs: If there is no path, both algorithms return incomplete distance maps. I treat this as a normal outcome and return a clear empty path rather than raising an exception.

4) Floating-point precision: When distances are very small or very large, float comparisons can wobble. I mitigate by using a small epsilon when comparing and by avoiding accumulation over extremely long paths without normalization.

5) Duplicate queue entries: The priority queue does not delete outdated entries. That is why I track a visited set or compare against the latest g_score before expanding. Skipping this will produce correctness bugs and performance regressions.

6) Coordinate mismatch: On a grid, if you choose Manhattan distance but allow diagonal moves, your heuristic can overestimate and you silently break optimality. I always align the heuristic to the allowed movement model.

Performance considerations in the real world

Textbook complexity is only a starting point. The real cost is in node expansions, memory pressure, and priority queue operations. In production graphs with millions of nodes, I focus on three levers.

First, reduce expansions. A* excels when the heuristic is informative, so I invest in a strong lower bound rather than micro-optimizing heap operations. On dense graphs without geometry, Dijkstra is often competitive because any heuristic would be weak.

Second, choose the right data structures. A binary heap is fine for most use cases, but I have seen 20–40 percent speedups by switching to pairing heaps or custom priority queues when the graph is huge and the system is CPU bound. The trade-off is complexity and maintenance cost, so I only do this for hotspot services.

Third, manage memory. Both algorithms store distance maps and predecessor links, and A* also stores heuristic values. On very large graphs, I sometimes store only the necessary slice of data, or I compress node IDs into integers and use arrays for tight memory. If memory is a hard cap, A* with a strong heuristic can win by expanding fewer nodes.

The performance delta is usually a range, not a constant. In my measurements on navigation-like graphs, A* can expand a small fraction of the nodes Dijkstra would, often an order of magnitude fewer. But on graphs where the heuristic is weak, A* is only marginally better or even slower because of the extra computation per node. That is why I benchmark both on realistic data before committing.

Practical scenarios I use to decide

When I choose between Dijkstra and A*, I ask what the product actually needs.

  • Robotics navigation: I almost always use A* with a geometry-based heuristic, often on a grid with turn penalties. The goal is known, the heuristic is strong, and the search space is huge.
  • Network routing tables: I use Dijkstra because I need shortest paths to many destinations from a single source. The overhead of running A* for each destination is higher than a single Dijkstra run.
  • Multiplayer game NPCs: A* is the default, but if I need to compute paths to many targets for multiple agents, I sometimes compute a Dijkstra tree from a key hub node to reuse across agents.
  • Service dependency graphs: If the graph is abstract and has no geometry, Dijkstra is simpler and safer. A* needs a heuristic, and inventing one can lead to subtle bugs.
  • Data pipeline scheduling: When the graph represents costs and dependencies, I choose Dijkstra for correctness and transparency, unless I have a well-justified heuristic from domain rules.

Alternatives and hybrids worth knowing

Dijkstra and A* are not the whole story. I keep a few alternatives in mind so I do not overfit to the classics.

  • BFS and 0–1 BFS: If all edges have the same cost, BFS is faster and simpler. If costs are only 0 or 1, 0–1 BFS is a great middle ground.
  • Bellman–Ford: This handles negative weights and can detect negative cycles. It is slower, but it is the correct choice when rebates or penalties create negative edges.
  • Johnson’s algorithm: This lets me compute all-pairs shortest paths on sparse graphs by reweighting edges and running Dijkstra multiple times.
  • Bidirectional Dijkstra and bidirectional A*: These run searches from both the start and the goal and meet in the middle. They can cut work roughly in half, but correctness is trickier and requires careful termination conditions.
  • D* Lite and Lifelong Planning A*: These are for dynamic graphs where edges can change while the agent moves. I use them in robotics when the map updates in real time.
  • Contraction hierarchies and hub labeling: These are heavy precompute methods for routing on massive road networks. They provide extremely fast queries, but the upfront work and storage are significant.
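Of the list above, 0–1 BFS is the simplest to sketch: a deque replaces the priority queue, with weight-0 edges going to the front and weight-1 edges to the back. The toy graph is made up:

```python
from collections import deque

def zero_one_bfs(graph, start):
    """Shortest distances when every edge weight is 0 or 1.

    Pushing weight-0 neighbors to the front and weight-1 neighbors to the
    back keeps the deque sorted by distance, with no heap overhead.
    """
    dist = {start: 0}
    dq = deque([start])
    while dq:
        node = dq.popleft()
        for nbr, w in graph.get(node, []):
            nd = dist[node] + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                if w == 0:
                    dq.appendleft(nbr)
                else:
                    dq.append(nbr)
    return dist

graph = {"A": [("B", 0), ("C", 1)], "B": [("C", 0)], "C": []}
print(zero_one_bfs(graph, "A"))  # {'A': 0, 'B': 0, 'C': 0}
```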

Testing and monitoring in production

I treat pathfinding as a product feature with its own observability. In testing, I generate graphs that match expected density and weight distribution, then I compare algorithms on metrics like nodes expanded, max queue size, and path length. I record a small suite of golden queries and check that both algorithms agree on path cost when the heuristic should be admissible.
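A minimal, self-contained version of that agreement check (the toy graph and coordinates are made up; passing a zero heuristic turns the search into Dijkstra, so both variants should report the same cost):

```python
import heapq

def shortest_cost(graph, start, goal, h=lambda n, g: 0.0):
    # h == 0 everywhere gives Dijkstra; an admissible h gives A*.
    g_score = {start: 0.0}
    pq = [(h(start, goal), start)]
    seen = set()
    while pq:
        _, node = heapq.heappop(pq)
        if node in seen:
            continue
        if node == goal:
            return g_score[node]
        seen.add(node)
        for nbr, w in graph.get(node, []):
            nd = g_score[node] + w
            if nd < g_score.get(nbr, float("inf")):
                g_score[nbr] = nd
                heapq.heappush(pq, (nd + h(nbr, goal), nbr))
    return float("inf")

graph = {"A": [("B", 2.0), ("C", 1.0)], "B": [("C", 1.0)], "C": []}
coords = {"A": 0.0, "B": 2.0, "C": 3.0}
h = lambda n, g: abs(coords[g] - coords[n])  # admissible on this layout

# Golden-query check: both variants must agree on the cost.
print(shortest_cost(graph, "A", "C"), shortest_cost(graph, "A", "C", h))
```

If the two numbers ever diverge, the heuristic is overestimating somewhere, which is exactly the bug class this check is designed to catch.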

In production, I log summary metrics instead of full paths: algorithm used, runtime, expanded nodes, and whether the path was found. I set timeouts and fallback strategies. For example, if A* exceeds a time budget, I might drop to a simpler heuristic or return the best-known partial path. I also store a small sample of routes for offline inspection so I can detect heuristic bugs before they impact customers.

Modern tooling helps here. I use profiling and flame graphs to confirm whether the bottleneck is the priority queue or the neighbor expansion. I also use AI assistants to generate synthetic graphs and stress cases, which is a fast way to find performance cliffs. But I never trust synthetic results alone; I always replay real traffic traces to validate behavior.

Common pitfalls checklist

Here is the checklist I keep in my code reviews:

  • Ensure all edge weights are non-negative.
  • Ensure the heuristic never overestimates.
  • Reconstruct paths from a predecessor map, not from distances.
  • Handle unreachable goals gracefully.
  • Avoid mutating the graph while searching.
  • Treat the priority queue as append-only and skip stale entries.
  • Align heuristic to movement model (grid, diagonal, turn costs).
  • Benchmark with real-world graphs before committing.

Summary: the mental model I keep

When I strip it down, I use Dijkstra when I need a complete map of shortest distances from one source, and I use A* when I have a single target and a trustworthy heuristic. Dijkstra is the reliable floodlight: it shines everywhere and tells me the shortest distances with no extra assumptions. A* is the laser pointer: it uses a heuristic to aim straight at the goal, often with dramatic savings. If the heuristic is wrong, the laser misses; if it is right, the search feels almost magical.

In practice I prototype both, benchmark with realistic data, and then pick the one that matches the product’s needs and risk tolerance. The algorithms are simple, but the choices around them—heuristics, data structures, and constraints—are where the real engineering happens. If you internalize that, the difference between Dijkstra and A* stops being academic and starts being a practical decision you can defend.
