A* Algorithm

Heuristic Function h(n)

A* algorithm uses a heuristic function h(n) to estimate the cost from node n to the goal.

Why h(n) is correct if it never overestimates the true cost?

  • Admissible, meaning it never overestimates the true cost to reach the goal from node n. Therefore
    • Suppose P* is the optimal path from the start node to the goal node, and its cost is C*. When exploring any node n along this optimal path, the total estimated cost f(n) = g(n) + h(n) can be expressed as f(n) <= g(n) + h*(n), since h(n) <= h*(n), where h*(n) is the true cost from n to the goal. This implies that f(n) <= C* (since g(n) + h*(n) = C*) Therefore, A* will never overlook the optimal path.
    • We need to prove that when G is first taken out from the priority queue, it is optimal.
      • When A* reaches G, h*(G) = 0, so f(G) = g(G) + h(G) >= g(G) + h*(G) = g(G) (since h(G) >= 0).
      • Suppose there is a better path P' with cost C' < g(G). Since we want to prove by contradiction that G is optimal, we can assume that there’s a node n on P' that is still in the priority queue when G is taken out. By the previous point, f(n) <= C' < g(G) <= f(G), which contradicts the fact that G is taken out first (since A* always takes the smaller one out first). Therefore, G must be optimal.