Heuristic Function h(n)
A* algorithm uses a heuristic function h(n)
to estimate the cost from node n
to the goal.
Why h(n) is correct if it never overestimates the true cost?
- Admissible, meaning it never overestimates the true cost to reach the goal from node
n
. Therefore- Suppose P* is the optimal path from the start node to the goal node, and its cost is C*. When exploring any node
n
along this optimal path, the total estimated costf(n) = g(n) + h(n)
can be expressed asf(n) <= g(n) + h*(n)
, sinceh(n) <= h*(n)
, whereh*(n)
is the true cost fromn
to the goal. This implies thatf(n) <= C*
(sinceg(n) + h*(n) = C*
) Therefore, A* will never overlook the optimal path. - We need to prove that when
G
is first taken out from the priority queue, it is optimal.- When A* reaches
G
,h*(G) = 0
, sof(G) = g(G) + h(G) >= g(G) + h*(G) = g(G)
(sinceh(G) >= 0
). - Suppose there is a better path
P'
with costC' < g(G)
. Since we want to prove by contradiction thatG
is optimal, we can assume that there’s a noden
onP'
that is still in the priority queue whenG
is taken out. By the previous point,f(n) <= C' < g(G) <= f(G)
, which contradicts the fact thatG
is taken out first (since A* always takes the smaller one out first). Therefore,G
must be optimal.
- When A* reaches
- Suppose P* is the optimal path from the start node to the goal node, and its cost is C*. When exploring any node