Breadth-first search: look for solution nodes at depth k before moving to depth k+1.
Preorder, inorder and postorder tree traversals all have something in common: they are depth-first. At each node, the left subtree is entirely traversed before the right subtree. Depth-first traversals are easy to code and can be efficient, but they are ill-suited for some problems.
Suppose the tree represents the possible moves in a puzzle, and the purpose of the traversal is to search for a node containing a solution. Then a depth-first traversal may find one solution node deep in the left subtree, when another solution is at the very top of the right subtree. Often we want the shortest path to a solution.
Suppose the tree is infinite or simply extremely large. Depth-first search is almost useless with such trees, for if the left subtree is infinite then the search will never reach the right subtree. OCaml can represent infinite trees by the means discussed in the lecture on laziness. Another tree representation (suitable for solving solitaire, for example) is by a function next : pos -> pos list, which maps a board position to a list of the positions possible after the next move. For simplicity, the examples below use the OCaml datatype tree, which has only finite trees.
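The code below pattern-matches on the constructors Lf and Br. As a reminder, a minimal declaration of that datatype (assumed here, following earlier lectures) is:
# type 'a tree =
  | Lf
  | Br of 'a * 'a tree * 'a tree;;
type 'a tree = Lf | Br of 'a * 'a tree * 'a tree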
A breadth-first traversal explores the nodes horizontally rather than vertically. When visiting a node, it does not traverse the subtrees until it has visited all other nodes at the current depth. This is easily implemented by keeping a list of trees to visit. Initially, this list consists of one element: the entire tree. Each iteration removes a tree from the head of the list and adds its subtrees after the end of the list.
# let rec nbreadth = function
| [] -> []
| Lf :: ts -> nbreadth ts
| Br (v, t, u) :: ts ->
v :: nbreadth (ts @ [t; u]);;
val nbreadth : 'a tree list -> 'a list = <fun>
The naive algorithm keeps an enormous queue of nodes awaiting search, and its use of append is wasteful.
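For instance, applying nbreadth to a singleton list holding a small tree (a sketch, assuming the tree datatype above) yields the labels level by level:
# nbreadth [Br (1, Br (2, Lf, Lf), Br (3, Lf, Lf))];;
- : int list = [1; 2; 3]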
Breadth-first search can be inefficient, and this naive implementation especially so. When the search is at depth d of the tree, the list contains all the remaining trees at depth d, followed by the subtrees (all at depth d+1) of the trees that have already been visited. At depth 10, the list could already contain 1024 elements. It requires a lot of space, and aggravates this with a gross misuse of append. Evaluating ts @ [t; u] copies the long list ts just to insert two elements.
qempty is the empty queue
qnull tests whether a queue is empty
qhd returns the element at the head of a queue
deq discards the element at the head of a queue
enq adds an element at the end of a queue

Breadth-first search becomes much faster if we replace the lists by queues. A queue represents a sequence, allowing elements to be taken from the head and added to the tail. This is a First-In-First-Out (FIFO) discipline: the item next to be removed is the one that has been in the queue for the longest time. Lists can implement queues, but append is a poor means of adding elements to the tail.
Our functional arrays are suitable, provided we augment them with a function to delete the first array element. (See ML for the Working Programmer, page 156.) Each operation would take O(\log n) time for a queue of length n.
We shall describe a representation of queues that is purely functional, based upon lists, and efficient. Operations take O(1) time when “amortized”: averaged over the lifetime of a queue.
A conventional programming technique is to represent a queue by an array. Two indices point to the front and back of the queue, which may wrap around the end of the array. The coding is somewhat tricky. Worse, the length of the queue must be given a fixed upper bound.
Represent the queue x_1, x_2, \ldots, x_m, y_n, \ldots, y_1 by any pair of lists of the form ([x_1, x_2, \ldots, x_m], [y_1, y_2, \ldots, y_n]); all operations then take O(1) amortized time.
Queues require efficient access at both ends: at the front, for removal, and at the back, for insertion. Ideally, access should take constant time, O(1). It may appear that lists cannot provide such access. If enq(q, x) performs q @ [x], then this operation will be O(n). We could represent queues by reversed lists, implementing enq(q, x) by x::q, but then the deq and qhd operations would be O(n). Linear time is intolerable: a series of n queue operations could then require O(n^2) time.
The solution is to represent a queue by a pair of lists, where
([x_1, x_2, \ldots, x_m], [y_1, y_2, \ldots, y_n])
represents the queue x_1, x_2, \ldots, x_m, y_n, \ldots, y_1.
The front part of the queue is stored in order, and the rear part is stored in reverse order. The enq operation adds elements to the rear part using cons, since this list is reversed; thus, enq takes constant time. The deq and qhd operations look at the front part, which normally takes constant time, since this list is stored in order. But sometimes deq removes the last element from the front part; when this happens, it reverses the rear part, which becomes the new front part.
Amortized time refers to the cost per operation averaged over the lifetime of any complete execution. Even for the worst possible execution, the average cost per operation turns out to be constant; see the analysis below.
# type 'a queue =
| Q of 'a list * 'a list;;
type 'a queue = Q of 'a list * 'a list
# let norm = function
| Q ([], tls) -> Q (List.rev tls, [])
| q -> q;;
val norm : 'a queue -> 'a queue = <fun>
# let qnull = function
| Q ([], []) -> true
| _ -> false;;
val qnull : 'a queue -> bool = <fun>
# let enq (Q (hds, tls)) x = norm (Q (hds, x::tls));;
val enq : 'a queue -> 'a -> 'a queue = <fun>
# exception Empty;;
exception Empty
# let deq = function
| Q (_ :: hds, tls) -> norm (Q (hds, tls))
| _ -> raise Empty;;
val deq : 'a queue -> 'a queue = <fun>
# let qempty = Q ([], []);;
val qempty : 'a queue = Q ([], [])
# let qhd = function
| Q (x::_, _) -> x
| _ -> raise Empty;;
val qhd : 'a queue -> 'a = <fun>
The datatype of queues prevents confusion with other pairs of lists. The empty queue has both parts empty.
The function norm puts a queue into normal form, ensuring that the front part is never empty unless the entire queue is empty. Functions deq and enq call norm to normalise their result. Because queues are in normal form, their head is certain to be in their front part, so qhd looks there.
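A short session (a sketch; the responses follow from the definitions above) shows normalisation at work: the front part always holds the head of the queue, and deq reverses the rear part once the front runs out.
# let q = enq (enq (enq qempty 1) 2) 3;;
val q : int queue = Q ([1], [3; 2])
# qhd q;;
- : int = 1
# deq q;;
- : int queue = Q ([2; 3], [])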
Let us analyse the cost of an execution comprising (in any possible order) n enq operations and n deq operations, starting with an empty queue. Each enq operation will perform one cons, adding an element to the rear part. Since the final queue must be empty, each element of the rear part gets transferred to the front part. The corresponding reversals perform one cons per element. Thus, the total cost of the series of queue operations is 2n cons operations, an average of 2 per operation. The amortized time is O(1).
There is a catch. The conses need not be distributed evenly; reversing a long list could take up to n-1 of them. Unpredictable delays make the approach unsuitable for real-time programming, where deadlines must be met.
# let rec breadth q =
if qnull q then []
else
match qhd q with
| Lf -> breadth (deq q)
| Br (v, t, u) -> v :: breadth (enq (enq (deq q) t) u);;
val breadth : 'a tree queue -> 'a list = <fun>
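As a quick check (a sketch, assuming the tree datatype above), breadth visits the same labels in the same order as nbreadth; the initial queue is built with enq:
# breadth (enq qempty (Br (1, Br (2, Lf, Lf), Br (3, Lf, Lf))));;
- : int list = [1; 2; 3]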
This function implements the same algorithm as nbreadth but uses a different data structure: it represents queues using type queue instead of type list.
To compare their efficiency, I applied both functions to the full binary tree of depth 12, which contains 4095 labels. The function nbreadth took 30 seconds while breadth took only 0.15 seconds: faster by a factor of 200.
For larger trees, the speedup would be greater. Choosing the right data structure pays handsomely.
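For reference, one way to generate such a test tree (a sketch; the function full and its labelling are illustrative, not the original benchmark code):
# let rec full d =
    if d = 0 then Lf
    else Br (d, full (d - 1), full (d - 1));;
val full : int -> int tree = <fun>
# List.length (breadth (enq qempty (full 12)));;
- : int = 4095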
Breadth-first search examines O(b^d) nodes:

1 + b + \cdots + b^d = {b^{d+1}-1 \over b-1} \qquad \begin{array}[c]{r@{}l} b & {} = \hbox{branching factor}\\ d & {} = \hbox{depth} \end{array}

Iterative deepening recomputes the nodes at depth d instead of storing them. This costs an extra factor of b/(b-1) if b>1, so the time complexity is still O(b^d), while the space requirement drops from b^d to d.
Breadth-first search is not practical for infinite trees: it uses too much space. Large parts of the tree have to be stored. Consider the slightly more general problem of searching trees whose branching factor is b (for binary trees, b=2). Then breadth-first search to depth d examines (b^{d+1}-1)/(b-1) nodes, which is O(b^d), ignoring the constant factor of b/(b-1). Since all nodes that are examined are also stored, the space and time requirements are both O(b^d).
Depth-first iterative deepening combines the space efficiency of depth-first search with the “nearest-first” property of breadth-first search. It performs repeated depth-first searches with increasing depth bounds, each time discarding the result of the previous search. Thus it searches to depth 1, then to depth 2, and so on until it finds a solution. We can afford to discard previous results because the number of nodes grows exponentially. There are b^{d+1} nodes at level d+1; if b \geq 2, this number actually exceeds the total number of nodes of all previous levels put together, namely (b^{d+1}-1)/(b-1).
Korf shows that the time needed for iterative deepening to reach depth d is only b/(b-1) times that for breadth-first search, if b>1. This is a constant factor; both algorithms have the same time complexity, O(b^d). In typical applications where b \geq 2, the extra factor of b/(b-1) is quite tolerable. The reduction in the space requirement is exponential, from O(b^d) for breadth-first to O(d) for iterative deepening. Of course, this assumes that the tree itself is not stored in memory.
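The notes give no code for iterative deepening, but a minimal sketch for the finite datatype tree might look as follows (the names dfs_bounded and deepen are illustrative; note that deepen loops forever if no label satisfies p):
# let rec dfs_bounded p d t =
    match t with
    | Lf -> None
    | Br (v, t1, t2) ->
        if p v then Some v            (* found a solution node *)
        else if d = 0 then None       (* depth bound reached *)
        else
          (match dfs_bounded p (d - 1) t1 with
           | Some v -> Some v
           | None -> dfs_bounded p (d - 1) t2);;
val dfs_bounded : ('a -> bool) -> int -> 'a tree -> 'a option = <fun>
# let rec deepen p d t =
    (* search to depth d; if that fails, discard the result and go deeper *)
    match dfs_bounded p d t with
    | Some v -> Some v
    | None -> deepen p (d + 1) t;;
val deepen : ('a -> bool) -> int -> 'a tree -> 'a option = <fun>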
empty is the empty stack
null tests whether a stack is empty
top returns the element at the top of a stack
pop discards the element at the top of a stack
push adds an element at the top of a stack

A stack is a sequence such that items can be added or removed from the head only. A stack obeys a Last-In-First-Out (LIFO) discipline: the item next to be removed is the one that has been in the stack for the shortest time. Lists can easily implement stacks because both cons and hd affect the head. But unlike lists, stacks are often regarded as an imperative data structure: the effect of push or pop is to change an existing stack, not return a new one.
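In a functional style, lists implement these operations directly; a sketch (reusing the exception Empty declared above), where each operation returns a result rather than updating the stack in place:
# let empty = [];;
val empty : 'a list = []
# let null = function [] -> true | _ -> false;;
val null : 'a list -> bool = <fun>
# let top = function [] -> raise Empty | x :: _ -> x;;
val top : 'a list -> 'a = <fun>
# let pop = function [] -> raise Empty | _ :: s -> s;;
val pop : 'a list -> 'a list = <fun>
# let push s x = x :: s;;
val push : 'a list -> 'a -> 'a list = <fun>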
In conventional programming languages, a stack is often implemented by storing the elements in an array, using a variable (the “stack pointer”) to count them. Most language processors keep track of recursive function calls using an internal stack.
The data structure determines the search!
Search procedures can be classified by the data structure used to store pending subtrees. Depth-first search stores them on a stack, which is implicit in functions like inorder, but can be made explicit. Breadth-first search stores such nodes in a queue.
An important variation is to store the nodes in a priority queue, which is an ordered sequence. The priority queue applies some sort of ranking function to the nodes, placing higher-ranked nodes before lower-ranked ones. The ranking function typically estimates the distance from the node to a solution. If the estimate is good, the solution is located swiftly. This method is called best-first search.
The priority queue can be kept as a sorted list, although this is slow. Binary search trees would be much better on average, and fancier data structures improve matters further.
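As a sketch of best-first search (the names insert, rank, and best_first are illustrative): keeping the priority queue as a sorted list means each insertion is linear, but removing the highest-ranked node is immediate. Here rank scores a subtree, lower being better:
# let rec insert rank t = function
    | [] -> [t]
    | u :: us ->
        if rank t <= rank u then t :: u :: us
        else u :: insert rank t us;;
val insert : ('a -> 'b) -> 'a -> 'a list -> 'a list = <fun>
# let rec best_first p rank = function
    | [] -> None
    | Lf :: ts -> best_first p rank ts
    | Br (v, t, u) :: ts ->
        if p v then Some v
        else best_first p rank (insert rank t (insert rank u ts));;
val best_first : ('a -> bool) -> ('a tree -> 'b) -> 'a tree list -> 'a option = <fun>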
Suppose that we have an implementation of queues, based on binary trees, such that each operation takes logarithmic time in the worst case. Outline the advantages and drawbacks of such an implementation compared with one presented above.
The traditional way to implement queues uses a fixed-length array. Two indices into the array indicate the start and end of the queue, which wraps around from the end of the array to the start. How appropriate is such a data structure for implementing breadth-first search?
Write a version of the function breadth using a nested let construction rather than match.
Iterative deepening is inappropriate if b \approx 1, where b is the branching factor. What search strategy is appropriate in this case?
Consider the following OCaml function.
let next n = [2 * n; 2 * n + 1]
If we regard it as representing a tree, where the subtrees are computed from the current label, what tree does next 1 represent?