Graph review

Graph: edges and vertices. Directed vs. undirected edges. Weighted vs. unweighted edges.

Graph representations:

Breadth-first search: finds shortest paths (in terms of path length) from a source node to as many other nodes as are reachable from it, by adding newly discovered nodes to a queue.

Depth-first search: finds paths (not necessarily shortest) from a source node to all reachable nodes. DFS has various useful properties that can expose the structure of the graph (e.g., whether or not it has cycles) while we are exploring it. Example:

Weighted graph algorithms

Weighted graphs have many physical applications: we can think of the edges as being (for example) roads between cities; then the weights become mileage. If the graph represents a network of pipes, then the weights might be the flow capacities of the pipes. Given this interpretation, we can ask for some operations that we might like to perform on weighted graphs:

We’re going to assume that nodes in our graphs have a structure like this:

struct node {
  float d;      // Shortest distance to this node
  node* parent; // Pointer to shortest-parent node 
};

Shortest path in a weighted graph

What if the edges have weights, how can we find the shortest path(s) in terms of edge weights? (The path weight of a path is just the sum of all the weights along it.)

First of all, when does the shortest path even exist? Example: (graph with neg weight cycles). Does the shortest path exist? No. Example (graph with neg edges, but no neg-weight cycles). Negative weight edges are OK, but neg-weight cycles are not. If a negative-weight cycle exists anywhere along a path from start to finish, then the “shortest” path does not exist (because we could always make it shorter, by taking another trip around the cycle). We’ll also assume in the following discussion that the shortest path from start to finish actually exists.

Assuming there are no negative-weight cycles, how would we find the shortest path from start to finish? Well, to start with, what is the longest length (in terms of number of edges) that the shortest path could be, in the worst case? It’s \(|V|-1\). Why? Because a path of that length visits every vertex exactly once. If we visit any vertex more than once, then we must have a cycle in our path, and without negative-weight cycles, any cycle can only make a path worse than it would be if we removed it. (Demo)

So in the worst case, the longest possible shortest-path has \(|V|\) nodes and \(|V|-1\) edges and no cycles. Let’s take a more local view.

For the starting node, what is the shortest path to the starting node itself? The empty path, which contains only the starting node and zero edges. (And note that the path-weight of this path is also 0.)

For any of the starting node’s neighbors, can we say that the shortest path is the one via the edge that connects them? Not necessarily:

      4
a ------> b
 \       ^
  \2    /1
   \   /
    V / 
     c

The direct path a -> b is longer, in terms of path weight, than the path a -> c -> b. So we can’t just set the distance to b to 4 and be done with it; we have to be prepared for the possibility that this value will be reduced by a later, better path. This is different from BFS, where once we set the distance to a node, it was fixed.

In order to allow for reducing the distance to a node, we use an operation called relax:

void relax(node& a, node& b, float weight) {
  if(a.d + weight < b.d)
    b.d = a.d + weight;
}

We are re-using the .d distance attribute from BFS. (Note that the nodes are passed by reference, so that the update to b.d persists.) The idea is that if there is an edge from a to b with weight weight, we check it against the current distance to b and see if using this edge would be better: if so, we adjust the distance to b. Note that a call to relax can only ever make a node’s distance smaller, never larger. This seemingly-simple fact will be important later. We’ll also initially set the distance to every node except the starting node to \(\infty\) (with the assumption that \(\infty\) can be used in comparisons like the one in relax).

Suppose we relax all the edges along the shortest path: does doing so set their distances to their final values? Yes, but only if we relax them in order, from start to finish. And besides, we don’t know what the shortest path is, so we can’t do that. What happens if we relax every edge in the graph? After doing this, at least one additional node (beyond the starting node) will now have its final distance, and, since a node’s distance can never drop below its true shortest-path weight and relax never increases it, this node is now “done”: we can do more relaxations without worrying about it.

Put another way, after running relax on all edges, at least one node adjacent to the starting node will have its final distance (it may be more than one).

What if we run another “relax all edges” cycle after this? If a shortest path exists, then another cycle will “lock in” the distance of another node. Thus, every time we “relax all edges”, another node gets its final distance. In the worst case, the shortest path has \(|V|\) nodes, and the first of these (the starting node) is already done, so it would take \(|V|-1\) “relax all edges” cycles to finalize the distance at the target node.

This brings us to the Bellman-Ford algorithm:

  1. Set the starting node’s distance to 0, and set every other node’s distance to \(\infty\).

  2. Repeat the following \(|V| - 1\) times:

  3. For every edge, relax that edge.

Note that the order in which we relax edges within a cycle does not matter, because we “relax all edges” enough times that even if we did them in a terrible order, we would still construct the shortest path.

With a simple extension, we can also detect, after the algorithm completes, whether there are any negative-weight cycles. (A nice feature of Bellman-Ford is that it will complete even if there are negative-weight cycles. The next algorithm we’ll look at, Dijkstra’s algorithm, will go into an infinite loop!)

  1. For each edge a -> b with weight weight, if a.d + weight < b.d then a negative weight cycle exists.

This just checks every edge to see if it’s possible for it to be relaxed even further. Remember that we argued above that the algorithm should have finalized the distances to all nodes if there were no negative-weight cycles. If there are, then it is not possible to finalize the distances, because any path that touches the negative-weight cycle could always be made “shorter”.

Note that we can also easily determine if a shortest path exists: if the target node’s distance is still \(\infty\) after the algorithm completes, then no shortest path exists.

If we want to actually find the shortest path itself, and not just its path length, this can be accomplished with an easy extension to relax:

void relax(node& a, node& b, float weight) {
  if(a.d + weight < b.d) {
    b.d = a.d + weight;
    b.parent = &a;
  }
}

This saves a pointer to the node that we came from in b.parent. The last time the if body is executed will both “lock in” the final distance and save a pointer to the parent node that lies along the shortest path back toward the starting node. After completion, every node will have both its distance from the starting node, and a pointer to a node that leads backwards along the shortest path to the start. (Thus, Bellman-Ford actually finds all shortest paths from the starting node to every other node.)

What is the complexity of this algorithm? It’s easy to determine, because the two loops are independent of each other: \(O(|V| \times |E|)\). (This is assuming the adjacency list representation, where we can iterate over all the edges in \(O(|E|)\) time. In the adjacency matrix, iterating over all edges takes \(O(|V|^2)\) time, which means that the total runtime is actually \(O(|V|^3)\)!)

An example of running the Bellman-Ford algorithm:

Dijkstra’s algorithm

Bellman-Ford is an exhaustive algorithm: it basically tries everything, which allows it to work (or at least, terminate) even in the presence of negative-weight cycles. We can do better if we assume that there are no negative-weight cycles and take a greedy approach. This means that instead of trying everything, we’re going to give priority to the option that “looks” the best. In this case, the “best” means, smallest distance: the next node to be processed will be the node with the smallest total distance.

Dijkstra’s algorithm is similar to breadth-first search in that it uses a queue, however, it uses a min-priority queue, where items are enqueued with a priority, and the item with the smallest priority is the next one to be dequeued. The priority we’ll use is the node’s current distance estimate. Note that when we relax an edge, this might change the distance to the node on the other end, which in turn might change its position within the heap!

A review of min-heaps:

When running Dijkstra’s algorithm, every time through the main loop we “lock in” another node. In order for this to work, we must process nodes in order of increasing distance from the start, which is exactly the order the min-priority queue gives us.

  1. Set the distance to all nodes other than the start to \(\infty\). Set the starting node’s distance to 0.

  2. Enqueue all nodes. (Note that the only node with a non-infinite distance will be the starting node, so it will be the first node to be dequeued.)

  3. While the queue is not empty:

  4. Let a = q.extract_min()

  5. Relax every outbound edge from a, updating queue priorities as needed

Example…

This algorithm is effectively the original BFS but with path weights substituted for the simple path length “distance” that that algorithm relied on. The same property applies here: At any given time, the “fringe” will be at a certain distance (total path weight) from the starting node, and the algorithm proceeds by expanding the fringe.

The complexity of Dijkstra’s algorithm is complicated by the fact that each time we dequeue an element, we update the distances of its neighbors, which may dramatically restructure the queue. Whenever we relax a node, reducing its distance, we must also use the heap operation reduce to adjust its position within the queue, which requires \(O(\log n)\) time in terms of the size of the queue. But the queue is always shrinking, because we remove one node from it each time through the loop, and new nodes are never added to it.

Interestingly, we actually have two choices as to how to implement the heap:

This assumes we are using an adj. list. Using an adj. matrix adds an extra \(|V|\) to find the neighbors of a node.

Note that Dijkstra’s algorithm does not reliably work if there are any negative edges (it may produce incorrect results even if there are no negative-weight cycles). Dijkstra’s algorithm fails in the presence of negative edges because it is greedy: it commits too early to what appears to be the shortest path. If there are no negative-weight edges, then any additional edges will at best leave the path weight unchanged (if the weight is 0), so it makes sense to choose the best option we have right now. But if there are negative edges, any amount of seemingly-bad choices right now could be completely undone by a large negative edge later on. Example:

Weighted undirected edge algorithms

We’ll only look at one algorithm on weighted undirected graphs: Prim’s algorithm for minimum spanning trees.

A minimum spanning tree is a subset of \(E\) that still connects all the vertices, has no cycles, and where the sum of all the edge weights is as small as possible. For example:

There are two general approaches to building a MST:

The first method requires maintaining a “forest” of separate trees which are only joined into a single tree at the very end, so we’ll look at the second method. This works by picking a starting node (because all nodes must be in the tree, it doesn’t matter which) and then repeatedly picking an edge to one of the tree’s neighbors to add to the tree.

The key idea is to imagine that we already have a partial MST: some vertices and edges that have already been selected to be “in” the MST. We are now looking at new edges to add to the tree. In order to add an edge, it must fulfill two criteria:

We call an edge “safe” if it fulfills both criteria. The MST algorithm can be thought of as just repeatedly finding and adding “safe” edges until a tree is formed.

The first criterion is easy, especially if we grow the tree by edges: we already know whether a node is in the tree, so when we add an edge we just have to make sure that one of its endpoints is in the tree, and the other is not.

The second criterion is more difficult to quantify. When does adding a (non-cycle-forming) edge preserve the partial-MST property? Let us look at all the edges that satisfy the first criterion: they touch one node inside the tree, and one node outside it. Example:

Which of the highlighted edges is “safe”? The one with the minimum weight of all the edges. We call an edge like this a light edge: it’s a major theorem of the spanning tree algorithm that light edges are always safe.

It might not be obvious why the minimum-weight edge would be safe. Couldn’t we run into a situation like in the shortest-path algorithm, where we think we have found the smallest weight, but later we discover a better option? As it turns out, no, this cannot happen. Remember that we assume that we already had a partial MST to start with: if we already have a partial MST, then growing it by a light edge is the correct choice. If we don’t have a partial MST, then growing it by a minimum-weight edge might be the wrong choice, but if we start with the smallest possible MST (a single node) and grow it safely every time, this situation will never arise.

This brings us to the implementation of Prim’s Algorithm, which, like Dijkstra’s, relies on the use of a min-priority queue to keep track of what to do next.

  1. Set the starting node’s priority to 0, set every other node’s priority to \(\infty\), and set the parent pointers of all other nodes to nullptr.

  2. Enqueue all nodes (again, noting that the starting node will be the first node to be dequeued).

  3. While the queue is not empty:

  4. Let a = q.extract_min()

  5. For every b which is adjacent to a and still in the queue:

  6. If weight(a,b) < b.d, set b.d = weight(a,b) and b.parent = a. (Note that because b.d is b’s priority, this will trigger a reduce heap operation.)

At any given time, the queue contains the nodes which are still outside the tree. A node’s “distance” is repurposed as the minimum weight of all edges that connect it to a node inside the tree. The parent pointers form the tree itself, by eventually connecting all nodes back to the starting node.

Example:

The runtime of Prim’s algorithm is \(O(|E| \log |V|)\), again assuming the adj. list representation.