Graph review
Graph: edges and vertices. Directed vs. undirected edges. Weighted vs. unweighted edges.
Graph representations:
Adj. list: store an array of lists, one list for each vertex. The list stores pointers to the vertices that are adjacent (connected by outbound edges) to that one.
Adj. matrix: store an n by n bool matrix (where n is the number of vertices). To determine whether there is an edge from x to y, look in matrix[x][y].
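As a rough illustration of the two representations, the sketch below answers the same "is there an edge from x to y?" question both ways. The names AdjList, AdjMatrix, has_edge, and to_matrix are made up for this example, and it assumes vertices are numbered 0 to n-1 and stored as indices rather than as the pointers mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; vertices are numbered 0..n-1.
using AdjList = std::vector<std::vector<std::size_t>>;  // adj[x] = out-neighbors of x
using AdjMatrix = std::vector<std::vector<bool>>;       // m[x][y] = is there an edge x -> y?

// Adjacency list: scan x's list looking for y. Time: O(out-degree of x).
bool has_edge(const AdjList& adj, std::size_t x, std::size_t y) {
    assert(x < adj.size());
    for (std::size_t v : adj[x])
        if (v == y) return true;
    return false;
}

// Adjacency matrix: a single lookup. Time: O(1).
bool has_edge(const AdjMatrix& m, std::size_t x, std::size_t y) {
    assert(x < m.size() && y < m.size());
    return m[x][y];
}

// Build the n-by-n matrix that describes the same graph as the list.
AdjMatrix to_matrix(const AdjList& adj) {
    AdjMatrix m(adj.size(), std::vector<bool>(adj.size(), false));
    for (std::size_t x = 0; x < adj.size(); ++x)
        for (std::size_t y : adj[x])
            m[x][y] = true;
    return m;
}
```

The list uses space proportional to the number of edges, while the matrix always uses n times n entries, which is the trade-off that shows up again in the runtime analyses later on.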
Breadth-first search: finds shortest paths (in terms of path length, i.e., number of edges) from a source node to every node reachable from it, by adding newly discovered nodes to a queue.
Depth-first search: finds paths (not necessarily shortest) from a source node to all reachable nodes. DFS has various useful properties that can expose the structure of the graph (e.g., whether or not it has cycles) while we are exploring it. Example:
Weighted graph algorithms
Weighted graphs have many physical applications: we can think of the edges as being (for example) roads between cities, in which case the weights become mileage. If the graph represents a network of pipes, then the weights might be the flow capacity of a given pipe. Given this interpretation, there are some operations that we might like to perform on weighted graphs:
Find the shortest path from one vertex to another, where “shortest” is defined in terms of the path weight (i.e., the sum of all the weights of the edges in the path), rather than just the number of edges.
Find the maximal flow of a graph between one vertex and another, if we treat the weights as capacities.
Find a minimum spanning tree of an (undirected) weighted graph. The MST is a tree built from edges in the graph (i.e., a “subgraph”) that connects every vertex, where the sum of all the edge weights is as small as possible. The MST is useful because it links all the vertices together as cheaply as possible. Because it is a tree, there is exactly one simple path between any two vertices within it; note that this path is not necessarily the shortest path between them in the original graph.
We’re going to assume that nodes in our graphs have a structure like this:
struct node {
float d; // Shortest distance to this node
node* parent; // Pointer to shortest-parent node
};
Shortest path in a weighted graph
What if the edges have weights? How can we find the shortest path(s) in terms of edge weights? (The path weight of a path is just the sum of all the weights along it.)
First of all, when does the shortest path even exist? Example: (graph with neg weight cycles). Does the shortest path exist? No. Example (graph with neg edges, but no neg-weight cycles). Negative weight edges are OK, but neg-weight cycles are not. If a negative-weight cycle exists anywhere along a path from start to finish, then the “shortest” path does not exist (because we could always make it shorter, by taking another trip around the cycle). We’ll also assume in the following discussion that the shortest path from start to finish actually exists.
Assuming there are no negative-weight cycles, how would we find the shortest path from start to finish? Well, to start with, what is the longest length (in terms of number of edges) that the shortest path could be, in the worst case? It’s \(|V|-1\). Why? Because a path of that length visits every vertex exactly once. If we visit any vertex more than once, then we must have a cycle in our path, and without negative-weight cycles, any other cycle can only make a path worse than it would be if we removed it. (Demo)
So in the worst case, the longest possible shortest-path has \(|V|\) nodes and \(|V|-1\) edges and no cycles. Let’s take a more local view.
For the starting node, what is the shortest path to the starting node itself? The empty path, which contains only the starting node and zero edges. (And note that the path-weight of this path is also 0.)
For any of the starting node’s neighbors, can we say that the shortest path is the one via the edge that connects them? Not necessarily:
     4
a ------> b
 \       ^
  \2    /1
   \   /
    V /
     c
The direct path a -> b is longer, in terms of path weight, than the path a -> c -> b. So we can’t just set the distance to b to 4 and be done with it; we have to be prepared for the possibility that this value will be reduced by a later path. This is different from the BFS, where once we set the distance to a node, it was fixed.
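Written out, with \(w(p)\) standing for the weight of a path \(p\) (a notation introduced here only for this comparison), the two candidate paths to b compare as:

\[ w(a \to b) = 4, \qquad w(a \to c \to b) = 2 + 1 = 3 < 4 \]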
In order to allow for reducing the distance to a node, we use an operation called relax:
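// Relax the edge a -> b: if reaching b through a beats the best distance
// known so far, lower b's distance. Nodes are passed by reference so the
// update is visible to the caller.
void relax(node& a, node& b, float weight) {
    if (a.d + weight < b.d)
        b.d = a.d + weight;
}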
We are re-using the .d distance attribute from the BFS. The idea is that if there is an edge from a to b with weight weight, we check it against the current distance to b and see if using this edge would be better: if so, we adjust the distance to b. Note that relax can only ever make a node’s distance smaller: a call to relax can never increase the distance of a node. This seemingly-simple fact will be important later. We’ll also initially set the distance to every node except the starting node to \(\infty\) (with the assumption that \(\infty\) can be used in comparisons like in relax).
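To see relax at work on the three-node graph from earlier, here is a small sketch. It assumes that C++'s std::numeric_limits<float>::infinity() stands in for \(\infty\) and that nodes are passed by reference; INF and demo_relax are names made up for this example.

```cpp
#include <cassert>
#include <limits>

struct node {
    float d;       // Shortest known distance to this node
    node* parent;  // Pointer to shortest-parent node (unused in this sketch)
};

// Assumption: IEEE float infinity plays the role of "infinity". It compares
// greater than every finite value, and infinity plus a finite weight is
// still infinity, so relax behaves sensibly on undiscovered nodes.
const float INF = std::numeric_limits<float>::infinity();

void relax(node& a, node& b, float weight) {
    if (a.d + weight < b.d)
        b.d = a.d + weight;
}

// Relaxes the edges of the example graph: a->b (4), a->c (2), c->b (1).
// Returns the final distance to b.
float demo_relax() {
    node a{0.0f, nullptr}, b{INF, nullptr}, c{INF, nullptr};
    relax(a, b, 4.0f);  // b.d: infinity -> 4
    relax(a, c, 2.0f);  // c.d: infinity -> 2
    relax(c, b, 1.0f);  // b.d: 4 -> 3, because the path through c is shorter
    relax(a, b, 4.0f);  // no change: relax never increases a distance
    assert(c.d == 2.0f);
    return b.d;
}
```

Here the edges happen to be relaxed in a favorable order. Had c->b been relaxed before a->c, c's distance would still have been infinite and b would have stayed at 4, which is why a single pass over the edges is not enough in general.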
Suppose we relax all the edges along the shortest path: does doing so set the distances of the nodes on it to their final values? Yes, but only if we do it in order from start to finish. And besides, we don’t know what the shortest path is, so we can’t do that. What happens if we relax every edge in the graph? After doing this, at least one additional node (beyond the starting node) will now have its final distance, and, since relax never increases a distance and no shorter path to this node exists, this node is now “done”: we can do more relaxations without worrying about it.
Put another way, after running relax on all edges, at least one node adjacent to the starting node will have its final distance (it may be more than one).
What if we run another “relax all edges” cycle after this? If a shortest path exists, then another cycle will “lock in” the distance of another node. Thus, every time we “relax all edges”, at least one more node gets its final distance. In the worst case, the shortest path has \(|V|\) nodes and \(|V| - 1\) edges, so it would take \(|V| - 1\) “relax all edges” cycles to finalize the distance at the target node.
This brings us to the Bellman-Ford algorithm:
Set the starting node’s distance to 0, and set every other node’s distance to \(\infty\).
Repeat the following \(|V| - 1\) times:
For every edge
Relax that edge
Note that the order in which we relax edges does not matter, because we repeat “relax all edges” enough times that even if we did them in a terrible order, it would still construct the shortest path.
With a simple extension, we can also detect, after the algorithm completes, whether there are any negative-weight cycles. (A nice feature of Bellman-Ford is that it will complete even if there are negative-weight cycles. The next algorithm we’ll look at, Dijkstra’s algorithm, offers no such check: with negative weights it can silently produce wrong distances, and variants that re-enqueue nodes can even go into an infinite loop!)
- For each edge a -> b with weight weight, if a.d + weight < b.d then a negative-weight cycle exists.
This just checks every edge to see if it’s possible for it to be relaxed even further. Remember that we demonstrated that the above algorithm should have finalized the distances to all nodes if there were no negative-weight cycles. If there are, then it is not possible to finalize the distances, because any path that touches the negative-weight cycle could always be made “shorter”.
Note that we can also easily determine if a path exists at all: if the target node’s distance is still \(\infty\) after the algorithm completes, then the target is not reachable from the start, so no shortest path exists.
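Putting the pieces so far together, Bellman-Ford with the negative-cycle check might be sketched as follows. This is one possible arrangement rather than a definitive implementation: it assumes vertices are numbered 0 to n-1, stores distances in a plain array instead of in node structs, and uses a hypothetical edge record and INF constant that do not appear in the notes.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical edge record: endpoints are vertex indices 0..n-1.
struct edge {
    std::size_t from, to;
    float weight;
};

// Assumption: IEEE float infinity stands in for "infinity".
const float INF = std::numeric_limits<float>::infinity();

// Fills dist with shortest path weights from source (INF if unreachable).
// Returns false if a negative-weight cycle reachable from source is detected,
// in which case the contents of dist are not meaningful.
bool bellman_ford(std::size_t n, const std::vector<edge>& edges,
                  std::size_t source, std::vector<float>& dist) {
    assert(source < n);
    dist.assign(n, INF);
    dist[source] = 0.0f;

    // |V| - 1 rounds of "relax all edges".
    for (std::size_t round = 1; round < n; ++round)
        for (const edge& e : edges)
            if (dist[e.from] + e.weight < dist[e.to])
                dist[e.to] = dist[e.from] + e.weight;

    // One extra pass: an edge that can still be relaxed means that some
    // distance is not final, which only happens with a negative cycle.
    for (const edge& e : edges)
        if (dist[e.from] + e.weight < dist[e.to])
            return false;
    return true;
}
```

Note that the extra pass only sees cycles that are reachable from the source: an unreachable cycle leaves all of its nodes at infinity, so none of its edges can be relaxed.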
If we want to actually find the shortest path itself, and not just its path weight, this can be accomplished with an easy extension to relax:
void relax(node& a, node& b, float weight) {
    if (a.d + weight < b.d) {
        b.d = a.d + weight;
        b.parent = &a;  // Remember which node we came from
    }
}
This saves a pointer to the node that we came from in b.parent. The last time that the if body is executed will both “lock in” the final distance, and also save the pointer to the parent node that gives the shortest path back to the starting node. After completion, every reachable node will have both its distance to the starting node, and a pointer to a node that leads backwards along the shortest path to the start. (Thus, Bellman-Ford actually finds all shortest paths from the starting node to any other node.)
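Recovering the actual path from the parent pointers is then a matter of walking backwards from the target. The sketch below assumes that the starting node's parent is nullptr (so the walk knows where to stop); path_to is a made-up helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct node {
    float d;       // Shortest distance to this node
    node* parent;  // Pointer to shortest-parent node
};

// Follows parent pointers from target back to the node whose parent is
// nullptr (assumed to be the start), then reverses the result so that the
// path is returned in start-to-target order.
std::vector<const node*> path_to(const node* target) {
    assert(target != nullptr);
    std::vector<const node*> path;
    for (const node* n = target; n != nullptr; n = n->parent)
        path.push_back(n);
    std::reverse(path.begin(), path.end());
    return path;
}
```

A node that was never reached also has a null parent, so a caller would want to check that its distance is finite before trusting the returned one-node path.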
What is the complexity of this algorithm? It’s easy to determine, because the two loops are independent of each other: \(O(|V| \times |E|)\). (This is assuming the adjacency list representation, where we can iterate over all the edges in \(O(|E|)\) time. In the adjacency matrix, iterating over all edges takes \(O(|V|^2)\) time, which means that the total runtime is actually \(O(|V|^3)\)!)
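As a rough count, multiplying the \(|V| - 1\) rounds by the cost of one “relax all edges” pass, and adding the one extra pass used for negative-cycle detection, gives for the adjacency list:

\[ (|V| - 1) \cdot O(|E|) + O(|E|) = O(|V| \times |E|) \]

and for the adjacency matrix, where one pass over the edges costs \(O(|V|^2)\):

\[ (|V| - 1) \cdot O(|V|^2) + O(|V|^2) = O(|V|^3) \]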
An example of running the Bellman-Ford algorithm:
…
Dijkstra’s algorithm
Bellman-Ford is an exhaustive algorithm: it basically tries everything, which allows it to work (or at least, terminate) even in the presence of negative-weight cycles. We can do better if we assume that there are no negative-weight edges and take a greedy approach. This means that instead of trying everything, we’re going to give priority to the option that “looks” the best. In this case, “best” means smallest distance: the next node to be processed will be the node with the smallest current distance estimate.
Dijkstra’s algorithm is similar to breadth-first search in that it uses a queue; however, it uses a min-priority queue, where items are enqueued with a priority, and the item with the smallest priority is the next one to be dequeued. The priority we’ll use is the node’s current distance estimate. Note that when we relax an edge, this might change the distance to the node on the other end, which in turn might change its position within the heap that implements the queue!
A review of min-heaps:
Insert: add at the next available space, and then swap up.
Extract-Min: remove root, replace with last node, and then swap down.
Reduce: new; this allows us to lower the value of a node already in the heap. It effectively works the same way as insert, swapping the newly decreased value up until it gets to its proper place.
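The three operations above could be sketched as below. This is an illustrative layout, not the only one: it assumes the items are the integers 0 to n-1 (e.g., vertex numbers), and it keeps a pos array so that reduce can find an item inside the heap without searching. The class and member names are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical min-heap over items 0..n-1, each with a float key.
class min_heap {
public:
    explicit min_heap(const std::vector<float>& keys)
        : key(keys), pos(keys.size()) {
        for (std::size_t i = 0; i < keys.size(); ++i) insert(i);
    }

    bool empty() const { return heap.empty(); }

    // Insert: add at the next available space, then swap up.
    void insert(std::size_t item) {
        heap.push_back(item);
        pos[item] = heap.size() - 1;
        swap_up(pos[item]);
    }

    // Extract-Min: remove the root, replace it with the last node, swap down.
    std::size_t extract_min() {
        assert(!heap.empty());
        std::size_t top = heap.front();
        place(0, heap.back());
        heap.pop_back();
        if (!heap.empty()) swap_down(0);
        return top;
    }

    // Reduce: lower the key of an item that is still in the heap, then
    // swap up exactly as insert does.
    void reduce(std::size_t item, float new_key) {
        assert(new_key <= key[item]);
        key[item] = new_key;
        swap_up(pos[item]);
    }

private:
    std::vector<float> key;         // key[item] = current priority of item
    std::vector<std::size_t> pos;   // pos[item] = index of item in heap
    std::vector<std::size_t> heap;  // heap[index] = item stored at that index

    void place(std::size_t index, std::size_t item) {
        heap[index] = item;
        pos[item] = index;
    }

    void swap_up(std::size_t i) {
        while (i > 0) {
            std::size_t parent = (i - 1) / 2;
            if (key[heap[parent]] <= key[heap[i]]) break;
            std::size_t a = heap[i], b = heap[parent];
            place(i, b);
            place(parent, a);
            i = parent;
        }
    }

    void swap_down(std::size_t i) {
        for (;;) {
            std::size_t smallest = i, l = 2 * i + 1, r = 2 * i + 2;
            if (l < heap.size() && key[heap[l]] < key[heap[smallest]]) smallest = l;
            if (r < heap.size() && key[heap[r]] < key[heap[smallest]]) smallest = r;
            if (smallest == i) return;
            std::size_t a = heap[i], b = heap[smallest];
            place(i, b);
            place(smallest, a);
            i = smallest;
        }
    }
};
```

Each operation moves an item along a single root-to-leaf path, so all three take \(O(\log n)\) time in the size of the heap.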
When running Dijkstra’s algorithm, every time through the main loop we “lock in” another node. In order for this to work, we must process nodes in the order they occur along the shortest path.
Set the distance to all nodes other than the start to \(\infty\). Set the starting node’s distance to 0.
Enqueue all nodes. (Note that the only node with a non-infinite distance will be the starting node, so it will be the first node to be dequeued.)
While the queue is not empty:
Let a = q.extract_min()
Relax every outbound edge from a, updating queue priorities as needed
Example…
This algorithm is effectively the original BFS but with path weights substituted for the simple path length “distance” that that algorithm relied on. The same property applies here: At any given time, the “fringe” will be at a certain distance (total path weight) from the starting node, and the algorithm proceeds by expanding the fringe.
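The steps of Dijkstra’s algorithm given above might look roughly like this in code. One substitution to be aware of: the C++ standard library has no heap with a reduce operation, so this sketch uses std::set ordered by (distance, vertex) as the min-priority queue and emulates reduce by erasing the old entry and inserting the new one, both of which are logarithmic. The edge struct, INF, and the use of vertex indices 0 to n-1 are also assumptions of the example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <set>
#include <utility>
#include <vector>

// Hypothetical outbound-edge record; weights are assumed non-negative.
struct edge {
    std::size_t to;
    float weight;
};

const float INF = std::numeric_limits<float>::infinity();

// adj[u] lists the outbound edges of u.
// Returns the shortest path weight from source to every vertex (INF if unreachable).
std::vector<float> dijkstra(const std::vector<std::vector<edge>>& adj,
                            std::size_t source) {
    assert(source < adj.size());
    std::vector<float> dist(adj.size(), INF);
    dist[source] = 0.0f;

    // Stand-in for the min-priority queue: *q.begin() is always the entry
    // with the smallest distance.
    std::set<std::pair<float, std::size_t>> q;
    for (std::size_t v = 0; v < adj.size(); ++v)
        q.emplace(dist[v], v);  // enqueue all nodes

    while (!q.empty()) {
        std::size_t a = q.begin()->second;  // extract_min
        q.erase(q.begin());
        for (const edge& e : adj[a]) {
            assert(e.weight >= 0.0f);
            if (dist[a] + e.weight < dist[e.to]) {         // relax
                q.erase(std::make_pair(dist[e.to], e.to)); // emulate reduce:
                dist[e.to] = dist[a] + e.weight;           // drop the old entry,
                q.emplace(dist[e.to], e.to);               // add the new one
            }
        }
    }
    return dist;
}
```

With non-negative weights, a node whose distance gets lowered is always still in the queue, so the erase always finds its entry and a dequeued node is never re-inserted.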
The complexity of Dijkstra’s algorithm is complicated by the fact that each time we dequeue an element, we update the distances of its neighbors, which may dramatically restructure the queue. Whenever we relax an edge and reduce a node’s distance, we must also use the heap operation reduce to adjust its position within the queue, which requires \(O(\log n)\) time in terms of the size of the queue. But the queue is always shrinking, because we remove one node from it each time through the loop, and new nodes are never added into it.
Interestingly, we actually have two choices as to how to implement the priority queue:
We can skip the heap entirely: instead, just keep an array of the priorities. Finding the minimum requires \(O(|V|)\) time, because we must scan through the entire array, but changing the priority takes \(O(1)\) time, because we don’t actually have to move anything.
This leads to a complexity of \(O(|V|^2)\), which may be good if the graph is dense.
If we use a min-heap, then each reduce (the fix_up step) takes logarithmic time, leading to \(O(|E| \log |V|)\), which may be better for sparse graphs.
This assumes we are using an adj. list. Using an adj. matrix adds an extra \(|V|\) to find the neighbors of a node.
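For comparison, the array-based choice might be sketched like this, under the same illustrative assumptions as before (vertex indices 0 to n-1, a hypothetical edge struct, IEEE infinity for \(\infty\), non-negative weights). A done flag per vertex plays the role of "no longer in the queue".

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical outbound-edge record; weights are assumed non-negative.
struct edge {
    std::size_t to;
    float weight;
};

const float INF = std::numeric_limits<float>::infinity();

// Dijkstra with a plain array of priorities instead of a heap.
std::vector<float> dijkstra_array(const std::vector<std::vector<edge>>& adj,
                                  std::size_t source) {
    assert(source < adj.size());
    const std::size_t n = adj.size();
    std::vector<float> dist(n, INF);
    std::vector<bool> done(n, false);  // true once a vertex has been dequeued
    dist[source] = 0.0f;

    for (std::size_t step = 0; step < n; ++step) {
        // extract_min by scanning the whole array: O(|V|).
        std::size_t a = n;  // n means "no candidate found yet"
        for (std::size_t v = 0; v < n; ++v)
            if (!done[v] && (a == n || dist[v] < dist[a]))
                a = v;
        done[a] = true;

        // Changing a priority is just an array write: O(1) per edge.
        for (const edge& e : adj[a])
            if (dist[a] + e.weight < dist[e.to])
                dist[e.to] = dist[a] + e.weight;
    }
    return dist;
}
```

The outer loop runs \(|V|\) times and each scan costs \(O(|V|)\), while all the relaxations together cost \(O(|E|)\), which is where the \(O(|V|^2)\) total comes from.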
Note that Dijkstra’s algorithm does not reliably work if there are any negative edges (it may produce incorrect results even if there are no negative-weight cycles). Dijkstra’s algorithm fails in the presence of negative edges because it is greedy: it commits too early to what appears to be the shortest path. If there are no negative-weight edges, then any additional edges will at best leave the path weight unchanged (if the weight is 0), so it makes sense to choose the best case we have right now. But if there are negative edges, any number of seemingly-bad choices right now could be completely undone by a large negative edge later on. Example:
Weighted undirected graph algorithms
We’ll only look at one algorithm on weighted undirected graphs: Prim’s algorithm for minimum spanning trees.
A minimum spanning tree is a subset of \(E\) that still connects all the vertices, has no cycles, and where the sum of all the edge weights is as small as possible. For example:
…
There are two general approaches to building a MST:
Repeatedly add the lightest remaining edge anywhere in the graph that does not form a cycle.
Grow a single tree outward from a starting node, adding one edge (and the node at its far end) at a time.
The first method requires maintaining a “forest” of separate trees which are only joined into a single tree at the very end, so we’ll look at the second method. This works by picking a starting node (because all nodes must be in the tree, it doesn’t matter which) and then picking an edge to one of its neighbors to add to the tree.
The key idea is to imagine that we already have a partial MST: some vertices and edges that have already been selected to be “in” the MST. We are now looking at new edges to add to the tree. In order to add an edge, it must fulfill two criteria:
It must not form a cycle (i.e., must not link two nodes that are both already in the tree)
After adding it to the tree, the resulting tree must still be a partial MST.
We call an edge “safe” if it fulfills both criteria. The MST algorithm can be thought of as just repeatedly finding and adding “safe” edges until a tree is formed.
The first criterion is easy, especially if we grow the tree by edges: we already know when a node is in the tree, so when we add an edge we just have to make sure that one of its end points is in the tree, and the other is not in the tree.
The second criterion is more difficult to quantify. When does adding a (non-cycle-forming) edge preserve the partial-MST property? Let us look at all the edges that fit the first criterion: they touch one node inside the tree, and one node outside it. Example:
Which of the highlighted edges is “safe”? The one with the minimum weight of all the highlighted edges. We call an edge like this a light edge: it’s a major theorem of the spanning tree algorithm that light edges are always safe.
It might not be obvious why the minimum weight edge would be safe. Couldn’t we run into a situation like in the shortest-weight-path algorithm, where we think we have found the smallest weight, but later we discover a better option? As it turns out, no, this cannot happen. Remember that we assume that we already had a partial MST to start with: if we already have a partial MST, then growing it by a light edge is the correct choice. If we don’t have a partial MST, then growing it by a minimum-weight edge might be the wrong choice, but if we start with the smallest possible partial MST (a single node) and grow it safely every time, this situation will never arise.
This brings us to the implementation of Prim’s Algorithm, which, like Dijkstra’s, relies on the use of a min-priority queue to keep track of what to do next.
Set the starting node’s priority to 0, set every other node’s priority to \(\infty\), and set the parent pointers of all nodes to nullptr.
Enqueue all nodes (again, noting that the starting node will be the first node to be dequeued).
While the queue is not empty:
Let a = q.extract_min()
For every b which is adjacent to a and still in the queue:
If weight(a,b) < b.d, set b.d = weight(a,b) and b.parent = a. (Note that because b.d is b’s priority, this will trigger a reduce heap operation.)
At any given time, the queue contains the nodes which are still outside the tree. A node’s “distance” is repurposed as the minimum weight of all edges that connect it to a node inside the tree. The parent pointers form the tree itself, by connecting all nodes back eventually to the starting node.
Example:
The runtime of Prim’s algorithm is \(O(|E| \log |V|)\), again assuming the adj. list representation.
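The steps of Prim’s algorithm described above might be sketched as follows. As with any sketch, several details are assumptions made for illustration: vertices are indices 0 to n-1, parents are stored as indices with a made-up NONE value standing in for nullptr, each undirected edge is listed under both of its endpoints, and std::set (erase plus insert) stands in for a heap with a reduce operation.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <set>
#include <utility>
#include <vector>

// Hypothetical edge record; an undirected edge u-v appears in both
// adj[u] and adj[v].
struct edge {
    std::size_t to;
    float weight;
};

const float INF = std::numeric_limits<float>::infinity();
const std::size_t NONE = static_cast<std::size_t>(-1);  // stands in for nullptr

// Returns parent[v] for every vertex; the tree edges are (parent[v], v).
// parent[start] is NONE, as is the parent of any vertex not connected to start.
std::vector<std::size_t> prim(const std::vector<std::vector<edge>>& adj,
                              std::size_t start) {
    assert(start < adj.size());
    const std::size_t n = adj.size();
    std::vector<float> d(n, INF);  // lightest known edge into the tree
    std::vector<std::size_t> parent(n, NONE);
    std::vector<bool> in_queue(n, true);
    d[start] = 0.0f;

    std::set<std::pair<float, std::size_t>> q;  // ordered by (priority, vertex)
    for (std::size_t v = 0; v < n; ++v)
        q.emplace(d[v], v);  // enqueue all nodes

    while (!q.empty()) {
        std::size_t a = q.begin()->second;  // extract_min
        q.erase(q.begin());
        in_queue[a] = false;  // a is now inside the tree
        for (const edge& e : adj[a]) {
            std::size_t b = e.to;
            if (in_queue[b] && e.weight < d[b]) {
                q.erase(std::make_pair(d[b], b));  // emulate reduce
                d[b] = e.weight;
                parent[b] = a;
                q.emplace(d[b], b);
            }
        }
    }
    return parent;
}
```

The only real difference from the Dijkstra loop is the comparison: Prim compares the weight of the single edge a-b against b's priority, whereas Dijkstra compares the whole path weight through a.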