Given a connected graph \(G\), with positive edge weights, the
**spanning tree** is a subgraph \(T\) that is both a tree (connected
and acyclic) and **spanning** (contains all the nodes).

There is often a benefit to finding the *minimum* spanning tree (the sum
of all the edges is minimized).

## Properties

- The minimum spanning tree exists and is unique if the edge weights are all distinct and the graph is connected.
- The crossing edge of minimum weight is in the MST. To see this, consider a MST that does not have that crossing edge. It will, however, have another crossing edge (otherwise the tree would not span the cuts). Note that if we now add the minimum crossing edge, we’ll necessarily have a cycle. If I now delete the other crossing edge, we no longer have a cycle, and our overall weight is less.
- If any edge is not in a cycle, then it must be part of the MST.
- If \(T\) is a connected subgraph of \(G\) with strictly positive edge weights, and \(T\) is minimally weighted, than it is an MST (i.e. it has no cycles).
- If for every cut in \(G\), the lightest crossing edge is unique, then the MST is unique.

## Kruskal’s Algorithm

Sort all edges by weight. Keep adding edges to your tree by this order. If any edge creates a cycle in \(T\), simply continue to the next edge. Once you have \(N-1\) edges, you have your tree.

For the proof of correctness, this is simply a special case of the brute force method (which I have not described): Consider any new edge I’m about to add. Consider the cut where I take all the nodes currently in \(T\) from one side of this edge. There is no existing crossing edge, as the nodes in the cut form a connected component with respect to \(T\). And by design no potential crossing edge has lower weight.

The potential challenge is to find the cycle at each step. If we used DFS, the cost to find a cycle would be \(N+M\). But if we use the data structure in union find (say, using a tree with compression), then this is simply a check if two nodes are in the same equivalence class, which takes \(\log^{*}N\) time.

So the overall complexity is \(O(M\log M)\).

## Prim’s Algorithm

Start with any node, and add the minimum edge connected to it. From here
onwards, add the edge that has minimum weight *with only one end
connected to* \(T\). In other words, start scanning the edges in
minimum weight order until you find one that connects to \(T\). Stop
once you have \(N-1\) edges.

The proof that this works: It is a special case of the greedy algorithm. If \(e\) is the new edge you’re adding, let the cut be \(T\). We have no crossing edge of lower weight, and no other crossing edge already exists.

How can we find the minimum weight edge with only one edge on \(T\)? Use a priority queue (binary heap will do). Maintain this heap using the weight as the ordering. The heap only has edges with one edge on \(T\). When you add a new node, delete the edge that added it, and add in all new edges the new node brings in (again, with only one node in \(T\) for each edge).

In time, this binary heap will have redundant edges (e.g. if it had 3 edges that could connect to node 3, you’ll use at most one of them). This is OK. There exist implementations that are not “lazy” (known as “eager”), but this lazy one is often good enough.

Complexity is \(O(M\log M)\).

## Random Notes

There is no known *deterministic* algorithm that is linear in the number of edges, but
no one has proven it doesn’t exist. For practical purposes, there is an
algorithm that is kinda there: It is \(O(M\log^{*}N)\), and this is
bounded above by \(6M\) in our universe.

There does exist a *randomized* algorithm that is on average linear.