
Graph $G=(V, E)$:

  1. A set of locations $V=\{v_1, v_2, \ldots, v_n\}$
  2. Any two nodes $v_i, v_j$ may be joined by an edge $e=(v_i, v_j)$
  3. Each edge $e$ has a positive cost $c_e$

Find a subset of the edges $T \subseteq E$ so that the graph $(V, T)$ is connected and the total cost is minimized: $T = \text{argmin}_{T} \sum_{e \in T} c_e$.


  1. Kruskal’s Algorithm
    1. Start without any edges
    2. Consider the edges of $E$ in order of increasing cost
    - If $e$ does not create a cycle with the edges chosen so far, add it
    - If $e$ creates a cycle, discard it
  2. Prim’s Algorithm
    1. Start with a root node $s$, i.e., $S = \{s\}$
    2. Repeatedly add the node $v = \text{argmin}_{\{e=(u, v):\, u \in S,\, v \notin S\}} c_e$ to $S$, together with the edge attaining the minimum, until $S = V$
  3. Reverse-Delete Algorithm
    1. Start with the full graph $(V, E)$
    2. Consider the edges in order of decreasing cost
    - If deleting $e$ does not disconnect the graph, delete $e$
    - If deleting $e$ would disconnect the graph, keep $e$
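
The Reverse-Delete steps can be sketched in Python as follows. This is a minimal illustration rather than an efficient implementation: it assumes a connected graph with at least one node, edges given as distinct `(u, v, cost)` tuples, and it rechecks connectivity from scratch for every edge. The helper names `is_connected` and `reverse_delete` are hypothetical.

```python
from collections import defaultdict


def is_connected(nodes, edges):
    """Return True if every node is reachable from an arbitrary start node."""
    adj = defaultdict(list)
    for u, v, _cost in edges:
        adj[u].append(v)
        adj[v].append(u)
    nodes = list(nodes)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)


def reverse_delete(nodes, edges):
    """Reverse-Delete on a connected graph; edges are (u, v, cost) tuples."""
    kept = list(edges)
    # Consider edges from most expensive to cheapest.
    for edge in sorted(edges, key=lambda e: e[2], reverse=True):
        trial = [e for e in kept if e != edge]
        if is_connected(nodes, trial):
            kept = trial  # the graph stays connected without this edge: delete it
        # otherwise the edge is needed for connectivity: keep it
    return kept
```

On a four-node graph with edges `a-b` (1), `b-c` (2), `a-c` (3), `c-d` (4), `b-d` (5), the sketch deletes `b-d` and `a-c` and keeps the other three edges.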


Theorem (4.16): Let $T$ be a minimum-cost solution to the network design problem defined above. Then $(V, T)$ is a tree.
Proof: By definition $(V, T)$ is connected, so it remains to show that it contains no cycle.
Suppose $(V, T)$ contains a cycle $C$, and let $e$ be any edge of $C$.
We claim that $(V, T-\{e\})$ is still connected,
$\because$ any path that previously used the edge $e$ can now go “the long way” around the remainder of the cycle $C$ instead.
$\therefore$ $T-\{e\}$ is also a valid solution, and since $c_e > 0$ it is strictly cheaper than $T$, contradicting the minimality of $T$.

Theorem (4.17): Assume that all edge costs are distinct. Let $S$ be any subset of nodes that is neither empty nor equal to all of $V$, and let $e=(v, w)$ be the minimum-cost edge with one end in $S$ and the other in $V-S$. Then every minimum spanning tree contains the edge $e$.
Proof: (exchange argument)
[Idea: find an edge $e' \in T$ with $c_{e'} > c_e$ and show that exchanging $e$ for $e'$ yields another spanning tree.]
Let $T$ be a spanning tree that does not contain $e$, and let the ends of $e$ be $v \in S$ and $w \in V-S$.
$\because$ $T$ is a spanning tree
$\therefore$ there is a path $P$ in $T$ from $v$ to $w$.
Starting at $v$, follow $P$:
-> there is a first node $w'$ on $P$ that lies in $V-S$
-> the node just before $w'$ on $P$ is $v' \in S$
Let $e'=(v', w')$. Exchanging $e'$ for $e$ gives the edge set $T' = T - \{e'\} \cup \{e\}$.
[Show that $T'$ is connected]
$\because$ $(V, T)$ is connected and $P$ passes through $v'$ and $w'$
$\therefore$ the portion of $P$ between $v$ and $v'$ is a path $p$, and the portion between $w$ and $w'$ is a path $q$; neither uses $e'$.
$\therefore$ any path that used the edge $e'$ can instead
1. arrive at $v'$
2. follow $p$ to $v$
3. cross $e$ to $w$
4. follow $q$ to $w'$
[Show that $T'$ is acyclic]
The only cycle in $T' \cup \{e'\}$ is the one formed by $e$ and the path $P$.
This cycle is not in $T'$, because $e'$ has been deleted.
[Compare costs]
Both $e$ and $e'$ have one end in $S$ and the other in $V-S$, and $e$ is the cheapest such edge, so $c_{e'} > c_e$ (costs are distinct). Hence $T'$ is a spanning tree cheaper than $T$, and $T$ is not a minimum spanning tree.
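
As a sanity check on (4.17), the cut property can be verified by brute force on a small graph. The sketch below assumes a connected graph with distinct edge costs, given as a list of nodes and `(u, v, cost)` tuples. It enumerates every set of $n-1$ edges, so it is only usable on tiny inputs. All function names are hypothetical.

```python
from itertools import combinations


def spans(nodes, tree_edges):
    """True if tree_edges connect all nodes (components merged by relabeling)."""
    label = {v: v for v in nodes}
    for u, v, _cost in tree_edges:
        lu, lv = label[u], label[v]
        if lu != lv:
            for x in nodes:
                if label[x] == lv:
                    label[x] = lu
    return len(set(label.values())) == 1


def brute_force_mst(nodes, edges):
    """Cheapest connected edge subset of size n - 1, found by enumeration."""
    candidates = [t for t in combinations(edges, len(nodes) - 1) if spans(nodes, t)]
    return min(candidates, key=lambda t: sum(c for _, _, c in t))


def cut_property_holds(nodes, edges):
    """Check (4.17) for every nonempty proper subset S of the nodes."""
    mst = set(brute_force_mst(nodes, edges))
    for k in range(1, len(nodes)):
        for subset in combinations(nodes, k):
            s = set(subset)
            # Edges with exactly one end in S.
            crossing = [e for e in edges if (e[0] in s) != (e[1] in s)]
            if crossing and min(crossing, key=lambda e: e[2]) not in mst:
                return False
    return True
```

A connected subset with exactly $n-1$ edges is a spanning tree, which is why `brute_force_mst` only needs the connectivity test.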

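Theorem (4.18): Kruskal’s Algorithm produces a minimum spanning tree of $G$.
Proof: Consider any edge $e=(v, w)$ added by Kruskal’s Algorithm.
Let $S$ be the set of all nodes to which $v$ has a path at the moment just before $e$ is added.
Clearly $v \in S$ and $w \notin S$ (since adding $e$ does not create a cycle).
$\because$ no edge from $S$ to $V-S$ has been encountered yet (any such edge would have been added without creating a cycle), $e$ is the cheapest edge with one end in $S$ and the other in $V-S$
$\therefore$ by Theorem (4.17), $e$ belongs to every minimum spanning tree.
The output contains no cycle by construction and is connected because $G$ is, so it is a spanning tree, and hence a minimum one.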

Theorem (4.19): Prim’s Algorithm produces a minimum spanning tree of $G$.
Proof: As above. Each edge $e$ added is the cheapest edge with one end in $S$ and the other end in $V-S$, so by Theorem (4.17) it belongs to every minimum spanning tree.

Theorem (4.20): Assume that all edge costs are distinct. Let $C$ be any cycle in $G$, and let edge $e=(v, w)$ be the most expensive edge belonging to $C$. Then $e$ does not belong to any minimum spanning tree of $G$.
Proof: (exchange argument)
Let $T$ be a spanning tree that contains $e$. Deleting $e$ from $T$ partitions the nodes into two components: $S$, containing $v$, and $V-S$, containing $w$.
The edges of $C$ other than $e$ form a path $P$ with one end at $v$ and the other at $w$.
Following $P$ from $v$ to $w$, we start in $S$ and end in $V-S$.
So $\exists\, e'$ on $P$ that crosses from $S$ to $V-S$.
Let $T'=T-\{e\} \cup \{e'\}$. As in the proof of (4.17), $(V, T')$ is connected and acyclic,
so $T'$ is a spanning tree of $G$.
$\because$ $e$ is the most expensive edge on the cycle $C$ and $e' \in C$
$\therefore$ $c_{e'}<c_e$, i.e. $T'$ is cheaper than $T$, so $T$ is not a minimum spanning tree.

Implementing Prim’s Algorithm

Time complexity: $O(m \log n)$ with a heap-based priority queue

(4.22) Using a priority queue, Prim’s Algorithm can be implemented on a graph with $n$ nodes and $m$ edges to run in $O(m)$ time, plus the time for $n$ ExtractMin and $m$ ChangeKey operations.
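
A priority-queue version of Prim’s Algorithm might look like the sketch below. Python’s `heapq` module has no ChangeKey operation, so this sketch swaps in a common substitute: it pushes candidate edges and skips stale entries when they are popped. This keeps the $O(m \log n)$ bound but is not the ExtractMin/ChangeKey scheme of (4.22) verbatim. It assumes a connected graph, edges as `(u, v, cost)` tuples, and node names that can be compared with each other (used only to break ties between equal costs).

```python
import heapq
from collections import defaultdict


def prim(nodes, edges, root):
    """Return the edges of a minimum spanning tree of a connected graph."""
    adj = defaultdict(list)
    for u, v, cost in edges:
        adj[u].append((cost, u, v))
        adj[v].append((cost, v, u))

    in_tree = {root}            # the set S
    tree = []
    heap = list(adj[root])      # candidate edges leaving S
    heapq.heapify(heap)
    while heap and len(in_tree) < len(nodes):
        cost, u, v = heapq.heappop(heap)   # plays the role of ExtractMin
        if v in in_tree:
            continue                       # stale entry: both ends already in S
        in_tree.add(v)
        tree.append((u, v, cost))
        for entry in adj[v]:
            if entry[2] not in in_tree:    # entry[2] is the far endpoint
                heapq.heappush(heap, entry)
    return tree
```

Each edge is pushed at most twice, so the heap never holds more than $2m$ entries and every heap operation costs $O(\log m) = O(\log n)$.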

Implementing Kruskal’s Algorithm: The Union-Find Data Structure

Time complexity: $O(m \log n)$


A component is a maximal set of nodes that are connected to one another by the edges chosen so far. For an edge $e=(v, w)$:

  • If $v, w$ are in different components, add the edge
  • If $v, w$ are in the same component, discard the edge

The Union-Find Data Structure allows us to maintain disjoint sets (such as the components of a graph).

(1) $\text{MakeUnionFind}(S)$ returns a Union-Find data structure on the set $S$ in which all elements are in separate sets.
[$O(n)$, where $n=|S|$]

(2) For a node $u \in S$, the operation $\text{Find}(u)$ returns the name of the set containing $u$. [$O(\log n)$ or $O(1)$, depending on the implementation]

  • If $\text{Find}(u)=\text{Find}(v)$, then $u, v$ are in the same component.

(3) $\text{Union}(A, B)$ merges the sets $A$ and $B$. [$O(\log n)$]

  • If $\text{Find}(u) \neq \text{Find}(v)$, call $\text{Union}(\text{Find}(u), \text{Find}(v))$


Build an array $\text{Component}$ of size $n=|S|$, where $\text{Component}[s]$ is the name of the set containing $s$.

  • Initialize $\text{Component}[s]=s$ for all $s \in S$.

Maintain an additional array $\text{size}$ of length $n$, where $\text{size}[A]$ is the size of set $A$.

  • When $\text{Union}(A, B)$ is called, use the name of the larger set for the union, so only the elements of the smaller set need relabeling.

Theorem (4.23): Consider the array implementation of the Union-Find data structure for some set $S$ of size $n$, where unions keep the name of the larger set. The $\text{Find}$ operation takes $O(1)$ time, $\text{MakeUnionFind}(S)$ takes $O(n)$ time, and any sequence of $k$ $\text{Union}$ operations takes at most $O(k \log k)$ time.
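
The array-based structure and its use in Kruskal’s Algorithm can be sketched as follows. A few choices here are assumptions of the sketch, not part of the notes: a dict stands in for the $\text{Component}$ array so that element names can be any hashable value, and a `members` list per set (whose length plays the role of $\text{size}$) lets `union` relabel only the smaller set. The class and function names are hypothetical.

```python
class ArrayUnionFind:
    """Array-style Union-Find: find is a lookup, union relabels the smaller set."""

    def __init__(self, elements):            # MakeUnionFind(S): O(n)
        self.component = {s: s for s in elements}
        self.members = {s: [s] for s in elements}  # elements of each named set

    def find(self, u):                       # O(1)
        return self.component[u]

    def union(self, a, b):
        """Merge the sets named a and b, keeping the name of the larger set."""
        if a == b:
            return a
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a
        for s in self.members[b]:            # relabel only the smaller set
            self.component[s] = a
        self.members[a].extend(self.members.pop(b))
        return a


def kruskal(nodes, edges):
    """Kruskal's Algorithm; edges are (u, v, cost) tuples."""
    uf = ArrayUnionFind(nodes)
    tree = []
    for u, v, cost in sorted(edges, key=lambda e: e[2]):
        a, b = uf.find(u), uf.find(v)
        if a != b:                           # different components: add the edge
            uf.union(a, b)
            tree.append((u, v, cost))
    return tree
```

Sorting dominates the running time at $O(m \log m) = O(m \log n)$; the $2m$ `find` calls are constant time each, and the at most $n-1$ unions cost $O(n \log n)$ in total by (4.23).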


For each node $v\in S$, store an associated pointer that leads toward the name of the set containing $v$.

  • $\text{MakeUnionFind}(S)$: initialize a record for each element $v\in S$ with a pointer that points to itself (or a null pointer)
  • $\text{Union}(A, B)$: suppose set $A$ is named after node $v \in A$ and set $B$ is named after node $u \in B$. If we use $v$ as the name of the merged set, we only need to update $u$'s pointer to point to $v$; the pointers of the other nodes in $B$ are left unchanged.
  • To make sure it is the smaller set whose name gets replaced, maintain an additional field with the nodes: the size of the corresponding set.

Theorem (4.24): Consider the above pointer-based implementation of the Union-Find data structure for some set $S$ of size $n$, where unions keep the name of the larger set. A $\text{Union}$ operation takes $O(1)$ time, $\text{MakeUnionFind}(S)$ takes $O(n)$ time, and a $\text{Find}$ operation takes $O(\log n)$ time.
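
A minimal sketch of the pointer-based variant, assuming hashable element names and using dicts for the pointer and size fields (the class name is hypothetical). Path compression is deliberately left out so that the behavior matches the bounds stated in (4.24).

```python
class PointerUnionFind:
    """Pointer-based Union-Find with union by size and no path compression."""

    def __init__(self, elements):               # MakeUnionFind(S): O(n)
        self.parent = {v: v for v in elements}  # each element points to itself
        self.size = {v: 1 for v in elements}    # meaningful only at set names

    def find(self, u):                          # O(log n): follow pointers to the name
        while self.parent[u] != u:
            u = self.parent[u]
        return u

    def union(self, a, b):                      # O(1): a and b must be set names
        if a == b:
            return a
        if self.size[a] < self.size[b]:
            a, b = b, a
        self.parent[b] = a                      # only one pointer changes
        self.size[a] += self.size[b]
        return a
```

Because the smaller set always points to the larger one, the set containing a node at least doubles each time that node's path to the name grows by one, which is where the $O(\log n)$ bound on `find` comes from.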

