Graph 表示物件之間的關係,並且最重要的是 Edge
Content:
+ 起源
+ Application of Graphs
+ Graph composition
+ Path and Connectivity
+ Connected component
+ Trees in Graph
+ Graph 最重要的兩種表示法
+ Graph Traversal
+ Breadth-First Search (BFS)
+ Depth-First Search (DFS)
+ Connectivity in Directed Graphs
+ DAGs and Topological Ordering
— — — — — — — — — — — — — — — —
+ Greedy algorithms
+ Spanning Tree
+ The Shortest Path Problem
— — — — — — — — — — — — — — — —
+ TSP (Travel Salesman Problem)
+ AOV
+ AOE
+ Flow Network
+ References
起源
18世紀的 Euler ( 也是 Graph 之父) 的 The seven bridge problem
The seven bridge problem
Q: Is is possible to walk across all the bridges exactly once and return to the starting land area?
A: Euler proofs that it is impossible (因為與土地連接的橋為奇數,必須要是偶數)
Application of Graphs
- Analysis of electrical circuits (E.g. VC 電路)
- Finding shortest routes (E.g. 導航、Google map)
- Project planning
- Identification of chemical compounds
- Social sciences
Graph composition
- A graph
G = (V, E)
=>V
: vertex(ices) = node(s),E
: Edge e = {u, v}
(e
是nodeu
andv
的 edge)
Very often the edges of a graph habe weights associated with them
- An undirected graph:
{u, v} == {v, u}
- A directed graph:
(u, v) != (v, u)
if e = (u, v)
then u 是 v 的鄰居 (但 v 不是 u 的鄰居) => Adjacency
Adjacency: v is one of u’s neighbor and there is an edge (u ,v)
|V|
= the number of nodes =n
|E|
= the number of edges =m
- 我們希望花費
< O(m+n)
時間讀取G
()有方向性,{}沒有方向性
Path and Connectivity
- Path: a sequence of vertices
w1, w2, w3,…, wN
where(wi, w(i+1)) ∈ E
,∀ 1 <= i <= N
- node
u
andv
is connected, if there is a path fromu
tov
(不需要有 edge). - distance:
u
andv
間經過的最少 edge 數
Connected components
A connected Component: 一群 nodes 可以到達彼此
Strongly Connected Components
E.g.
v = a, b
e = (a, b), (b, a)
Find all nodes reachable from s
// R will consist of nodes to which s has a path
1. initialize R = {s}
2. while (there is an edge (u, v) where u∈R and v∉R)
R = R + {v}
=> Proof:
要證明的點:
R
中的 nodes 是否最終都能連到s
(has a path)R
外不能有漏網之魚
反證:
假設在 R
之外存在 node w
,卻和 node s
(R
內) 間有 path
=> 必然存在一個在 R
內的 node u
與在 R
外的 node v
相連
=> an edge (u, v)
+ u∈R
and v∉R
=> 與演算法 conflict
Trees in Graph
connected and no cycle
property
- G is connected
- G does not contain a cycle
- G has n-1 edges
=> G is a tree if it satisfies any two of the three statements,並且任意兩個 statements 可以推導第三 statement
Graph 最重要的兩種表示法
- Adjacency Matrix
- Adjacency List
儲存所有 node 之後可以產生應用,例如找出 spanning tree
Adjacency Matrix
The Adjacency Matrix of G
is an n*n
matrix A
where
A[u, v] = 1, if (u, v) ∈ E
A[u, v] = 0, otherwise
- Space:
𝚯(|V|²)
, good for dense, not for sparse (不適合稀疏) - Undirected graph: symmetric matrix
Adjacency List
- An Array for nodes
- 每個 index 掛上它的鄰居
- Space: O(|V|+|E|), good for sparse
- 比 Matrix 好用
Adjacency: v is one of u’s neighbor and there is an edge (u ,v)
Adjacency Matrix vs. Adjacency List
- Faster to find an edge? Matrix =>
O(1)
- Faster to find degree (連出去的 edge)? List =>
time < O(n)
- Faster to traverse the graph? List =>
time = O(nodes + the number of edges)
< Matrix:O(n²)
- Storage for sparse graph? List
- Storage for dense graph? Matrix
- Edge insertion or deletion more easily? Matrix
- Better for most applications? List
Graph Traversal
1. Breadth-First Search (BFS)
2. Depth-First Search (DFS)
Breadth-First Search (BFS)
- 一層一層向下搜
- 使用 queue
應用
例如小畫家倒水上色
Adjacency list is ideal for implementing BFS
=> 演算法在 Lec07 演算法 第四週課程 (1/2) 1:15:45
Testing Bipartiteness
A bipartite graph is a graph whose nodes can be patitioned into sets X and Y in such a way that every edge has one one end in X and the other end in Y (Edge 必定是在兩個 set 之間,同一個 set 間不會有 edge)
- 等於 Two coloring 問題
應用:晶圓製程
Depth-First Search (DFS)
- Data structure: stack
Connectivity in Directed Graphs
- Strong component (Strongly connected component): 一個 ”maximal” set 裡面的點皆互為 mutually reachable
Theorem: For any two nodes
s
andt
in a directed graph, their strong components are either identical or disjoint.
Can we determine if a graph is strongly connected in linear time?
Yes.
1. pick any node s in G
2. R = BFS(s, G) // save all reachable nodes in R set
3. Rrev = BFS(s, Grev) // Grev = 所有 edge 方向翻轉
4. if (R==V==Rrev) then return true else falseTime: O(n)
DAGs and Topological Ordering
Directed Acyclic Graphs (有向無環圖)
- a.k.a DAG
- A directed graph without cycles
- 當問題有依存關係 (dependency) 或者先後關係 (precedence constraints) 經常使用
4 drivers come to the junction “simultaneously”, who goes first?
- 四台車「同時」到交叉路口,誰先走?
=> Deadlock! Dependencies from a cycle
Topological Ordering
- 把 directed graph 裡所有的點拿出來做排序
Lemma: If
G
has a topological ordering, thenG
is a DAG (反之亦然)Lemma: In every DAG
G
, there is a node with no incoming edges.
TopologicalOrder(G)
1. find a node v without incoming edges
2. order v
3. G = G - {v} // delete v from G
4. if(G is not empty) then TopologicalOrder(G)
- Time: from O(n²) to O(m+n)
Greedy algorithms
- Interval scheduling: The greedy algorithm stays ahead (永遠取第一名的)
- Scheduling to minimize lateness: An exchange argument
- Shortest paths
- The minimum spanning tree problem
- implementing Kruskal’s algorithm: Union-find
What is a greedy algorithm?
- 一種短視近利的方法
- Build up a solution in small steps, choosing a decision at each step myopically => 以當下最好的解作為選擇 => Intuitive and fast but usually not optimal
- 兩種常用證明 optimal 的方式:stays ahead, exchange argument
The Interval Scheduling Problem
- 應用:幫 CPU 排程、演藝廳借場地
- 要解決的問題:在固定時間內完成最多 request
- Single resource => 單核
Greedy Rule
1. Repeat
=> use a simple rule to select a first request i1
=> Once i1 is selected, reject all request not compatible with i1.
2. Until run out of requests
- Better chosen rule: earliest finish time (even better then fewest conflicts) => principle: 最早釋放 resource
Interval-Scheduling(R)
// R: undetermined requests; A: accepted requests
1. A = {}; //空集合 (empty set)
2. while (R is not empty) do
3. choose a request i ∈ R with minimum end-time(i) // greedy rule => sorting 由小到大 => O(nlogn)
4. A = A + {i}
5. R = R - {i} - X, where X = {j: j ∈ R and j is not compatible with i} => 每個 request 比較 => O(n^2)
6. return ARunning time from O(n^2) to O(nlogn)
用 construct S 紀錄 all requests 開始的時間
=> 加速 3~5 step to O(n)
- Proof by induction:
=> 假設 O
為最佳解 ,我們要證明我們的解 A
, |A| = |O|
=> Stay ahead: A
的每一步都比 O
快
The Interval partitioning Problem
- Multiple resources
- 希望用最少的 resource 完成全部的 requests (這裡等於 intervals)
=> 先看 depth (單一時間軸重疊的 requests 數, resources 至少要大於這個數目) => 通常用 depth 數量的 resources 就能 schedule all requests
Interval-Partition(R)
1. {l1, ..., ln} = sort intervals(requests) in ascending order of their start time
2. for j from 1 to n do
// label {1, 2, ..., d} == the number of resources
3. exclude the labels of all assigned intervals that are not compatible with lj
4. if(there is a nonexcluded label from {1, 2, ..., d}) then assign a nonexcluded label to lj
6. else leave lj unlabeled // 這行寫了但是不會發生// 每個 label 都有一個 end time
Scheduling to minimize lateness
- A single resource and each request with a deadline
- Goal: Schedule all requests without overlapping so as to minimize the max. lateness => 只取最大遲交值,沒有累加
- Greedy rule: Earliest deadline first
Min-Lateness(R, beginTime)
1. {d1, ..., dn} = sort requests in ascending order of their deadlines
2. endTime, beginTime = 0
3. for i from 1 to n do
4. assign request i to the time interval from (beginTime(i) = endTime) to (endTime(i) = endTime+ ti)
5. endTime = f + ti
6. return the set of scheduled intervals
- No idle time => 沒有空閒時間
- Exchange argument: Gradually transform an optimal solution to the one found by the greedy algorithm without hurting its quality. => 超常用證明法
- No Inversions
An inversion: 有人 deadline 比較晚卻比較早放進 schedule 中
Theorem: The greedy schedule S is optimal
Proof by contradiction
+ Let O be an optimal schedule with inversions // 不管是怎麼找到的
+ Assume O has no idle time //有就壓縮
+ If O has no inversions, then S = O. (done)
+ If O has an inversion, let i-j be an adjacent inversion
=> Swapping i and j does not increase the max. lateness and strictly decreases the number of inversions. // cut and paste
=> This contradicts definition of O.
Spanning Tree
- A tree consisting solely of edges in
G
and including all vertices inG
in called a spanning tree - = 希望用最少的 edge 把
G
中的所有頂點都連接起來 - 可以用 DFS, BFS 找出 spanning tree
- no cycle
Minimal Cost Spanning Tree
- Four greedy-method algorithms to find: Kurskal’s algorithm, Prim’s algorithm, Sollin’s algorithm and reverse-delete algorithm
- Cut property: Let
S
be any subset of nodes, and lete = (v, w)
be the minimum cost edge with one end inS
and the other inV-S
. Then every MST containse
. - Cycle property: Let
C
be any cycle inG
, an lete = (v, w)
be the maximum cost edge inC
. Thene
does not belong to any MST.
Kurskal’s algorithm
- 每次都選最小 cost 的「邊」,若產生 cycle 就跳下一個邊
Kruskal(G, C)
1. {all edges in ascending order of their costs}
2. T = {}
3. for each ei = (u, v) do
4. if (u and v are not connected by edges in T) then // different subtrees
5. T = T + {ei} // merge these two corresponding subtrees
- Data structure: Union-find => represents disjoint sets (很多集合但裡面元素不重複) => each set has a representative => merge sets with respective representatives
disjoint set: 例如一所學校的班級
Prim’s algorithm
- 任選一個「頂點」開始往外擴散
- 與 Dijkstra’s Algorithm 類似
- Data structure: min heap
- Cut property
Prim(G, c)
// S: the set of explored nodes
// for each u ∈ S, we store a shortest path distance c(u) (tree edge)
1. init S = {s}, d(s) = 0
2. While S != V do // 找最靠近的新的點
3. select a node v ∉ S with at lease one edge from S for which d'(v) = min(c)
4. add v to S and define d(v) = d'(v)
Sollin’s algorithm
- 一次選多個邊
- 不重要
Reverse-delete algorithm
- 原理:
n
個 node 只需要n-1
條 edge - 一開始看整個 Graph,從 cost 大到小刪除 edge
- 基本沒人用
Spanning Tree Protocol
- 用於 OSI 第二層,應用是防止冗餘鏈路產生,避免佔用 Router 資源
Biconnected Components
<空>
The Shortest Path Problem
Given:
+ Directed graph G = (V, E)
=> Length le = length of edge e = (u, v) ∈ E
+ Source s
Goal:
Shortest path Pv from s to each other node v ∈ V - {s}
=> 到所有 nodes 的最短路徑總和
Dijkstra’s Algorithm
Edsger W. Dijkstra 1959
- Edge 沒有負值時使用
- 用 min heap 儲存每隔點的最小距離
d
Dijkstra(G, l)
// S: the set of explored nodes
// for each u ∈ S, we store a shortest path distance d(u) from s (start node) to u
1. init S = {s}, d(s) = 0
2. While S != V do // 找最靠近的新的點
3. select a node v ∉ S with at lease one edge from S for which d'(v) = min(d(u)+le)
4. add v to S and define d(v) = d'(v)
Bellman and Ford Algorithm
- Edge 可以為負值
- 可以一次走 k 個邊 (k>0)
- Dynamic programming
Condition: no negative cycle
Negative Cycles
- 假如存在 negative cycle ,the shortest path 無法吐出最短路徑
- Bellman and Ford 可以找出 negative cycle
All pairs
<空>
TSP (Travel Salesman Problem)
- 可以用 Dynamic programming 但卻不是最佳解的原因是: TSP 不是 nature order
AOV (Activity on Vertex)
- A directed graph
- the vertices represent tasks or activities
- the edges represent precedence relations between tasks
Topological order
- AOV 任兩個 vertex 的先後順序會 maintain 在 output sequence 上
- order 不是唯一
應用
- 大學修課系統 => 例如:要先修 programming I 才能修 Data Structures
AOE (Activity on Edge)
- 計算每一個 node 最早的到達時間
Critical Path Analysis
- 找出起點和終點之間,不允許 delay 的 nodes
應用
- 控制 project 時間
Flow Network
有「物質」在節點中流動
A flow network G = (V, E)
is a directed graph
- 每一條 edge 可以看做「水管」,有 capacity
- 有 source node (generates the flow), sink node (absorbs the flow), internal node(剩下的 node)
Examples
- Water pipes and water
- Transportation network and traffic (運輸管理用)
- Computer network and packets
- Circuit network (wires) and current
應用
資源分配問題
The Maximum Flow Problem
properties:
- Capacity conditions:
0 ≤ f(e) ≤ ce
=> 每條水管的水量必須大於0且小於水管 capacity - Conservation conditions: 除了 source and sink ,其他 internal node 的送進水流=送出水流
Q: Can we find the upper bound of the flow? (確認水流上限)
A: Yes!
Find cuts of min capacity = find max flow
Ford-Fulkersion: Residual Graph
Procedure
1. Start with f(e) = 0 for all edge e ∈ E and construct the residual graph
2. Find an source-sink path P in the residual graph
3. Augment flow along path P and update the residual graph
4. Repeat steps 2-3 until you get stuck
Bipartite Matching
A bipartite graph is a graph whose nodes can be patitioned into sets X and Y in such a way that every edge has one one end in X and the other end in Y (Edge 必定是在兩個 set 之間,同一個 set 間不會有 edge)
用 maximal flow 解 Bipartite matching
References
- 江蕙如教授。Lec06 演算法 第三週課程 (2/2) (2015)
- 江蕙如教授。Lec07 演算法 第四週課程 (1/2) (2015)
- 江蕙如教授。Lec20 演算法 第十五週課程 (1/2) (2015)
- 彭文志教授。Lec18 資料結構 第十二週課程 (2013)