What’s a good Algorithm? -> run quickly, small memory
Content:
+ Analyze an “efficient” algorithm
+ Notation O, 𝛺, 𝛩 (Theta)
+ 常見演算法
+ Graph
+ Divide and conquer
+ Dynamic programming
+ References
Analyze an “efficient” algorithm
- Good scale-ability, Platform-independent (與執行的機器無關), Instance-Independent (與 input 無關)
- We focus on the worse-case
- E.g. better than brute-force
- Or if polynomial running time, it is efficient
Method: check primitive computational step 確認基本運算步驟
Polynomial running time: with constants c>0, d>0 and input N, the running time is bounded by cN^d
Notation O, 𝛺, 𝛩 (Theta)
找最小的 upper bound 和最大的 lower bound 才有意義
Big-O (upper bound)
E.g. T(n) = 3n+2
, F(n) = n
If T(n) = 3n+2
= O(n)
=> 3n+2 < c*n
當c = 4
時=> 存在
Properties of Notation O, 𝛺, 𝛩 (Theta)
- Transitivity
- Rule of sums
使用時機:在算一個演算法需要花的時間時,不用計算所有步驟的加總,計算最花時間的步驟即可
- Rule of products
使用時機:For loop, while loop
- Transpose symmetry
- Reflexivity
- Symmetry
常見演算法
1. Gale-Shapley
Gale-Shapley => best matching
1. initialize each person to be free
2. while(some man m is free and hasn’t proposed to every woman) do
3. w = highest ranked woman in m’ s list to whom m has not proposed yet
4. if(w is free) then
5. m and w become engaged
6. else if(w prefers m to her fiancé m’) then
7. m and w become engaged
8. m’ becomes free
9. return the set S of engaged pairs
性質:主動有利、每次執行產生相同結果
應用:大學入學分配
Graph
請到這篇
Divide and conquer
- Divide: break the input into several parts of the same type
- Conquer: solve the problem in each part recursively
- Combine: combine the solutions to sub-problems into an overall solution
Binary search
- Check the middle element
- Search the sub-array recursively
Merge sort
- John von Neumann
- 1945
- Merge sort i s often the best choice for sorting a linked list
- Time: O(nlogn)
Mergesort(A, p, r)
// A[p..r]: initally unsorted
1. if (p < r) then
2. q = (p+r)/2 // 找 floor
3. Mergesort(A, p, q)
4. Mergesort(A, q+1, r)
5. Merge(A, p, q, r)Running time:
Divide: lines 1-2 => O(1)
Conquer: lines 3-4 => ?
Combine: lines 5 => O(n)
Counting Inversions
- 應用: Music site consults database to find people with similar tastes
Divide: separate the list into two pieces
Conquer: count inversions in each half recursively
Combine: count inversions where ai, aj are in different halves, and return sum of three quantities
- 讓「比較」加速的方式:在 conquer 的過程中順便排序
Sort-and-Count(L, p, q)
// L[p..q]: initially unsorted
1. if(p=q) then return 0
2. else
3. m = (p+q)/2 // floor
4. rp = Sort-and-Count(L, p, m)
5. rq = Sort-and-Count(L, m+1, q)
6. rm = Merge-and-Count(L, p, m, q)
7. return r = rp + rq + rm
Closest Pair of Points
- Given: A set of
n
points on a plane,pi
is located at(xi, yi)
- Find: 兩兩間最近的點 => Euclidean distance (直線距離)
Divide: draw vertical line L so that half points on each side.
Conquer: find the closest pair d in each side recursively.
Combine: find the closest pair with one point in each side; return best of 3 solutions.
- Find the closest pair with one in each side
L = {(x, y): x = x* = 左側最大的 x}
d = the smaller one of these two pairs // 左右側選最小距離的點
- Observation: only need to consider points within d of line L (只需要考慮離 L 左右一個 d 距離的所有點)
從上到下只要找
15
個點就會有結果
Dynamic programming
Dynamic “programming” came from the term “mathematic programming” => 最佳化的意思
- Principles of dynamic programming =>
Memoization
oriteration
over subproblems - 以數學歸納法的方式思考
- In a nutschell, DP is recursion without repetition
Key for DP
DP can be used if the problem satisfies the following properties
- A polynomial number of subproblems not exponential (?)
- 最大的解是由比較小的解組成
- (Important!) 有一個自然的計算順序
Standard operation procedure for DP
- Start with devide-and-conquer
- Show that the number of different instances of your recurrence is bounded by a polynomial
- Specify an order of evaluation for the recurrence so you always have what you need
Weighted Interval Scheduling
Give: A set of n intervals with start/finish times and weights(values)
Find: A subset S of mutually compatible intervals with max total values
可以與 The Interval Scheduling Problem (Greedy algorithm) 進行比較
First attempt (第一感): Induction based on time?
t'(< t) 所貢獻的所有 values (Induction step two) + t 所貢獻的 value
=> t’
要取多少不好控制 (Bad idea)
Second attempt: Induction based on interval index
1. First of all, sort intervals in ascending order of finish times
2. 遞迴 OPT(j) = 找第 1..j 個 interval 的最大值
=> OPT(j) = max{ interval j + j 之前的 OPT, OPT(j-1)}計算:
//Preprocessing
1. Sort intervals by finish times f1 <= f2 <=...<= fn
2. 找出 p(1)...p(n) // 找出與自身 interval 沒有 overlap 的個數//Opt(j)
1. if (j==0) return 0
2. else return max{ interval j + j 之前的 OPT, OPT(j-1)}
=> 加速:記起來已算好的解 (memoization)
//Opt(j)
1. if (j==0) return 0
2. else if (M[j] is not empty) return M[j] // M 存放已算過的值
3. else return M[j] = max{ interval j + j 之前的 OPT, OPT(j-1)}
- Iteration: Bottom-up
1. M[0] = 0 // 小抄 table
2. for j = 1, 2, ..., n do
3. M[j] = max{vj + M[p(j)], M[j-1]} // v = value
Fibonacci sequence
With memoization:
1. init a array f => f[0...n] = -1 // -1: unfilled
2. set f[0] = 0, f[1] = 1fib(n, f:array)
1. if f[n] == -1 then
2. f[n] = fib(n-1, f) + fib(n-2, f)
3. return f[n] //if[n] already exists, directly return
Summary
Memoization
: Top-down, recursiveIteration
: Bottom up (the smallest sub-problom to the largest one), iterative