演算法筆記

Algorithm nodes

4 min readOct 14, 2020

不定期更新，若有錯誤請留言告知，感恩

What’s a good Algorithm? -> run quickly, small memory

Content:
+ Analyze an “efficient” algorithm
+ Notation O, 𝛺, 𝛩 (Theta)
+ 常見演算法
+ Graph
+ Divide and conquer
+ Dynamic programming
+ References

Analyze an “efficient” algorithm

Good scale-ability, Platform-independent (與執行的機器無關), Instance-Independent (與 input 無關)
We focus on the worse-case
E.g. better than brute-force
Or if polynomial running time, it is efficient

Method: check primitive computational step 確認基本運算步驟

Polynomial running time: with constants c>0, d>0 and input N, the running time is bounded by cN^d

Notation O, 𝛺, 𝛩 (Theta)

找最小的 upper bound 和最大的 lower bound 才有意義

Big-O (upper bound)

E.g. T(n) = 3n+2, F(n) = n

If T(n) = 3n+2 = O(n)

=> 3n+2 < c*n

當c = 4 時=> 存在

Properties of Notation O, 𝛺, 𝛩 (Theta)

Transitivity
Rule of sums

使用時機：在算一個演算法需要花的時間時，不用計算所有步驟的加總，計算最花時間的步驟即可

Rule of products

使用時機：For loop, while loop

Transpose symmetry
Reflexivity
Symmetry

常見演算法

1. Gale-Shapley

Gale-Shapley => best matching

1. initialize each person to be free
2. while(some man m is free and hasn’t proposed to every woman) do
3. w = highest ranked woman in m’ s list to whom m has not proposed yet
4. if(w is free) then
5. m and w become engaged
6. else if(w prefers m to her fiancé m’) then
7. m and w become engaged
8. m’ becomes free
9. return the set S of engaged pairs

性質：主動有利、每次執行產生相同結果

應用：大學入學分配

Graph

請到這篇

Graph 的演算法及資料結構

Graph’s algorithm and data structure

chengr4.medium.com

Divide and conquer

Divide: break the input into several parts of the same type
Conquer: solve the problem in each part recursively
Combine: combine the solutions to sub-problems into an overall solution

Binary search

Check the middle element
Search the sub-array recursively

Merge sort

John von Neumann
1945
Merge sort i s often the best choice for sorting a linked list
Time: O(nlogn)

Mergesort(A, p, r)
// A[p..r]: initally unsorted
1. if (p < r) then
2. q = (p+r)/2 // 找 floor
3. Mergesort(A, p, q)
4. Mergesort(A, q+1, r)
5. Merge(A, p, q, r)Running time:
Divide: lines 1-2 => O(1)
Conquer: lines 3-4 => ?
Combine: lines 5 => O(n)

Counting Inversions

應用： Music site consults database to find people with similar tastes

Divide: separate the list into two pieces
Conquer: count inversions in each half recursively
Combine: count inversions where ai, aj are in different halves, and return sum of three quantities

讓「比較」加速的方式：在 conquer 的過程中順便排序

Sort-and-Count(L, p, q)
// L[p..q]: initially unsorted
1. if(p=q) then return 0
2. else
3. m = (p+q)/2 // floor
4. rp = Sort-and-Count(L, p, m)
5. rq = Sort-and-Count(L, m+1, q)
6. rm = Merge-and-Count(L, p, m, q)
7. return r = rp + rq + rm

Closest Pair of Points

Given: A set of n points on a plane, pi is located at (xi, yi)
Find: 兩兩間最近的點 => Euclidean distance (直線距離)

Divide: draw vertical line L so that half points on each side.
Conquer: find the closest pair d in each side recursively.
Combine: find the closest pair with one point in each side; return best of 3 solutions.

Find the closest pair with one in each side

L = {(x, y): x = x* = 左側最大的 x}
d = the smaller one of these two pairs // 左右側選最小距離的點

Observation: only need to consider points within d of line L (只需要考慮離 L 左右一個 d 距離的所有點)

從上到下只要找15個點就會有結果

Dynamic programming

Dynamic “programming” came from the term “mathematic programming” => 最佳化的意思

Principles of dynamic programming => Memoization or iteration over subproblems
以數學歸納法的方式思考
In a nutschell, DP is recursion without repetition

Key for DP

DP can be used if the problem satisfies the following properties

A polynomial number of subproblems not exponential (?)
最大的解是由比較小的解組成
(Important!) 有一個自然的計算順序

Standard operation procedure for DP

Start with devide-and-conquer
Show that the number of different instances of your recurrence is bounded by a polynomial
Specify an order of evaluation for the recurrence so you always have what you need

Weighted Interval Scheduling

Give: A set of n intervals with start/finish times and weights(values)
Find: A subset S of mutually compatible intervals with max total values

可以與 The Interval Scheduling Problem (Greedy algorithm) 進行比較

First attempt (第一感): Induction based on time?
t'(< t) 所貢獻的所有 values (Induction step two) + t 所貢獻的 value

=> t’要取多少不好控制 (Bad idea)

Second attempt: Induction based on interval index
1. First of all, sort intervals in ascending order of finish times
2. 遞迴 OPT(j) = 找第 1..j 個 interval 的最大值
=> OPT(j) = max{ interval j + j 之前的 OPT, OPT(j-1)}計算:
//Preprocessing
1. Sort intervals by finish times f1 <= f2 <=...<= fn
2. 找出 p(1)...p(n) // 找出與自身 interval 沒有 overlap 的個數//Opt(j)
1. if (j==0) return 0
2. else return max{ interval j +  j 之前的 OPT, OPT(j-1)}

=> 加速：記起來已算好的解 (memoization)

//Opt(j)
1. if (j==0) return 0
2. else if (M[j] is not empty) return M[j] // M 存放已算過的值
3. else return M[j] = max{ interval j +  j 之前的 OPT, OPT(j-1)}

Iteration: Bottom-up

1. M[0] = 0 // 小抄 table
2. for j = 1, 2, ..., n do
3.   M[j] = max{vj + M[p(j)], M[j-1]} // v = value

Fibonacci sequence

With memoization:

1. init a array f => f[0...n] = -1 // -1: unfilled
2. set f[0] = 0, f[1] = 1fib(n, f:array)
1. if f[n] == -1 then
2. f[n] = fib(n-1, f) + fib(n-2, f)
3. return f[n] //if[n] already exists, directly return

Summary

Memoization: Top-down, recursive
Iteration: Bottom up (the smallest sub-problom to the largest one), iterative