Algorithm Design and Analysis - Overview

今天上了卜东波老师的第一次算法课,对算法的概况有了一个大体的了解,在这里对今天的内容做一些总结。【Three solutions:Induction, Improvemrnt and Enumeration】

一、Mind Map

MindMap

二、三种基本的算法策略

基本上,所有的算法题都可以用三种策略解决。

  • Divide-and-conquer: Let's start from the "smallest" problem first, and investigate whether a large problem can reduce to smaller subproblems.

分治法,将问题分解为小的问题,从最小问题出发解决问题。

See: Algorithm_Design_and Analysis-Divide-and-Conquer

  • "Intelligent" Enumeration: Consider an optimization problem. If the solution can be constructed step by step, we might enumerate all possible complete solutions by constructing a partial solution tree. Due to the huge size of the search tree, some techniques should be employed to prune it.

动态规划就是一种枚举所有可能性的算法,see:Algorithm Design and Analysis - Dynamic programming

枚举法,枚举所有的解。可使用trick(如设定下界)来加快。

  • Improvement: Let's start from an initial complete solution, and try to improve it step by step.

若无法将问题分解,就从质量不太好的完整解出发,逐步优化

三、例题

EX1.Calculating the greatest common divisor (gcd) 求最小公约数

The greatest common divisor of two integers a and b, when at least one of them is not zero, is the largest positive integer that divides the numbers without a remainder.

INPUT: two n-bits numbers a, and b (a >= b)
OUTPUT: gcd(a; b)

已知gcd(1; 0) = 1;如何求解gcd(1949; 101)?

可将复杂问题分解成小的问题(辗转相除法):gcd(1949; 101) = gcd(101; 1949 mod 101) = gcd(101; 30)= gcd(30; 101 mod 30) = gcd(30; 11)= gcd(11; 30 mod 11) = gcd(11; 8)= gcd(8; 3) = gcd(3; 2) = gcd(2; 1) = gcd(1; 0) = 1

1
2
3
4
5
function Euclid(a; b)
if b = 0 then
return a;
end if
return Euclid(b; a mod b);

EX2.traveling salesman problem (TSP) 周游城市的最短距离

INPUT: n cities V = {1; 2;...; n}, and a distance matrix D, where dij (1 <= i; j <= n) denotes the distance between city i and j.
OUTPUT: the shortest tour that visits each city exactly once and returns to the origin city.
map1

Trial 1: Divide and conquer

我们可通过计算 M(S,e) 来计算最小距离。S为要拜访的节点的集合,e为结束节点。

如,最短的距离我们可通过以下式子来计算:

min{ d2;1 +M({3; 4}; 2); d3;1 +M({2; 4}; 3); d4;1 +M({2; 3}; 4)}

1
2
3
4
5
6
7
8
9
10
//算是一种动态规划算法
function TSP(V;D)
return min{ M(V - {e}; e) + de1}; //e属于V且e不为1

function M(S; e)
if S = {v} then
M(S; e) = d1v + dve;
return M(S; e);
end if
return min{ M(S - {i}; i) + dei}; //i属于S,i不为e

Trial 2: Improvement strategy

从一个完整的解中开始,一步一步去完善它。Newton法,随机梯度下降和MC都用到了这个策略。但这种方法不一定能够获得最优解。可得到最优解的情况:线性规划,二次规划,网络流等。

1
2
3
4
5
6
7
8
9
10
11
function GenericImprovement(G;D)  //通用,逐步迭代完整解
Let s be an initial tour; //初始解的选择很重要
while TRUE do
Select a new tour s′ from the neighbourhood of s; //扰动越小越好
if s′ is shorter than s then
s = s′;
end if
if stopping(s) then //s满足退出条件
return s;
end if
end while

Trial 3: Intelligent" enumeration strategy

完整的解可以表示为n条边的排列。将边缘按照一定顺序排列,一个完整的解可以被表示为:X = [x1, x2,..., xm]。 例如:a -> b -> c -> d -> e -> a 可以被表示为 X = [1, 0, 0, 1, 1, 0, 0, 1, 0, 1]。 所有的环游都可以写成这种形式,我们就能枚举出所有的解。

map2
tree

子节点:表示一个完整的解

内部解:表示一个部分解,是一个已知item的子集

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
function GenericBacktrack(P0)  //枚举所有的解
Let A = fP0g. //Start with the original problem P0. Here, A
denotes the active subproblems that are unexplored.
best_so_far = 1; //当前知道的最短路程
while A ̸= NULL do //当还有节点要扩展时
Choose and remove a subproblem P from A;
Expand P into smaller subproblems P1, P2,..., Pk;
for i = 1 to k do
if Pi corresponds to a complete solution then
Update best_so_far if the corresponding objective function value is better; //对应一个完整解
else
Insert Pi into A; //对应一个部分解
end if
end for
end while
return best_so_far;

然而,这种方式计算量巨大,需要花费指数倍的时间,且我们无需在内存中存储整个树。因此,我们可以从中剔除一些低质量的或者不符合条件的部分解(贪心算法)。我们可以使用heuristic functions 和下界(lower bound functions)来评估部分解的质量。【这是EM算法的核心】

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
function IntelligentBacktrack(P0)
Let A = fP0g. // Start with the original problem P0. Here A denotes the active subproblems that are unexplored.
best_so_far = infinite;
while A ̸= NULL do
Choose a subproblem P in A with lower bound less than best_so_far;
Remove P from A;
Expand P into smaller subproblems P1, P2, ..., Pk,
for i = 1 to k do
if Pi corresponds to a complete solution then
Update best so far;
else
if lowerbound(Pi) <= best so far then Insert Pi into A;
end if
end if
end for
end while
return best_so_far;

四、小结

  1. 先观察问题的结构,解的形式,再设计算法
  2. 能分解成子问题是一个非常有效的信号
  3. 在优化问题中,下界信息非常重要