subsequence - 演算法筆記

longest increasing subsequence

sequence

「數列」。一串數字。

例如1 3 5 2 9，由五個數字組成的數列。

「序列」。一串字元。字元其實是數字。

例如a c e b i，由五個字元組成的序列。

subsequence

「子序列」。sequence之中，依照由左到右的順序，挑幾個數字出來，就是subsequence。其中sub-的意思是「附屬、次要」。

例如1 3 5 2 9的其中一個子序列是3 9。

例如1 3 5 2 9的其中一個子序列是1 5 2 9。

空集合（不取）、原序列（全取），都是子序列！

increasing

「遞增」。數字不斷增加。

例如3 9是遞增子序列。

例如1 5 2 9不是遞增子序列。

遞增的數學性質很強，可以設計高速演算法。

longest increasing subsequence（LIS）

「最長遞增子序列」。所有子序列當中，遞增的、最長的子序列，可能有許多個。

例如1 3 5 2 9的LIS是1 3 5 9。

找出其中一個LIS，有快速的演算法，接下來將為各位介紹。

找出所有的LIS，以上述演算法作為基礎。

longest increasing subsequence: dynamic programming

演算法

錯誤的做法：紀錄每種前綴的LIS。online演算法，一次讀取一個數字，窮舉先前每種前綴的LIS，看看哪一種接得最長。

正確的做法：紀錄每項數字為結尾的LIS。online演算法，一次讀取一個數字，窮舉先前每項數字為結尾的LIS，看看哪一種接得最長。

時間複雜度O(N²)，N是序列長度。

sequence           : -7 10  9  2  3  8  8  1

sequence           :(-7)10  9  2  3  8  8  1
LIS ends with s[0] : -7

sequence           : -7(10) 9  2  3  8  8  1
LIS ends with s[0] : -7
LIS ends with s[1] : -7 10

sequence           : -7 10 (9) 2  3  8  8  1
LIS ends with s[0] : -7
LIS ends with s[1] : -7 10
LIS ends with s[2] : -7     9

sequence           : -7 10  9 (2) 3  8  8  1
LIS ends with s[0] : -7
LIS ends with s[1] : -7 10
LIS ends with s[2] : -7     9
LIS ends with s[3] : -7        2

sequence           : -7 10  9  2 (3) 8  8  1
LIS ends with s[0] : -7
LIS ends with s[1] : -7 10
LIS ends with s[2] : -7     9
LIS ends with s[3] : -7        2
LIS ends with s[4] : -7        2  3

sequence           : -7 10  9  2  3 (8) 8  1
LIS ends with s[0] : -7
LIS ends with s[1] : -7 10
LIS ends with s[2] : -7     9
LIS ends with s[3] : -7        2
LIS ends with s[4] : -7        2  3
LIS ends with s[5] : -7        2  3  8

sequence           : -7 10  9  2  3  8 (8) 1
LIS ends with s[0] : -7
LIS ends with s[1] : -7 10
LIS ends with s[2] : -7     9
LIS ends with s[3] : -7        2
LIS ends with s[4] : -7        2  3
LIS ends with s[5] : -7        2  3  8
LIS ends with s[6] : -7        2  3     8

sequence           : -7 10  9  2  3  8  8 (1)
LIS ends with s[0] : -7
LIS ends with s[1] : -7 10
LIS ends with s[2] : -7     9
LIS ends with s[3] : -7        2
LIS ends with s[4] : -7        2  3
LIS ends with s[5] : -7        2  3  8
LIS ends with s[6] : -7        2  3     8
LIS ends with s[7] : -7                    1

UVa 111 231 437 497 10131 10534

計算longest increasing subsequence的長度

這裡提供兩支程式碼，效果一樣！第一支程式碼是看s[i]能接在哪些數字後面。第二支程式碼是看s[i]後面能接上哪些數字。

找出一個longest increasing subsequence

額外建立一條陣列，記錄銜接關係。

找出所有的longest increasing subsequence

backtracking，窮舉銜接關係。LIS數量最多2ᴺ個。

找出字典順序最小的longest increasing subsequence

項次的字典順序：改為紀錄每項數字為開頭的LIS。

數字的字典順序：長度平手時，選擇數字較小者。

演算法

遞增、最長，兩件事分開處理。先處理遞增，再處理最長。

二元搜尋樹，節點是每項數字，擴充資訊是每項數字為結尾的LIS長度、以及其最大值。online演算法，一次讀取一個數字，搜尋次小的數字，累計LIS長度最大值。

時間複雜度O(NlogN)，N是序列長度。

演算法

當數字種類很少，紀錄每種數字當作結尾的LIS長度。

時間複雜度O(NR)，R是數字種類。

sequence :  1  1  2  3  5  4
s1 :
s2 :
s3 :
s4 :
s5 :

sequence : (1) 1  2  3  5  4
s1 : 1
s2 :
s3 :
s4 :
s5 :
/* 1 自成一個子序列, 無法接在別人後面 */

sequence :  1 (1) 2  3  5  4
s1 : 1
s2 :
s3 :
s4 :
s5 :
/* 1 自成一個子序列, 無法接在別人後面 */

sequence :  1  1 (2) 3  5  4
s1 : 1
s2 : 1 2
s3 :
s4 :
s5 :
/* 2 自成一個子序列, 亦可接在 s1 後面 */
/* s2 = max(1, s1 + 1) */

sequence :  1  1  2 (3) 5  4
s1 : 1
s2 : 1 2
s3 : 1 2 3
s4 :
s5 :
/* 3 自成一個子序列, 亦可接在 s1 s2 後面 */
/* s3 = max(1, s1+1, s2+1) */

sequence :  1  1  2  3 (5) 4
s1 : 1
s2 : 1 2
s3 : 1 2 3
s4 :
s5 : 1 2 3 5
/* 5 自成一個子序列, 亦可接在 s1 ... s4 後面 */
/* s5 = max(1, s1+1, ..., s4+1) */

sequence : 1  1  2  3  5 (4)
s1 : 1
s2 : 1 2
s3 : 1 2 3
s4 : 1 2 3 4
s5 : 1 2 3 5
/* 4 自成一個子序列, 亦可接在 s1 s2 s3 後面 */
/* s4 = max(1, s1+1, s2+1, s3+1) */

UVa 10051

演算法

當LIS長度很短，紀錄每種LIS長度的結尾數字（的位置）。結尾數字越小越好，以便接得更長。

時間複雜度O(NL)，L是LIS長度。

longest increasing subsequence: Robinson–Schensted–Knuth algorithm

演算法

紀錄每個數字在LIS當中的位置。位置越後面越好，以便接得更長。以binary search加速。

時間複雜度O(NlogL)，N是序列長度，L是LIS長度。

首先找到每個數字在LIS當中的最合適位置position[]。

sequence : -7 10  9  2  3  8  8  1
temp LIS :
position :

sequence :(-7)10  9  2  3  8  8  1
temp LIS : -7
position :  1		// -7 在 LIS 的第一個位置

sequence : -7(10) 9  2  3  8  8  1
temp LIS : -7 10
position :  1  2	// 10 在 LIS 的第二個位置，以此類推。

sequence : -7 10 (9) 2  3  8  8  1
temp LIS : -7  9
position :  1  2  2
/* 9 成為 LIS 的潛力比 10 大, 所以以 9 代替 10 */

sequence : -7 10  9 (2) 3  8  8  1
temp LIS : -7  2
position :  1  2  2  2
/* 2 成為 LIS 的潛力比 9 大, 所以以 2 代替 9 */

sequence : -7 10  9  2 (3) 8  8  1
temp LIS : -7  2  3
position :  1  2  2  2  3

sequence : -7 10  9  2  3 (8) 8  1
temp LIS : -7  2  3  8
position :  1  2  2  2  3  4

sequence : -7 10  9  2  3  8 (8) 1
temp LIS : -7  2  3  8
position :  1  2  2  2  3  4  4
/* 8 成為 LIS 的潛力比 8 大, 所以以 8 代替 8 */

sequence : -7 10  9  2  3  8  8 (1)
temp LIS : -7  1  3  8
position :  1  2  2  2  3  4  4  2
/* 1 成為 LIS 的潛力比 2 大, 所以以 1 代替 2 */

接下來從position[]之中找到LIS。從尾端往回找，先找到的就是正確的。因為LIS長度為4，所以先找到position[]之中的4。

sequence : -7 10  9  2  3  8 (8) 1
position :  1  2  2  2  3  4 (4) 2
LIS      :  -  -  -  8
/* search 4th, 8 is fourth LIS element */

sequence : -7 10  9  2 (3) 8  8  1
position :  1  2  2  2 (3) 4  4  2
LIS      :  -  -  3  8
/* search 3rd, 3 is third LIS element */

sequence : -7 10  9 (2) 3  8  8  1
position :  1  2  2 (2) 3  4  4  2
LIS      :  -  2  3  8
/* search 2nd, 2 is second LIS element */

sequence :(-7)10  9  2  3  8  8  1
position : (1) 2  2  2  3  4  4  2
LIS      : -7  2  3  8
/* search 1st, -7 is first LIS element */

最後得到其中一個LIS為-7 2 3 8，是最後出現的LIS。

順帶一提，如果想要得到最先出現的LIS，將原本序列前後顛倒，做longest decreasing subsequence。

UVa 481

非嚴格遞增

請先看看這個例子。

sequence : -7 10  9  2  3  8 (8) 1
temp LIS : -7  2  3  8
position :  1  2  2  2  3  4  4
/* 8 成為 LIS 的潛力比 8 大, 所以以 8 代替 8 */

這個時候就不能用8來代替8，而要用8去代替比8稍大的數字。如果找不到比8稍大的數字，則直接將數字加在後面。上面的例子修改過後，就變成這樣。

sequence : -7 10  9  2  3  8 (8) 1
temp LIS : -7  2  3  8  8
position :  1  2  2  2  3  4  5
/* 8 可以接在 8 後面 */

宏觀視角

這個演算法找出了所有的子序列，依照長度排列。

greedy method，隨時記錄優良的子序列，逐步更新子序列，越串越長，求出LIS。

sequence       5 2 9 4 8 3
===========================
read 5 |  1 |  5     		// 長度1　　　　　（以5結尾最長的）
---------------------------
read 2 |  2 |    2   		// 長度1　　　　　（以3結尾最長的）
---------------------------
read 9 |  3 |      9 		// 長度1
       |  4 |    2 9 		// 長度2，接第二串
       |  5 |  5   9 		// 長度2，接第一串（以9結尾最長的）
---------------------------
read 4 |  6 |        4		// 長度1
       |  7 |    2   4		// 長度2，接第二串（以4結尾最長的）
---------------------------
read 8 |  8 |          8	// 長度1
       |  9 |        4 8	// 長度2，接第六串
       | 10 |    2     8	// 長度2，接第二串
       | 11 |  5       8	// 長度2，接第一串
       | 12 |    2   4 8	// 長度3，接第七串（以8結尾最長的）
---------------------------
read 3 | 13 |            3	// 長度1
       | 14 |    2       3	// 長度2，接第二串（以3結尾最長的）

read 5 |  5
read 2 |  2
read 9 |  2 9
read 4 |  2 4	// 同時記錄了第4串和第7串
read 8 |  2 4 8
read 3 |  2 3 8	// 同時記錄了第12串和第14串

box stacking problem

有一堆尚未封裝的紙箱，把紙箱一個一個套起來存放。然而每個紙箱都有不同的長寬高。如果一個紙箱的三邊長，皆小於另一個紙箱的三邊長，這個紙箱才放得進去。請問最多能套幾層？

UVa 103