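Alpha-beta pruning

Alpha-beta pruning is a search algorithm that reduces the number of nodes evaluated in the search tree of the minimax algorithm. It is an adversarial search algorithm used mainly for machine play of two-player games (such as tic-tac-toe, chess, and Go). As soon as the algorithm finds that a line of play is worse than one examined earlier, it stops evaluating that line's continuations. It reaches the same conclusion as the minimax algorithm, but prunes away branches that cannot influence the final decision^[1].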

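History

Allen Newell and Herbert A. Simon, who in 1958 used what John McCarthy calls an "approximation" of alpha-beta^[2], wrote that the algorithm "appears to have been reinvented a number of times"^[3]. Arthur Samuel had an early version, and Richards, Hart, Levin and/or Edwards discovered alpha-beta independently in the United States^[4]. McCarthy proposed similar ideas at the Dartmouth workshop in 1956 and suggested them in 1961 to a group of his students, including Alan Kotok at MIT^[5]. Alexander Brudno independently discovered the alpha-beta algorithm and published his results in 1963^[6]. Donald Knuth and Ronald W. Moore refined the algorithm in 1975^[7]^[8], and Judea Pearl proved its optimality in 1982^[9].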

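Improvements over naive minimax

The benefit of alpha-beta pruning is that branches of the search tree can be eliminated, so search time is spent on the "more promising" subtrees and a deeper search can be completed in the same time. Like minimax, it belongs to the branch and bound class of algorithms. If nodes are searched in an optimal or near-optimal order (the best choice first at each node), the search can reach roughly twice the depth of plain minimax in the same time.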

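With an (average or constant) branching factor of b and a search depth of d plies, the maximum number of leaf nodes evaluated (when the move ordering is the worst possible) is O(b×b×...×b) = O(b^d), the same as a simple minimax search. If the move ordering is optimal (the best move is always searched first), the number of leaf nodes evaluated is about O(b×1×b×1×...×b) for odd depth and O(b×1×b×1×...×1) for even depth, that is, O(b^(d/2)) = O(√(b^d)). In the even-depth case the effective branching factor is reduced to its square root, or equivalently, the search can go twice as deep with the same amount of computation^[10]. The pattern b×1×b×1×... arises because all of the first player's moves must be examined to find the best one, but for each of them only the second player's best reply is needed to refute it; alpha-beta ensures that no other second-player move needs to be considered. When nodes are generated in random order, however, the average number of nodes evaluated is roughly O(b^(3d/4))^[2].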
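To make the gap between the two bounds concrete, the following Python sketch tabulates leaf counts for a uniform tree. It assumes every node has exactly b children and uses the exact best-case count b^⌈d/2⌉ + b^⌊d/2⌋ − 1 from Knuth and Moore's analysis of perfectly ordered trees; the function names are hypothetical and chosen only for illustration.

```python
def worst_case_leaves(b: int, d: int) -> int:
    """Leaves evaluated with the worst ordering: the whole tree, b^d."""
    return b ** d


def best_case_leaves(b: int, d: int) -> int:
    """Leaves evaluated with perfect ordering on a uniform tree
    (assumed formula: b^ceil(d/2) + b^floor(d/2) - 1)."""
    return b ** ((d + 1) // 2) + b ** (d // 2) - 1


if __name__ == "__main__":
    b = 10  # illustrative branching factor, not taken from the article
    for d in (2, 4, 6, 8):
        print(f"d={d}: worst={worst_case_leaves(b, d)}, best={best_case_leaves(b, d)}")
```

For even d the best-case count is just under 2·b^(d/2), which is where the "square root of the branching factor" reading comes from.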

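During an alpha-beta search, subtrees are normally dominated temporarily by either a first-player advantage or a second-player advantage. If the move ordering is wrong, this advantage can switch sides many times, and each switch costs efficiency. The number of positions grows exponentially with depth, so it is worth spending considerable effort on ordering the early moves. An improved ordering at any depth reduces the total number of positions searched exponentially, but ordering all positions at depths near the root is relatively cheap because there are so few of them. In practice, move ordering is often determined by the results of earlier, smaller searches, for example through iterative deepening.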
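The effect of move ordering can be observed on a small random tree. The sketch below is an illustration under stated assumptions, not a practical ordering scheme: the tree is a uniform nested list with distinct random leaf values, and `reorder` is a hypothetical helper that sorts children using the true minimax value of each child, which a real program would not know in advance and would instead approximate (for example with a shallower search).

```python
import math
import random


def build_tree(leaves, b, d):
    """Consume values from `leaves` to build a uniform tree of nested lists."""
    if d == 0:
        return leaves.pop()
    return [build_tree(leaves, b, d - 1) for _ in range(b)]


def minimax(node, maximizing):
    """Plain minimax value, used here only as an ordering oracle."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def reorder(node, maximizing, best_first):
    """Copy the tree with every node's children sorted by true value."""
    if not isinstance(node, list):
        return node
    kids = [reorder(child, not maximizing, best_first) for child in node]
    # Best for the maximizer is the highest value, for the minimizer the lowest.
    descending = (maximizing == best_first)
    return sorted(kids, key=lambda c: minimax(c, not maximizing), reverse=descending)


def leaves_visited(node, alpha, beta, maximizing):
    """Run alpha-beta and return (value, number of leaves evaluated)."""
    if not isinstance(node, list):
        return node, 1
    v = -math.inf if maximizing else math.inf
    count = 0
    for child in node:
        child_value, child_count = leaves_visited(child, alpha, beta, not maximizing)
        count += child_count
        if maximizing:
            v = max(v, child_value)
            alpha = max(alpha, v)
        else:
            v = min(v, child_value)
            beta = min(beta, v)
        if beta <= alpha:
            break  # remaining children cannot change the result
    return v, count


random.seed(0)
B, D = 3, 4  # illustrative branching factor and depth
tree = build_tree(random.sample(range(1000), B ** D), B, D)
best = reorder(tree, True, best_first=True)
worst = reorder(tree, True, best_first=False)

if __name__ == "__main__":
    for name, t in (("as generated", tree), ("best first", best), ("worst first", worst)):
        value, count = leaves_visited(t, -math.inf, math.inf, True)
        print(f"{name}: value={value}, leaves evaluated={count} of {B ** D}")
```

All three orderings return the same root value; only the number of leaves evaluated changes.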

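The algorithm maintains two values, alpha and beta, which represent the minimum score that the maximizing player is assured of and the maximum score that the minimizing player is assured of, respectively. Initially alpha is negative infinity and beta is positive infinity, so both players start with their worst possible score. After a particular branch of a node has been explored, it can happen that the maximum score the minimizing player is assured of is no greater than the minimum score the maximizing player is assured of (beta <= alpha). In that case the parent should not choose this node, because doing so would make the parent's score worse, so the node's remaining branches need not be explored.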

Pseudocode

The pseudocode for the fail-soft variant of alpha-beta pruning is as follows^[10]:

 function alphabeta(node, depth, α, β, maximizingPlayer)
     if depth = 0 or node is a terminal node
         return the heuristic value of node
     if maximizingPlayer
         v := -∞
         for each child of node
             v := max(v, alphabeta(child, depth - 1, α, β, FALSE))
             α := max(α, v)
             if β ≤ α
                 break // β cutoff
         return v
     else
         v := +∞
         for each child of node
             v := min(v, alphabeta(child, depth - 1, α, β, TRUE))
             β := min(β, v)
             if β ≤ α
                 break // α cutoff
         return v

 (* Initial call *)
 alphabeta(origin, depth, -∞, +∞, TRUE) // origin = root node
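The pseudocode can be turned into a runnable sketch in Python along the following lines. The `children` and `evaluate` callbacks are assumptions introduced here so that the function stays independent of any particular game; the nested-list tree and the neutral score of 0 for interior nodes at the depth limit are likewise illustrative choices.

```python
import math


def alphabeta(node, depth, alpha, beta, maximizing_player, children, evaluate):
    """Fail-soft alpha-beta.

    children(node) -> list of successor nodes (empty for a terminal node)
    evaluate(node) -> heuristic value of the node
    Both callbacks are hypothetical hooks supplied by the caller.
    """
    successors = children(node)
    if depth == 0 or not successors:
        return evaluate(node)
    if maximizing_player:
        v = -math.inf
        for child in successors:
            v = max(v, alphabeta(child, depth - 1, alpha, beta, False, children, evaluate))
            alpha = max(alpha, v)
            if beta <= alpha:
                break  # beta cutoff
        return v
    v = math.inf
    for child in successors:
        v = min(v, alphabeta(child, depth - 1, alpha, beta, True, children, evaluate))
        beta = min(beta, v)
        if beta <= alpha:
            break  # alpha cutoff
    return v


# Toy game tree as nested lists: inner lists are positions, numbers are leaf scores.
def toy_children(node):
    return node if isinstance(node, list) else []


def toy_evaluate(node):
    # Assumption: an interior node reached at the depth limit scores a neutral 0.
    return node if not isinstance(node, list) else 0


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
result = alphabeta(tree, 2, -math.inf, math.inf, True, toy_children, toy_evaluate)

if __name__ == "__main__":
    print(result)
```

On this tree the second subtree is abandoned after its first leaf, since 2 is already below the 3 guaranteed by the first subtree.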

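With fail-soft alpha-beta, the alphabeta function may return a value v that lies outside the bounds α and β given in its call arguments (v < α or v > β). In contrast, fail-hard alpha-beta restricts its return value to the inclusive range between α and β.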
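The difference only shows up when the search is called with a window narrower than (-∞, +∞). The sketch below contrasts the two variants on a nested-list tree; `fail_soft` and `fail_hard` are hypothetical names, and clamping the leaf values in `fail_hard` is one possible way (an assumption here) to keep every return value inside [α, β].

```python
import math


def fail_soft(node, alpha, beta, maximizing):
    """Fail-soft: the returned value may lie outside [alpha, beta]."""
    if not isinstance(node, list):
        return node
    v = -math.inf if maximizing else math.inf
    for child in node:
        score = fail_soft(child, alpha, beta, not maximizing)
        if maximizing:
            v = max(v, score)
            alpha = max(alpha, v)
        else:
            v = min(v, score)
            beta = min(beta, v)
        if beta <= alpha:
            break
    return v


def fail_hard(node, alpha, beta, maximizing):
    """Fail-hard: the returned value is always within [alpha, beta]."""
    if not isinstance(node, list):
        return max(alpha, min(beta, node))  # clamp the leaf score into the window
    for child in node:
        score = fail_hard(child, alpha, beta, not maximizing)
        if maximizing:
            if score >= beta:
                return beta  # beta cutoff, report the bound itself
            alpha = max(alpha, score)
        else:
            if score <= alpha:
                return alpha  # alpha cutoff, report the bound itself
            beta = min(beta, score)
    return alpha if maximizing else beta


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # true minimax value: 3

if __name__ == "__main__":
    for window in ((-math.inf, math.inf), (6, 10), (0, 2)):
        soft = fail_soft(tree, window[0], window[1], True)
        hard = fail_hard(tree, window[0], window[1], True)
        print(f"window={window}: fail-soft={soft}, fail-hard={hard}")
```

When the true value falls outside the window, fail-soft returns a bound on it that can be tighter than the window edge, while fail-hard only reports which edge was hit.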

References

George T. Heineman, Gary Pollice, and Stanley Selkow. Chapter 7: Path Finding in AI. Algorithms in a Nutshell. O'Reilly Media. 2008: 217–223. ISBN 978-0-596-51624-6.
Judea Pearl. Heuristics. Addison-Wesley. 1984.
John P. Fishburn. Appendix A: Some Optimizations of α-β Search. Analysis of Speedup in Distributed Algorithms (revision of 1981 PhD thesis). UMI Research Press. 1984: 107–111. ISBN 0-8357-1527-2.

[1] Russell, Stuart J.; Norvig, Peter. Artificial Intelligence: A Modern Approach, 3rd ed. Upper Saddle River, New Jersey: Pearson Education, Inc. 2010: 167. ISBN 0-13-604259-7.
[2] McCarthy, John. Human Level AI Is Harder Than It Seemed in 1955. 27 November 2006. Retrieved 2006-12-20.
[3] Newell, Allen; Simon, Herbert A. Computer Science as Empirical Inquiry: Symbols and Search. Communications of the ACM. March 1976, 19 (3). doi:10.1145/360018.360022. Retrieved 2006-12-21.
[4] Edwards, D. J.; Hart, T. P. The Alpha–beta Heuristic (AIM-030). Massachusetts Institute of Technology. 4 December 1961 to 28 October 1963. Retrieved 2006-12-21.
[5] Kotok, Alan. MIT Artificial Intelligence Memo 41. 2004-12-03. Retrieved 2006-07-01.
[6] Marsland, T. A. Computer Chess Methods. In: S. Shapiro (editor), Encyclopedia of Artificial Intelligence. J. Wiley & Sons. May 1987: 159–171. Retrieved 2006-12-21.
[7] Knuth, D. E.; Moore, R. W. An Analysis of Alpha–Beta Pruning. Artificial Intelligence. 1975, 6 (4): 293–326. doi:10.1016/0004-3702(75)90019-3. Reprinted as Chapter 9 in Knuth, Donald E. Selected Papers on Analysis of Algorithms. Stanford, California: Center for the Study of Language and Information, CSLI Lecture Notes, no. 102. 2000. ISBN 1-57586-212-3. OCLC 222512366.
[8] Abramson, Bruce. Control Strategies for Two-Player Games. ACM Computing Surveys. June 1989, 21 (2): 137. doi:10.1145/66443.66444. Retrieved 2008-08-20.
[9] Pearl, Judea. The Solution for the Branching Factor of the Alpha–beta Pruning Algorithm and its Optimality. Communications of the ACM. August 1982, 25 (8): 559–564. doi:10.1145/358589.358616.
[10] Russell, Stuart J.; Norvig, Peter. Artificial Intelligence: A Modern Approach, 2nd ed. Upper Saddle River, New Jersey: Prentice Hall. 2003. ISBN 0-13-790395-2.
