基於強化學習之單字母替換式密碼破密方法

單字母替換式密碼(Monoalphabetic Substitution Cipher)是許多密碼的基礎，例如，DES、AES。研究密碼的破密方法有助於設計出更安全的密碼演算法。有許多單字母替換式密碼的破密方法都依賴頻率分析法。例如，爬山演算法、模擬退火法、基因演算法、禁忌搜尋法、粒子群最佳化演算法等。其中，適應函數(fitness function)大多都仰賴單字母及雙字母組(bigram)的頻率分布。然而，由於無法保證明文的字母分布與所比對的字母分布統計會相同，亦無法保證會相似。尤其，當密文篇幅短的時候，字母分布有很大的可能不會和一般的字母分布統計相近。因此，依賴頻率分析法的破密法通常只能部分破密，後續可能需要很多人力來完成修正，或者破密結果也有可能不具有可讀性(readable)。由於上述原因，我們提出了一種基於強化學習的破密法。我們將密文中的字母應該替換成什麼字母的問題模擬成一個多臂吃角子老虎機的問題。透過嘗試錯誤法(trial and error)的方式來更新密鑰，並根據我們設計的樣式(pattern)來進行更有效率的探索及字典比對。我們基於字母正確率設計了報酬函數，並利用報酬函數來評估目前密鑰的優劣。實驗結果顯示，我們提出的破密法能成功破密且具一般性(generality)。單次測試的破密正確率可達100%。對於不同的隨機密鑰加密不同的隨機密文進行100次的測試，平均破密正確率約為96%，破密結果具可讀性。值得注意的是，根據我們的研究調查，本研究是第一個提出應用多臂吃角子老虎機理論於單字母替換式密碼破密法的研究。而且，我們的破密法並不依賴單字母或雙字母組的頻率分析。

關鍵字

單字母替換式密碼；強化學習；多臂吃角子老虎機

並列摘要

Monoalphabetic Substitution Cipher is the basis of many cryptography, for example, Data Encryption Standard(DES) and Advanced Encryption Standard(AES). Studying how to crack a cipher contributes to designing a more secure cryptography. Many cryptanalysis of monoalphabetic substitution cipher relies on frequency analysis. For instance, Hill Climbing Algorithm, Simulated Annealing, Genetic Algorithm, Tabu Search and Particle Swarm Optimization. The differences between those methods are the way they generate a new key and the way they update the best key. It is worth noting that they all used fitness function to evaluate how good the current key is, which was designed according to unigram and bigram frequency analysis. However, the distribution of the alphabets in plaintext is unknown and we could not guarantee that it will be similar to the distribution of the statistics. Especially, when the cipher is very short, the distribution of alphabets in the cipher might be extremely different from the statistics. Therefore, the methods which rely on frequency analysis usually crack the cipher partially. Sometimes human effort is needed to complete the cryptanalysis. For the above reasons, we proposed a method to crack monoalphabetic substitution ciphers based on reinforcement learning. We model the cryptanalysis as a multi-armed bandit problem. At the stage of exploit and explore, we generate the new key according to the pattern. The pattern was designed on the basis of the length of a word and the repeated alphabets in a word. Through trial and error and the update of action value, the key becomes better and better. To evaluate how good the key is, a reward function was designed. The reward means the correctness of the alphabet in the key according to the mapping result of a dictionary with twenty thousand words. The experiment results showed that our method reached about 96% average correctness in 100 times of testing with different ciphers, which is readable. And the highest correctness for a single testing is 100%. To our best knowledge, we are the first to crack monoalphabetic substitution ciphers by multi-armed bandit. Last but not least, our method does not rely on unigram or bigram frequency analysis.

並列關鍵字

Monoalphabetic substitution cipher ； Reinforcement learning ； Multi-armed Bandit

參考文獻

[26]Bulatović, Luka & Mijanović, Anđela & Asanović, Balša & Trajković, Nikola & Božović, Vladimir. (2019). Automated cryptanalysis of substitution cipher using Hill climbing with well designed heuristic function. Mathematica Montisnigri. 44. 135-143. 10.20948/mathmon-2019-44-11.

[1]Mao, W., Modern Cryptography: Theory & Practice. Upper Saddle River, NJ: Prentice Hall PTR, 2004.

Google Scholar

[2]Chris Savarese and Brian Hart, “The Caesar Cipher,” Historical Cryptography Web Site, Trinity College, 1999, (6 September 2007).

Google Scholar

[3]R. Morelli, “The Vigenere [sic] Cipher,” Historical Cryptography Web Site, Trinity College, (6 September 2007).

Google Scholar

[4]Carter, Brian, and Tanja Magoc. "Classical ciphers and cryptanalysis." space 1000 (2007): 1.

Google Scholar

國際替代計量

基於強化學習之單字母替換式密碼破密方法

全文下載

主題瀏覽