  • Degree thesis

基於交易紀錄摘要之比特幣地址分類方法分析

An Evaluation of Bitcoin Address Classification based on Transaction History Summarization

Advisor: 廖世偉

Abstract


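Bitcoin is the first cryptocurrency with distributed and decentralized properties, which has made it a transaction platform used by many people around the world. Its efficient cross-border circulation and the privacy that stems from address anonymity have given rise to many kinds of financial activity on Bitcoin over the past decade or so, such as payments, investment, gambling, and even money laundering. Unfortunately, because many criminal activities that exploit this system are hard to identify and detect, some governments have come to distrust it and do not support Bitcoin's development. Identifying the Bitcoin addresses of criminals is therefore an important topic in cryptocurrency research. In this thesis, we propose features that differ from those commonly used in prior literature to build classification models that detect Bitcoin addresses with anomalous behavior. We identify several quite effective features, which we call "extra statistical features" to distinguish them from "basic statistical features". In addition, we propose entirely new features, higher-order moments and deciles, which effectively capture the temporal information in an address's transaction history. After extracting features from a labeled dataset of Bitcoin addresses with these methods, we train classification models with supervised machine learning algorithms. The experimental results show that the proposed features significantly improve the accuracy of Bitcoin address classification. After evaluating eight classification algorithms, we find that the best result comes from an algorithm based on gradient boosting decision trees, which reaches 87% on both Micro-F1 and Macro-F1.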
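The two reported scores aggregate per-class results differently: Micro-F1 pools true positives, false positives, and false negatives over all classes, while Macro-F1 averages the per-class F1 scores with equal weight. The following is a minimal sketch of both, not the thesis's evaluation code; the function name `micro_macro_f1` and the toy labels are hypothetical.

```python
from collections import Counter


def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_count, fp_count, fn_count):
        denom = 2 * tp_count + fp_count + fn_count
        return 2 * tp_count / denom if denom else 0.0

    # Macro: unweighted mean of per-class F1 scores.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: one F1 score computed from the pooled counts.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro


# Hypothetical class labels, used only for illustration.
micro, macro = micro_macro_f1(["a", "a", "a", "b"], ["a", "a", "b", "b"])
```

When the two scores are close, as in the reported 87% / 87%, performance is not dominated by the largest classes.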

Parallel Abstract


Bitcoin is a cryptocurrency with a distributed, decentralized mechanism, which has made it a popular global transaction platform. Its efficient cross-border transactions and the privacy afforded by address anonymity have attracted many activities over the past decade, such as payments, investments, gambling, and even money laundering. Unfortunately, some criminal activity that exploited the platform went unidentified, which has discouraged many governments from supporting cryptocurrency. The ability to identify criminal addresses has therefore become an important issue for cryptocurrency networks. In this thesis, we propose new features, in addition to those commonly used in the literature, to build a classification model for detecting abnormal Bitcoin addresses. We found several useful conventional features, which we call extra statistics. We also introduce new features, including moments of various high orders of transaction time (represented by block height) and deciles of transaction time, which summarize the temporal information in the transaction history efficiently. Classifiers are trained on the extracted features with supervised machine learning methods, using a labelled dataset of Bitcoin addresses. The experimental evaluation shows that these features significantly improve the performance of Bitcoin address classification. We evaluate the results with eight classifiers and achieve the highest Micro-F1 / Macro-F1 of 87% / 87% with a gradient boosting decision tree algorithm.
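The temporal features described above can be illustrated with a short sketch that summarizes one address's transaction block heights as moments and deciles. It is an illustration under stated assumptions, not the thesis's implementation: the function name `summarize_block_heights`, the feature names, the use of standardized central moments, and the cutoff `max_order` are all hypothetical choices.

```python
import statistics


def summarize_block_heights(heights, max_order=4):
    """Summarize transaction times (block heights) as moments and deciles."""
    if len(heights) < 2:
        raise ValueError("need at least two transactions")
    mean = statistics.fmean(heights)
    std = statistics.pstdev(heights)
    features = {"mean": mean, "std": std}

    # Assumption: standardized central moments of order 3..max_order
    # (order 3 is skewness, order 4 is kurtosis).
    for k in range(3, max_order + 1):
        if std == 0:
            features[f"moment_{k}"] = 0.0
        else:
            features[f"moment_{k}"] = statistics.fmean(
                [((h - mean) / std) ** k for h in heights]
            )

    # Nine cut points that split the heights into ten equal-probability groups.
    deciles = statistics.quantiles(heights, n=10, method="inclusive")
    for i, cut in enumerate(deciles, start=1):
        features[f"decile_{i}"] = cut
    return features


# Hypothetical block heights for a single address.
features = summarize_block_heights([100, 200, 300, 400, 500])
```

Every address yields a fixed-length vector however many transactions it has, so addresses with very different histories can be fed to the same classifier.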

