A Lightweight, Security-Oriented AJAX Crawler

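Demand for penetration testing has grown steadily in recent years, yet the API crawlers in existing security vulnerability scanners perform very poorly on modern dynamic Ajax pages written in JavaScript. Because today's websites are very large, crawling an entire site is unrealistic, so the crawl must be terminated at some point. This thesis proposes a new crawling model that assumes the crawler will be stopped at some unknown time and is designed specifically for crawling APIs; within a fixed time budget it outperforms earlier crawlers. In our design, we recast the cost of crawling as the well-studied stochastic shortest path (SSP) problem. Our experimental results show that our model crawls more APIs than traditional strategies such as breadth-first search and depth-first search.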

Keywords

Information security; crawler; lightweight

Parallel Abstract

Demand for security penetration testing has grown in recent years, but the application programming interface (API) crawlers of existing web vulnerability scanners perform poorly on modern websites that rely on JavaScript technologies such as Ajax. Moreover, modern websites are often huge, and crawling an entire site is impractical. Hence, the crawl must be stopped at some point. This thesis presents a crawling algorithm design that supports stopping at an arbitrary time and focuses on API crawling. In this design, we redefine the crawling problem and reduce it to the well-studied stochastic shortest path (SSP) problem. We implement two simple baseline models and evaluate them on two small websites and ten large commercial websites. The results show that our simple baselines crawl more APIs than traditional strategies such as depth-first and breadth-first search.
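To make the SSP framing concrete, the sketch below runs standard value iteration on a tiny hand-made SSP instance. It is only an illustration of the general problem class, not the thesis's model: the state names, action names, costs, probabilities, and the reading of "goal" as "a new API has been reached" are all assumptions introduced here.

```python
from typing import Dict, List, Tuple

# Hypothetical toy model (not from the thesis):
# state -> action -> (cost, [(probability, next_state), ...]).
# "goal" is the absorbing, zero-cost terminal state.
Outcomes = List[Tuple[float, str]]
Model = Dict[str, Dict[str, Tuple[float, Outcomes]]]

TOY_SSP: Model = {
    "home": {
        "click_menu": (1.0, [(1.0, "menu")]),
        "submit_search": (2.0, [(0.5, "goal"), (0.5, "home")]),
    },
    "menu": {
        "click_item": (1.0, [(0.8, "goal"), (0.2, "home")]),
    },
}


def q_value(cost: float, outcomes: Outcomes, values: Dict[str, float]) -> float:
    """Expected cost of taking one action and then following `values`."""
    return cost + sum(p * values[nxt] for p, nxt in outcomes)


def value_iteration(
    model: Model, goal: str = "goal", tol: float = 1e-9, max_iters: int = 10_000
) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Return (expected cost-to-goal per state, greedy action per state)."""
    values = {state: 0.0 for state in model}
    values[goal] = 0.0  # reaching the goal costs nothing further

    for _ in range(max_iters):
        delta = 0.0
        for state, actions in model.items():
            # Bellman backup: take the cheapest action in expectation.
            best = min(q_value(c, o, values) for c, o in actions.values())
            delta = max(delta, abs(best - values[state]))
            values[state] = best  # in-place (Gauss-Seidel) update
        if delta < tol:
            break

    policy = {
        state: min(actions, key=lambda a: q_value(*actions[a], values))
        for state, actions in model.items()
    }
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration(TOY_SSP)
    print(values, policy)
```

On this toy instance the expected cost from "home" converges to about 2.5 and from "menu" to about 1.5, and the greedy policy prefers "click_menu" over "submit_search" even though the latter can reach the goal in one step, because its expected total cost (3.25) is higher.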

Parallel Keywords

Security; crawler; lightweight; API

