
AUC-Optimized LSTM-CRF for Identifying Algorithms in Papers (original title: 最佳化AUC之LSTM-CRF於論文中演算法識別應用)

AUC oriented Bidirectional LSTM-CRF Models to Identify Algorithms Described in an Abstract

Advisor: Shou-De Lin (林守德)

Abstract


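This thesis aims to identify the algorithms mentioned in paper abstracts and to distinguish the algorithms a paper uses from those it only compares against. To find papers that use a particular algorithm, we usually rely on keyword search, but keyword search returns every occurrence, including algorithms that a paper cites only for comparison, whereas we care mainly about the algorithms a paper actually uses or proposes. Problems of this kind have previously been handled with the sequence labeling model LSTM-CRF, which tags the algorithms we want to find. However, the label distribution in paper abstracts is highly imbalanced, and conventional LSTM-CRF models optimize accuracy, which makes them poorly suited to our problem. We therefore change the LSTM-CRF objective function from optimizing accuracy to optimizing the area under the curve (AUC). AUC is better suited to evaluating imbalanced data and gives better results when predicting rarely occurring labels. Our experiments show that the AUC-optimized LSTM-CRF achieves a significant improvement. Finally, we show that our model can be applied to rank algorithm usage in recent years, to track trends in algorithm usage, and to find algorithms that never appeared in the training data.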

Parallel Abstract


In this thesis, we attempt to identify the algorithms mentioned in a paper's abstract. We further want to distinguish the algorithm a paper proposes from algorithms it only mentions or compares against, since we are more interested in the former. We model this task as a sequence labeling task and propose to use a state-of-the-art deep learning model, LSTM-CRF, as our solution. However, the labels are generally imbalanced, since not every sentence in an abstract describes the paper's own algorithm; that is, the ratio between different labels is skewed. As a result, the traditional LSTM-CRF model is not well suited to the task, because it optimizes only accuracy. It is more reasonable to optimize AUC on imbalanced data, because AUC can handle skewed labels and leads to better predictions of rare labels. Our experiments show that the proposed AUC-optimized LSTM-CRF outperforms the traditional LSTM-CRF. We also show the current ranking of algorithms by usage and the trends in the use of different algorithms in recent years. Moreover, we are able to discover new algorithms that do not appear in our training data.
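
To make the contrast between accuracy and AUC concrete, the following is a minimal sketch on toy data, not the thesis's actual objective or code. It assumes binary token labels (1 for an algorithm token, 0 otherwise), and the function names `accuracy`, `pairwise_auc`, and `surrogate_auc_loss` are hypothetical. The logistic pairwise surrogate is one common smooth stand-in for AUC and is an assumption here; the abstract does not say which formulation the thesis uses.

```python
import math

def accuracy(labels, scores, threshold=0.5):
    """Fraction of items whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def _split(labels, scores):
    """Separate scores of positive items from scores of negative items."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return pos, neg

def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos, neg = _split(labels, scores)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def surrogate_auc_loss(labels, scores):
    """Smooth stand-in for 1 - AUC (assumed form, not taken from the thesis).

    Mean logistic loss over all positive-minus-negative score gaps.
    """
    pos, neg = _split(labels, scores)
    total = sum(math.log1p(math.exp(-(p - n))) for p in pos for n in neg)
    return total / (len(pos) * len(neg))

# Toy imbalanced data: 2 rare positive tokens among 20.
labels = [1, 1] + [0] * 18

# Model A gives every token the same low score.
scores_a = [0.1] * 20
# Model B ranks the positives highest, but still below the 0.5 threshold.
scores_b = [0.45, 0.40] + [0.1] * 18

for name, scores in (("A", scores_a), ("B", scores_b)):
    print(name,
          "accuracy:", accuracy(labels, scores),
          "AUC:", pairwise_auc(labels, scores),
          "surrogate loss:", round(surrogate_auc_loss(labels, scores), 4))
```

Both toy models predict every token as negative at the 0.5 threshold, so accuracy cannot tell them apart, while the pairwise AUC and the surrogate loss both favor model B, which ranks the rare positive tokens above the negatives.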

Parallel Keywords

Sequence labeling; LSTM; CRF; AUC optimization

