Data Selection Techniques for Large-Scale Support Vector Machine Ranking

Learning to rank has become a popular research topic in several areas, including information retrieval and machine learning. Pair-wise ranking, which learns the order preference between every pair of examples, is one typical approach to the ranking problem. Among pair-wise methods, RankSVM is a widely used machine learning algorithm that has been applied successfully to ranking in previous work. However, RankSVM suffers from a critical problem: training takes a long time because the number of pairs is huge. In this thesis, we propose a data selection technique, Pruned RankSVM, that selects the most informative pairs before training. By training on a subset of the pairs instead of all of them, we can handle large-scale data sets efficiently. Our experiments show that the performance of Pruned RankSVM is overall comparable to that of RankSVM while using significantly fewer pairs. To show the efficiency of Pruned RankSVM, we also compare it with a point-wise ranking algorithm, support vector regression. Experimental results demonstrate that Pruned RankSVM outperforms support vector regression on most data sets.
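To make the pair explosion concrete, the following minimal sketch enumerates the preference pairs that a pair-wise method would train on, then keeps only a subset. The abstract does not state how Pruned RankSVM chooses its "most informative" pairs, so the uniform random sampling below is only a placeholder assumption and not the thesis's selection criterion. The function names `all_preference_pairs` and `sample_pairs` are hypothetical.

```python
from itertools import combinations
import random


def all_preference_pairs(labels, query_ids):
    """Return index pairs (i, j) from the same query where labels[i] > labels[j]."""
    # Group example indices by query, since only examples from the
    # same query are compared in pair-wise ranking.
    by_query = {}
    for idx, q in enumerate(query_ids):
        by_query.setdefault(q, []).append(idx)

    pairs = []
    for members in by_query.values():
        for i, j in combinations(members, 2):
            if labels[i] > labels[j]:
                pairs.append((i, j))
            elif labels[j] > labels[i]:
                pairs.append((j, i))
            # Examples with equal labels carry no order preference.
    return pairs


def sample_pairs(pairs, k, seed=0):
    """Placeholder selection (an assumption): keep at most k pairs uniformly at random."""
    if len(pairs) <= k:
        return list(pairs)
    return random.Random(seed).sample(pairs, k)


# Toy data: relevance labels and the query each example belongs to.
labels = [2, 1, 0, 1, 0]
query_ids = [1, 1, 1, 2, 2]

pairs = all_preference_pairs(labels, query_ids)
subset = sample_pairs(pairs, k=2)
```

For a query with n examples that all have distinct relevance levels, the full pair set has n(n-1)/2 entries, which is why training on a selected subset can cut training time substantially.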

Keywords

learning to rank; pair-wise ranking; RankSVM; data selection

