
基於區域確定性的群集設計主動學習樣本資料挑選的方法

Efficient Active Learning Based on Localized Uncertainty Clusters

Advisor: 李新林

Abstract


Not available

Keywords

clustering

Parallel abstract


Many approaches incorporate clustering into active learning to avoid selecting similar data points. However, traditional clustering methods are not designed to improve the accuracy of active learning. This thesis has two main goals: to generate more uncertain cluster representatives during clustering, so that classification accuracy increases, and to cluster without knowing the number of clusters in advance. In our method, data points that have similar local uncertainty, a small difference in coordinates, and a large overlap between their neighborhoods are collected into the same cluster. The idea of certainty-based active learning (CBAL), in which a local classifier is built from a point's neighbors, is used to find an appropriate neighborhood size. In our approach, each generated representative can represent the effect that all data points in its cluster have on the classifier. Moreover, more uncertain representatives are generated, which increases the accuracy of active learning. In addition, we propose a new clustering method that uses the local distance-based outlier factor (LDOF) to expand clusters and uses a distance metric based on local uncertainty and overlapped neighborhoods (the LNC formula) to measure the similarity between data points. Finally, experimental results show that the proposed method selects more uncertain training data on the synthetic dataset, and that on the UCI datasets it also achieves better accuracy and running time.
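For readers unfamiliar with LDOF, the following is a minimal sketch of the score as it is commonly defined: the average distance from a point to its k nearest neighbors, divided by the average pairwise distance among those neighbors. The abstract does not say how the thesis uses this value to expand clusters, so the sketch only computes the score. The function name `ldof`, the parameter names `points` and `k`, and the use of Euclidean distance are assumptions made for illustration, not details taken from the thesis.

```python
import numpy as np


def ldof(points, k):
    """Return an LDOF score for every row of `points` (illustrative sketch).

    Assumes Euclidean distance and that the k neighbours of a point are
    not all identical (otherwise the denominator is zero).
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    if not 2 <= k < n:
        raise ValueError("k must satisfy 2 <= k < number of points")

    # Full pairwise distance matrix; acceptable for small data sets.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    scores = np.empty(n)
    for p in range(n):
        # Indices of the k nearest neighbours of p, excluding p itself.
        order = np.argsort(dist[p], kind="stable")
        neighbours = order[order != p][:k]

        # Average distance from p to its neighbours.
        knn_dist = dist[p, neighbours].mean()

        # Average pairwise distance among the neighbours themselves
        # (the diagonal is zero, so summing the block counts each
        # ordered pair i != j exactly once).
        inner = dist[np.ix_(neighbours, neighbours)]
        inner_dist = inner.sum() / (k * (k - 1))

        scores[p] = knn_dist / inner_dist
    return scores


# Four points on a unit square plus one point far away from them.
sample = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10]]
sample_scores = ldof(sample, k=3)
```

In this toy example the far point receives the largest score, while the four square corners score close to 1, which matches the usual reading of LDOF: values well above 1 indicate a point lying outside its neighborhood.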

Parallel keywords

clustering

