以反向k鄰近點查詢處理技術解決二分匹配問題

反向k鄰近點查詢所傳回的物件，乃指查詢點是離該物件前k近的地點之一。本論文提出方法是以反向k鄰近點查詢處理技術來解二分匹配的問題，並以快速找出近似最佳解為目標。二分匹配問題是指兩組資料集合，分屬不同集合的兩個物件才可以配對，每筆配對會伴隨著一個權重值，最佳解是要找出一群每個物件最多只出現一次的配對集合，而且在配對數最多的前提下權重值加總最大(或最小)。我們從實際應用發現此問題的一種變形，一組集合的一個物件在配對時必須滿足其權重值限制，另一組集合則允許同一物件出現多次的配對，最佳化目標並未改變。我們的方法主要是利用反向k鄰近點查詢的結果，分析個別物件之間的潛在配對數及權重值，以分析所得的優先次序來決定配對，兼顧計算時間及答案品質的需求。在效能評估上，我們實作多個方法並比較其效能，我們的方法花費的時間最少，所得結果的品質與最佳解差距不大。

關鍵字

反向k鄰近點；二分匹配；配對

並列摘要

Reverse k Nearest Neighbor (RkNN) query returns the objects that take the query point as one of their k closest points. The method proposed in this thesis solves bipartite matching problem with RkNN query processing techniques and aims at finding the near-optimal solution. Bipartite matching problem indicates that given two datasets, only two objects from each can be matched together. Each matched pair would be associated with a weight. The optimal solution is to find a set of matched pairs in which every object appears at most once and the total sum of weights is maximized (or minimized) on the premise of the most pairs. We identify one of its variations from real applications. Objects in one set must satisfy their constraints on the weights, while another set allows one object to appear more than once in the matched pairs. The goal of optimization is not changed. Our method mainly employs RkNN query results to analyze the potential number of matched pairs and weights among individual objects. It decides the matched pairs in order of the priority obtained from the analysis to reconcile both the needs for computation time and quality of answer. On performance evaluation, we implement and compare several methods. Our method spends the least on computation time and the quality of its result is close to the optimal solution.

並列關鍵字

matching ； bipartite matching ； RkNN

參考文獻

[1] E. Achtert, H.P. Kriegel, P. Kröger, M. Renz and A. Züfle, “Reverse k-Nearest Neighbor Search in Dynamic and General Metric Databases,” International Conference on Extending Database Technology, pp. 886-897, 2009.

[3] R.W. Calvo, F.d. Luigi, P. Haastrup and V. Maniezzo, “A Distributed Geographic Information System for the Daily Car Pooling Problem,” Computers and Operations Research, pp. 2263-2278, 2004.

[5] M.A. Cheema, X. Lin, W. Zhang and Y. Zhang, “Influence Zone: Efficiently Processing Reverse k Nearest Neighbors Queries,” International Conference on Data Engineering, pp. 577-588, 2011.

[6] M.A. Cheema, X. Lin, Y. Zhang, W. Wang and W. Zhang, “Lazy Updates: An Efficient Technique to Continuously Monitoring Reverse kNN,” Very Large Data Base Endowment, pp. 1138-1149, 2009.

[7] M.A. Cheema, W. Zhang, X. Lin and Y. Zhang, “Efficiently Processing Snapshot and Continuous Reverse k Nearest Neighbors Queries,” International Journal on Very Large Data Bases, pp. 703-728, 2012.

國際替代計量

以反向k鄰近點查詢處理技術解決二分匹配問題

全文下載

主題瀏覽