有效率找出k個滿足最多客戶需求/喜愛產品之演算法

在規劃新產品時，考慮市場已存在的競爭產品與客戶偏好等市調資料，可分析估算出產品目標行銷的客戶群數量，以提供產品行銷的決策依據。為了達到此目的，本論文提出以不同的方式估算產品客戶數量，定義出兩種新的產品定位查詢處理問題：分別是找出k個滿足最多客戶需求之產品，以及找出k個滿足最多客戶喜愛之產品。　　對一個特定種類的產品，給定一組顧客對產品屬性偏好的資料集合、一組市場中存在的產品集合、及公司可提供的一組候選產品集合，k個滿足最多客戶需求之產品是由整體擁有最多預期客戶數的k個候選產品所組成。當產品至少有3個屬性時，此問題為一個NP-hard的問題，因此我們提出一個貪婪演算法來解決此問題。此外，為了確保找出此問題的最佳解，本研究提出k個產品的預期客戶數上限值之估計方法，可有效縮減最佳解的搜尋空間，藉由該削減策略，發展出一個可有效率精確找出最佳解的演算法。　　另一方面，一個產品的反轉式前t名查詢(reverse top-t query)可找出將該產品視為前t個喜愛產品的客戶，這些客戶可視為該產品的潛在客戶。給定一組候選產品集合以及顧客對產品屬性偏好的資料集合，k-t滿足最多客戶喜愛產品是由整體可涵蓋最多潛在客戶數的最多k個候選產品所組成，且根據是否有已存在競爭產品分成兩個問題版本。我們運用前t名查詢(top-t query)與天際線查詢(skyline query)推導出的特性來縮減此問題最佳解的搜尋空間。另外，本研究提出產品潛在客戶數上限值的估計方法，以降低對候選產品執行反轉式前t名查詢的計算成本。由於此兩個問題版本均為NP-hard的問題，因此我們運用多個有效的削減策略，設計出的貪婪演算法所找到的近似解涵蓋之潛在客戶數，保證可達到最佳解涵蓋最多潛在客戶數的(1-1/e)近似率。　　此外，針對這兩部分的研究，我們皆設計一系列的實驗，以人造模擬資料及真實資料來驗證我們所提出的演算法的有效性及執行效率。

關鍵字

資料與知識管理演算法；決策支援；演算法及系統效能評估；查詢處理

並列摘要

To estimate the number of customers in target markets by taking both product competition and customer preference into consideration provides important information for decision making of product plans on product marketing. For achieving this purpose, this dis-sertation defines and solves two query processing problems of product positioning, named k-Most Demanding Products (k-MDP) discovering and k-t Most Favorite Products (k-t MFP) discovering, from different perspectives on estimating the number of customers. Given a set of customers demanding a certain type of products with multiple features, a set of existing products of the type, and a set of candidate products which can be offered by a company, the k-MDP discovering problem is to choose k products from these candi-date products such that the expected number of the total customers for the k products is maximized. We show the problem is NP-hard when the number of the features of a prod-uct is 3 or more. One greedy algorithm is proposed to find approximate solution for the problem. We also attempt to find the optimal solution of the problem by estimating the upper bound of the expected number of the total customers for a set of k candidate prod-ucts for reducing the search space of the optimal solution. An exact algorithm is then pro-vided to find the optimal solution of the problem efficiently by using this pruning strate-gy. On the other hand, a reverse top-t query for a product returns a set of customers, named potential customers, who regard the product as one of their top-t favorites. Given a set of products and a set of customers with different preferences on the features of the products, the k-t MFP discovering problem is to select at most k products such that the total number of potential customers is maximized. Two versions of the k-t MFP discovering problem are proposed according to whether existing products are considered in the problem. We exploit several properties of the top-t queries and skyline queries to reduce the solution space. In addition, an upper bound of the potential customers is estimated to reduce the computation cost of performing the reverse top-t query for a product. Because each ver-sion of the problem can be shown as an NP-hard problem, we provide one greedy algo-rithm with an approximation ratio (1-1/e) of the maximum total number of potential cus-tomers by using the designed pruning strategies. Furthermore, for both topics of studies, a series of experiments on synthetic datasets and real datasets are performed to show the effectiveness and efficiency of our proposed algorithms.

並列關鍵字

algorithms for data and knowledge management ； decision support ； performance evaluation of algorithm and systems ； query processing

參考文獻

[AT05] G. Adomavicius and A. Tuzhilin, “Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions,” IEEE Transactions on Knowledge and Data Engineering, Vol. 17, No. 6, pp. 734-749, 2005.

[BK01] S. Borzsonyi, D. Kossmann, and K. Stocker, “The Skyline Operator,” In Pro-ceedings of the 17th International Conference on Data Engineering, pp. 421-430, 2001.

[BS97] M. Balabanovic and Y. Shoham, “Fab: Content-Based, Collaborative Recom-mendation,” Communications of the ACM, Vol. 40, No. 3, pp. 66-72, 1997.

[CG99] Y. H. Chien and E. I. George, “A Bayesian Model for Collaborative Filtering,” In Proceedings of the 7th International Workshop on Artificial Intelligence and Statistics, 1999.

[F98] U. Feige, “A Threshold of ln n for Approximating Set Cover,” Journal of the ACM, Vol. 45, No. 4, pp. 634-652, 1998.

國際替代計量

有效率找出k個滿足最多客戶需求/喜愛產品之演算法

主題瀏覽