
Dynamic Pricing With Limited Inventory (有限存貨下的動態定價)

Advisor: 陳和麟

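Abstract


In this thesis, we study the dynamic pricing problem and present two interesting scenarios. We mainly design algorithms for the online learning setting, and we consider the case of limited inventory. The seller's goal is to sell a limited number of items within a limited time while maximizing cumulative revenue. To describe how the two variables we care about affect this problem, we construct two theoretical models. In the first model, the seller knows something about each buyer: before setting a price for a buyer, the seller first receives some information related to that buyer. This kind of scenario is common in settings such as online shopping. We make no distributional assumption on buyer types (the information the seller sees); together with the limited-inventory assumption, this makes it difficult to evaluate online dynamic pricing algorithms. We propose a criterion for evaluating online dynamic pricing algorithms, design an algorithm for this criterion, and provide a theoretical guarantee on the algorithm's expected revenue. In the second model, we assume that each buyer may stay in the market for a period of time and that, even when buyers see an acceptable price, they may strategically wait for a lower one. For this model, we propose a new selling mechanism and design an online dynamic pricing algorithm based on it. Likewise, we provide a theoretical guarantee on this algorithm's expected revenue.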

Parallel Abstract


This thesis introduces two scenarios for the well-known dynamic pricing problem and presents corresponding learning algorithms. Different from previous work, we focus on the setting in which the seller starts with a finite inventory and wants to sell it within a finite period of time. We build two theoretical models that describe this problem under different concerns. In the first model, the seller observes a context vector for each consumer before deciding the price posted to her, and the context of each consumer is chosen adversarially. In general, the seller's objective is to maximize revenue; however, evaluating an algorithm is not trivial in the adversarial setting with limited inventory. We introduce a criterion for evaluating the performance of a learning algorithm, and then design an algorithm with a performance guarantee with respect to that criterion. In the second model, consumers may stay in the market for a period of time, and they may wait for a lower payment in order to maximize their utility. For this model, we introduce a new selling mechanism with good properties, and design a learning algorithm with a performance guarantee based on the new mechanism.
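
To make the setting in the abstract concrete, the following is a minimal sketch of one selling season with posted prices and a finite inventory. It is not the thesis's algorithm. The purchase rule (a buyer buys one unit exactly when her valuation is at least the posted price), the function names `run_posted_price_sale` and `fixed_price`, and the sample valuations are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def run_posted_price_sale(
    valuations: Sequence[float],
    inventory: int,
    price_policy: Callable[[int, int], float],
) -> float:
    """Simulate one selling season and return the total revenue.

    Assumption (not from the thesis): each buyer wants one unit and buys it
    exactly when her private valuation is at least the posted price.
    """
    revenue = 0.0
    for t, valuation in enumerate(valuations):
        if inventory == 0:  # sold out: remaining buyers cannot be served
            break
        price = price_policy(t, inventory)  # seller posts a price for buyer t
        if valuation >= price:
            revenue += price
            inventory -= 1
    return revenue


def fixed_price(price: float) -> Callable[[int, int], float]:
    """Hypothetical baseline policy that ignores time and remaining stock."""
    return lambda t, remaining: price


# Illustrative buyer valuations (made up), with only two units to sell.
valuations = [0.9, 0.2, 0.8, 0.7, 0.95]
low = run_posted_price_sale(valuations, inventory=2, price_policy=fixed_price(0.5))
high = run_posted_price_sale(valuations, inventory=2, price_policy=fixed_price(0.85))
print(low, high)
```

With only two units, the low price sells out to the first two willing buyers, while the higher price holds stock for the later high-valuation buyer and earns more. This is the kind of trade-off that limited inventory adds to the pricing problem.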

