
Using Kernel Density Estimation as An Alternative Approach of Time Series Count Data: A Case Study of Twitch.tv

Advisor: 任立中
Co-advisor: 蔡政安 (Chen-An Tsai)

Abstract


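Timestamps of event occurrences are a common form of raw data. For analysis, such timestamp data are often transformed, and time series count data are one result of that transformation. Before converting timestamps into a time series of counts, the analyst must choose a time interval; conceptually, the conversion resembles drawing a histogram. From the standpoint of estimating a data distribution, however, kernel density estimation is more suitable than a histogram, so this study examines whether kernel density estimation can replace time series count data. To compare the two methods, the study takes measuring the correlation between two events as the analytic setting. It first uses an example to explain what the correlation measured from time series count data means and the bias it can produce. It then proposes a measure based on kernel density estimation, explains what that correlation means, and compares it with the results obtained from time series count data. Next, statistical simulation is used to verify that the kernel density estimation transformation reflects the correlation between two events more accurately. Finally, the study conducts an empirical analysis of Twitch live-streaming data, applying the kernel density estimation transformation to analyze the correlation between streamer behavior and audience reactions, which reveals heterogeneity in streaming styles, and then extends the question to the correlation between audience reactions and subscription behavior.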
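The dependence on the chosen time interval can be illustrated with a minimal sketch. The data are synthetic, the 20-second lag and the interval widths are arbitrary assumptions, and the helper names bin_counts and pearson are hypothetical; none of this is taken from the thesis itself. The sketch converts two sets of event timestamps into count series at several interval widths and computes the Pearson correlation for each width.

```python
import math
import random


def bin_counts(timestamps, start, end, width):
    """Count events in consecutive intervals [start + k*width, start + (k+1)*width)."""
    n_bins = math.ceil((end - start) / width)
    counts = [0] * n_bins
    for t in timestamps:
        if start <= t < end:
            counts[int((t - start) // width)] += 1
    return counts


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


# Synthetic data (assumption): each event B follows an event A by about 20 seconds.
rng = random.Random(0)
events_a = sorted(rng.uniform(0, 3600) for _ in range(60))
events_b = [t + 20 + rng.gauss(0, 3) for t in events_a]

# Same timestamps, three different interval widths (in seconds).
correlations = {}
for width in (10, 60, 300):
    counts_a = bin_counts(events_a, 0, 3700, width)
    counts_b = bin_counts(events_b, 0, 3700, width)
    correlations[width] = pearson(counts_a, counts_b)
    print(f"width={width:>3}s  correlation={correlations[width]:.3f}")
```

In this setup, a narrow interval tends to place each event and its delayed response in different bins, so the measured correlation changes with the chosen width even though the underlying relationship is fixed.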

Parallel Abstract


Event occurrence times are a common type of raw data. Such data are often transformed before analysis, and time series count data are one result of that transformation. A time interval must be chosen before time series count data can be generated, and the process is conceptually similar to building a histogram. However, kernel density estimates generally describe a data distribution better than histograms do. We therefore explore whether kernel density estimation (KDE) can serve as an alternative to time series count data. To compare the two approaches, this study uses the measurement of the correlation between two events over time as the analytic setting. First, we give an example that explains the meaning of the correlation measured from time series count data and the bias it can introduce. Then we propose an alternative measure of the correlation between two events over time, based on kernel density estimation, and compare the measurements from the two approaches. We further validate the KDE approach by simulation. Finally, we conduct a case study of Twitch.tv, applying the two approaches to measure the correlation between streamer and audience behavior, presenting the heterogeneity of streamer styles, and exploring the correlation between audience reactions and subscription behavior.
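As a rough illustration of a KDE-based alternative, the following sketch smooths each set of timestamps with a Gaussian kernel and correlates the two density curves on a common grid. The abstract does not specify the measure the thesis actually uses, so the Pearson correlation of density curves, the 30-second bandwidth, the synthetic data, and the helper names gaussian_kde and pearson are all assumptions made for illustration.

```python
import math
import random


def gaussian_kde(timestamps, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    norm = 1.0 / (len(timestamps) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - t) / bandwidth) ** 2) for t in timestamps)
        for g in grid
    ]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


# Synthetic data (assumption): each event B follows an event A by about 20 seconds.
rng = random.Random(1)
events_a = sorted(rng.uniform(0, 3600) for _ in range(60))
events_b = [t + 20 + rng.gauss(0, 3) for t in events_a]

# Evaluate both densities on the same grid (one point every 5 seconds).
step = 5.0
grid = [float(s) for s in range(0, 3701, 5)]
dens_a = gaussian_kde(events_a, grid, bandwidth=30.0)
dens_b = gaussian_kde(events_b, grid, bandwidth=30.0)

kde_corr = pearson(dens_a, dens_b)
print(f"correlation of density curves: {kde_corr:.3f}")
```

Because each event contributes a smooth bump rather than a count in a single interval, a response that arrives shortly after its trigger still overlaps with it on the time axis. The bandwidth plays a role similar to the interval width and would still need to be chosen with care.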

