
Generalized t-distributed stochastic neighbor embedding using gamma-divergence

Generalized degrees of freedom t-SNE with gamma-divergence

Advisor: 陳素雲

Abstract


This thesis generalizes t-distributed stochastic neighbor embedding (t-SNE). In the original paper of this method, the t_1-distribution is used as the model for the low-dimensional embedding space. The t_1-distribution has very heavy tails, which serve mainly to reduce the effect of the crowding problem. However, data sets differ in size, dimensionality, and other properties, and the t_1-distribution is not necessarily the right model for all of them. This thesis therefore generalizes the t_1-distribution used in the original t-SNE to the t_nu-distribution. To measure the discrepancy between the data distribution and the model distribution, the original t-SNE paper uses the KL-divergence; we generalize this part as well by switching to the gamma-divergence, of which the KL-divergence is a special case. We then derive the gradient of the gamma-divergence t_nu-SNE objective, use it in an executable implementation, and apply the method to two real examples.
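For concreteness, the two generalizations described above can be written out in their standard forms (the pairwise normalization follows the t-SNE convention, and the discrete gamma-divergence is written in the Fujisawa–Eguchi form; the exact objective in the thesis may differ in detail):

```latex
% t_nu kernel for the low-dimensional similarities;
% nu = 1 recovers the Cauchy kernel of standard t-SNE
q_{ij} = \frac{\bigl(1 + \|y_i - y_j\|^2/\nu\bigr)^{-(\nu+1)/2}}
              {\sum_{k \neq l} \bigl(1 + \|y_k - y_l\|^2/\nu\bigr)^{-(\nu+1)/2}}

% gamma-divergence between the discrete distributions P and Q;
% as gamma -> 0 it reduces to the KL-divergence of the original t-SNE
D_{\gamma}(P \,\|\, Q)
  = \frac{1}{\gamma(1+\gamma)} \log \sum_{i \neq j} p_{ij}^{\,1+\gamma}
  - \frac{1}{\gamma} \log \sum_{i \neq j} p_{ij}\, q_{ij}^{\,\gamma}
  + \frac{1}{1+\gamma} \log \sum_{i \neq j} q_{ij}^{\,1+\gamma}
```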

Abstract (English)


This thesis presents an extended version of the t-SNE visualization method. In the original t-SNE paper, the t_1-distribution was used to embed the data into a low-dimensional space. The t_1-distribution has very heavy tails, which reduce the effect of the crowding problem. However, data sets vary in many respects, such as size, dimensionality, and feature properties, so the t_1-distribution alone is not always adequate for modeling the low-dimensional similarities. Hence, it is natural to extend the degrees of freedom in t-SNE to a general t_nu. To measure the discrepancy between the data distribution and the model distribution, the original paper used the KL-divergence. In this work, we also make an extension by using the gamma-divergence, which includes the KL-divergence as a special case. The gradient of the minimum gamma-divergence t_nu-SNE objective is derived and used in the implementation. Two numerical examples are presented.
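A minimal numerical sketch of the two ingredients named above, assuming the discrete Fujisawa–Eguchi form of the gamma-divergence; the function names are illustrative and not taken from the thesis, and the actual optimization procedure is not reproduced here:

```python
import numpy as np


def t_nu_similarities(Y, nu=1.0):
    """Low-dimensional similarities q_ij from a t_nu kernel.

    Y is an (n, d) array of embedding coordinates; nu = 1 recovers
    the Cauchy kernel of standard t-SNE.
    """
    sq_dists = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    W = (1.0 + sq_dists / nu) ** (-(nu + 1.0) / 2.0)
    np.fill_diagonal(W, 0.0)  # self-similarities are excluded, as in t-SNE
    return W / W.sum()


def gamma_divergence(p, q, gamma):
    """D_gamma(p || q) for strictly positive discrete distributions.

    As gamma -> 0 this reduces to the KL-divergence, the special case
    used by the original t-SNE.
    """
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    if gamma == 0.0:
        return float(np.sum(p * np.log(p / q)))
    t1 = np.log(np.sum(p ** (1.0 + gamma))) / (gamma * (1.0 + gamma))
    t2 = np.log(np.sum(p * q ** gamma)) / gamma
    t3 = np.log(np.sum(q ** (1.0 + gamma))) / (1.0 + gamma)
    return float(t1 - t2 + t3)
```

For small gamma the value of `gamma_divergence` approaches the KL-divergence, and the divergence vanishes when the two distributions coincide, matching the "special case" relationship stated in the abstract.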

