
應用多標籤分類於情緒文件檢索之研究

Multi-Label Classification for Emotion Document Retrieval

Advisor: 禹良治

Abstract


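In an era when the Internet is ubiquitous, information retrieval has become a necessary step for anyone searching the vast body of information online. This thesis takes a professional mental health consultation website as its subject and proposes a query-by-example retrieval mode in which users submit a query written as ordinary prose and retrieve related documents. Because the website already assigns multi-label categories to its consultation questions, the proposed model uses independent component analysis (ICA) to identify the emotion labels contained in a user's query, then combines them with the widely used BM25 retrieval model so that the similarity between the query and each document reflects both label and textual features, helping users find documents relevant to their emotional problems. Experimental results show that ICA can separate the features of different emotion labels and thereby improve multi-label document classification, and that adding label information to the retrieval model further improves retrieval precision.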

Parallel Abstract


The aim of information retrieval (IR) is to retrieve from a database the set of documents relevant to a user's query. This thesis builds a retrieval model based on the query-by-example scheme. The document collection is drawn from PsychPark, a mental health website. Since each document in PsychPark has been annotated with emotion labels (topics), we use independent component analysis (ICA) for multi-label classification. The identified labels are then combined with the BM25 retrieval model to calculate the similarity between users’ queries and documents. The experimental results show that ICA can distinguish the features of different labels and thereby improve the performance of multi-label document classification. Additionally, incorporating the label information further improves retrieval precision.
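
The abstract does not state how the label information and the BM25 score are combined, so the following is only a minimal sketch of one plausible reading, not the thesis's actual model. It assumes a linear interpolation between a normalized Okapi BM25 score and a Jaccard overlap of label sets. The weight `alpha`, the function names, the toy documents, and the labels are all hypothetical, and the query labels are supplied by hand here in place of the output of the ICA-based classifier.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of every document for the query."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def label_overlap(query_labels, doc_labels):
    """Jaccard overlap of two label sets (an assumed label similarity)."""
    q, d = set(query_labels), set(doc_labels)
    return len(q & d) / len(q | d) if q | d else 0.0


def combined_ranking(query_tokens, query_labels, docs, alpha=0.5):
    """Rank documents by alpha * normalized BM25 + (1 - alpha) * label overlap.

    docs is a list of (tokens, labels) pairs; alpha is a hypothetical weight.
    Returns document indices, best match first.
    """
    text = bm25_scores(query_tokens, [tokens for tokens, _ in docs])
    top = max(text) or 1.0  # avoid dividing by zero when nothing matches
    combined = [
        alpha * s / top + (1 - alpha) * label_overlap(query_labels, labels)
        for s, (_, labels) in zip(text, docs)
    ]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)


# Toy collection: each document carries its text tokens and emotion labels.
docs = [
    (["cannot", "sleep", "at", "night"], {"insomnia"}),
    (["feel", "sad", "cannot", "sleep"], {"depression", "insomnia"}),
    (["argue", "with", "family"], {"family"}),
]
ranking = combined_ranking(["sad", "cannot", "sleep"], {"depression"}, docs)
print(ranking)
```

Setting `alpha=1.0` reduces the ranking to plain BM25, which gives a simple baseline for checking whether the label term changes the order of results.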

