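In an era of widespread Internet access, information retrieval has become a necessary step for anyone looking for material in the vast body of information available online. Taking a professional mental health consultation website as its subject, this thesis proposes a retrieval model that lets users query by writing ordinary free-form text, that is, retrieving documents with a document. Because the website already classifies consultation questions under multiple labels (multi-label classification), the proposed model uses Independent Component Analysis (ICA) to identify the emotion labels contained in a user's query and then combines them with the widely used BM25 retrieval model. The similarity between the query and each document is thus computed from both label and text features, helping users find documents relevant to their emotional problems. Experimental results show that ICA can separate the features of different emotion labels and thereby improve multi-label document classification, and that adding label information to the retrieval model further improves retrieval precision.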
The aim of information retrieval (IR) is to retrieve from a database the set of documents relevant to a user's query. This thesis builds a retrieval model based on the query-by-example scheme. The document database used herein is drawn from a mental health website, PsychPark. Since each document in PsychPark has been annotated with emotion labels (topics), we use independent component analysis (ICA) for multi-label classification of user queries. The identified labels are then combined with the BM25 retrieval model to calculate the similarity between a user's query and each document. The experimental results show that ICA can separate the features of different labels, which improves the performance of multi-label document classification. Additionally, incorporating the label information further improves retrieval precision.
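The idea of combining label and text evidence can be illustrated with a minimal sketch. The abstract does not give the thesis's actual combination formula, so everything specific here is an assumption made for illustration: the Jaccard overlap used as the label similarity, the linear weighting with the hypothetical parameter `alpha`, the normalization of BM25 scores by their maximum, and the toy documents and labels. The query labels are taken as given; in the thesis they would come from the ICA-based multi-label classifier, which is not reproduced here.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Okapi BM25 score of each tokenized document for a tokenized query."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for term in set(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[term] * (k1 + 1) / norm
        scores.append(s)
    return scores

def label_similarity(query_labels, doc_labels):
    """Jaccard overlap of two label sets (an assumed measure, not the thesis's)."""
    q, d = set(query_labels), set(doc_labels)
    return len(q & d) / len(q | d) if q | d else 0.0

def combined_scores(query, query_labels, docs, doc_labels, alpha=0.5):
    """Weighted sum of max-normalized BM25 and label overlap (assumed scheme)."""
    text = bm25_scores(query, docs)
    top = max(text) or 1.0  # avoid dividing by zero when nothing matches
    return [
        (1 - alpha) * t / top + alpha * label_similarity(query_labels, labels)
        for t, labels in zip(text, doc_labels)
    ]

# Toy data, invented for illustration only.
docs = [
    ["sleep", "anxiety", "work"],
    ["sleep", "diet", "exercise"],
    ["family", "conflict", "anger"],
]
doc_labels = [{"anxiety"}, {"insomnia"}, {"anger"}]
scores = combined_scores(["sleep", "work"], {"anxiety"}, docs, doc_labels)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In this toy setting the first document matches the query on both text and label, the second on text only, and the third on neither, so the ranking follows that order. Setting `alpha` to 0 reduces the score to plain BM25, which gives a baseline for measuring what the label information contributes.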