Text Trend Analysis via Significant Term A Based on Indonesia News

Advisor: 王經篤

Abstract


This thesis presents the frequency distribution of significant terms over past time periods as a basis for text trend analysis of an Indonesian newspaper. The approach consists of two steps: (1) Data Preprocessing and (2) Term History Generation. The former uses agent techniques to download news articles automatically and extract their contents. The latter uses an existing external memory approach to extract significant terms while computing their term histories at the same time. In this study, a significant term is a series of words that is significant enough to represent an event, action, or concept. The term history of a term is its frequency distribution over consecutive time periods, treated as time series data. The experimental data comprise one year of the Indonesian newspaper "Serambi", containing 28,071 articles. The experimental results show that the approach is attractive and meaningful for foreigners who want to follow trends and events in the Aceh province of Indonesia, which is the focus of most Serambi coverage.

Keywords: significant term, trend analysis, text mining
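The term history defined in the abstract can be illustrated with a minimal Python sketch. It is not the external memory approach the thesis uses: it counts one given term in memory, assumes whitespace tokenization, and groups articles by calendar month. The function name `term_history`, the `(year, month)` period keys, and the sample articles are all hypothetical and made up for illustration.

```python
from collections import Counter
from datetime import date


def term_history(articles, term, periods):
    """Return the frequency of `term` in each period, in the order given.

    articles: iterable of (publication_date, text) pairs (assumed format).
    term:     a series of words, e.g. "banda aceh".
    periods:  list of (year, month) keys defining consecutive time periods.
    """
    words = term.lower().split()
    n = len(words)
    counts = Counter()
    for published, text in articles:
        tokens = text.lower().split()  # naive tokenization, an assumption
        # Count every position where the word sequence matches the term.
        hits = sum(
            1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words
        )
        counts[(published.year, published.month)] += hits
    # Periods with no matching articles get a frequency of zero.
    return [counts.get(period, 0) for period in periods]


# Toy data, invented for this example only.
articles = [
    (date(2010, 1, 5), "banjir di banda aceh hari ini"),
    (date(2010, 1, 20), "warga banda aceh dan banda aceh besar"),
    (date(2010, 3, 2), "pasar banda aceh ramai"),
]
periods = [(2010, 1), (2010, 2), (2010, 3)]

history = term_history(articles, "banda aceh", periods)
print(history)  # [3, 0, 1]
```

The resulting list is a time series with one value per period, so a term that rises or falls across months can be spotted directly. Choosing which terms count as significant is a separate step that this sketch does not attempt.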
