
Detailed Record

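Author (Chinese): 林映秀
Author (English): Lin, Ying-Hsiu
Thesis title (Chinese): 學術論文摘要的自動文步分析
Thesis title (English): Automatically Identify Moves in Academic Abstracts
Advisor (Chinese): 張俊盛
Advisor (English): Chang, Jason S.
Degree: Master's
University: National Tsing Hua University
Department: Institute of Information Systems and Applications
Student ID: 9565701
Year of publication (ROC calendar): 98 (2009)
Academic year of graduation (ROC calendar): 97
Language: English
Number of pages: 46
Keywords (translated from Chinese): move structure; abstract; feature; machine learning model
Keywords (English): Move Structure; Abstract; Feature; Maximum Entropy model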
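In this thesis, we use supervised machine learning to automatically extract important features from academic abstracts, learn the relationship between move labels and those features, and train a machine learning model on that relationship. The trained model can then automatically label the move structure of an abstract.
In the training stage, we take a small set of abstracts manually annotated with moves, extract important features from them, and train a machine learning model to learn the relationship between features and moves, so that abstracts can be analyzed for moves automatically. At run time, we apply the trained model to tag the moves of abstracts written by researchers in two fields, computer science and applied linguistics. In a human evaluation, the tagger achieves an average accuracy above 80% in both fields, which indicates that the method can successfully label the moves of academic abstracts.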
This paper presents a novel method for automatically identifying the move structure of academic abstracts, with the aim of assisting non-native speakers of English in academic writing. In our approach, we use a small set of manually tagged abstracts as a training corpus and analyze their significant features. A Maximum Entropy (ME) model is employed to classify the moves in a given abstract. The approach involves automatically learning syntactic features and automatically building a statistical model. The proposed method outperforms previous research with significantly higher accuracy. Our results show that the ME model can suitably model abstract structure, and imply that a more flexible move tagger can easily be applied to different research domains using a small set of manually tagged abstracts.
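To make the training and tagging stages described above concrete, here is a minimal sketch in Python. It is not the thesis's implementation. It relies on the fact that a conditional Maximum Entropy classifier is equivalent to multinomial logistic regression, and it fits the weights with plain stochastic gradient ascent, which is a stand-in for whatever training procedure the thesis used. The move labels, the toy sentences, the word and position features, and all function names are assumptions made for illustration only.

```python
import math
from collections import defaultdict

# Toy, hand-written examples (hypothetical; NOT the thesis corpus).
# Each item: (move label, sentence index within the abstract, sentence text).
TRAIN = [
    ("background", 0, "academic writing is hard for non-native speakers"),
    ("background", 0, "previous studies describe the structure of abstracts"),
    ("purpose", 1, "this paper presents a method for tagging moves"),
    ("purpose", 1, "we aim to identify moves in abstracts"),
    ("method", 2, "we train a classifier on manually tagged abstracts"),
    ("method", 2, "the model uses features extracted from each sentence"),
    ("result", 3, "experiments show that the tagger reaches high accuracy"),
    ("result", 3, "results indicate the approach outperforms the baseline"),
]


def extract_features(position, sentence):
    """Assumed feature set: a bias term, the sentence position, and its words."""
    feats = {"bias": 1.0, f"position={position}": 1.0}
    for word in sentence.lower().split():
        feats[f"word={word}"] = 1.0
    return feats


def move_probabilities(weights, labels, feats):
    """Softmax over per-label linear scores: p(move | sentence features)."""
    scores = {
        y: sum(weights.get((y, f), 0.0) * v for f, v in feats.items())
        for y in labels
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}


def train(examples, epochs=200, rate=0.1, l2=0.01):
    """Fit weights by stochastic gradient ascent on the L2-regularized
    conditional log-likelihood (observed minus expected feature counts)."""
    labels = sorted({label for label, _, _ in examples})
    data = [(label, extract_features(pos, text)) for label, pos, text in examples]
    weights = defaultdict(float)
    for _ in range(epochs):
        for gold, feats in data:
            probs = move_probabilities(weights, labels, feats)
            for y in labels:
                error = (1.0 if y == gold else 0.0) - probs[y]
                for f, v in feats.items():
                    weights[(y, f)] += rate * (error * v - l2 * weights[(y, f)])
    return weights, labels


def tag(weights, labels, position, sentence):
    """Run-time step: return the most probable move for one sentence."""
    probs = move_probabilities(weights, labels, extract_features(position, sentence))
    return max(probs, key=probs.get)


weights, labels = train(TRAIN)
for label, pos, text in TRAIN:
    print(label, "->", tag(weights, labels, pos, text))
```

In this toy data the sentence position lines up exactly with the move label, which makes the problem far easier than real abstracts. A realistic setup would need held-out evaluation and a richer feature set of the kind the thesis describes.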
ABSTRACT (IN CHINESE)
ABSTRACT
ACKNOWLEDGEMENT
TABLE OF CONTENTS
CHAPTER 1 INTRODUCTION
CHAPTER 2 RELATED WORK
2.1 MACROSTRUCTURE OF RAS
2.2 STRUCTURAL REPRESENTATION OF ABSTRACTS
2.3 IDENTIFYING MOVES AS TEXT CLASSIFICATION
CHAPTER 3 METHOD
3.1 PROBLEM STATEMENT
3.2 LEARNING MOVE SENTENCE RELATION
3.2.1 Collecting Abstracts from the web
3.2.2 Manually label abstracts with moves
3.2.3 Select the features for machine learning
3.2.4 Train a machine learning model
3.3 RUN-TIME AUTOMATIC MOVE-TAGGING
CHAPTER 4 EXPERIMENTAL SETTING
4.1 EXPERIMENTAL SETTING
4.2 EVALUATION METRICS
CHAPTER 5 EVALUATION RESULTS
5.1 EVALUATION RESULTS
5.2 DISCUSSION AND ERROR ANALYSIS
CHAPTER 6 CONCLUSION AND FUTURE WORK
REFERENCES
APPENDIX A - GUIDELINES FOR HUMAN ANNOTATION OF ABSTRACT
 
 
 
 