
Frequency, Collocation, and Statistical Modeling of Lexical Items: A Case Study of Temporal Expressions in Two Conversational Corpora

Abstract


This study examines how different dimensions of corpus frequency data may affect the outcome of statistical modeling of lexical items. Our analysis focuses mainly on a recently constructed corpus of elderly speakers, which is used to reveal patterns in older adults' language use. A conversational corpus contributed by speakers in their 20s serves as complementary material. The target words are temporal expressions, which might reveal how the speech produced by the elderly is organized. We conduct divisive hierarchical clustering analyses based on two different dimensions of corpus data, namely raw frequency distributions and collocation-based vectors. The results show that the target terms are clustered differently depending on which dimension of the data is used as input: analyses based on frequency distributions and analyses based on collocational patterns yield distinct groupings. Specifically, statistically based collocational analysis generally produces more distinct clusters, and it differentiates temporal terms more finely than analysis based on raw frequency does.
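As a rough illustration of the procedure described above, the following sketch runs a divisive (top-down) hierarchical clustering over the same set of words twice, once with raw frequency vectors and once with collocation-based vectors. Everything in it is an assumption made for illustration: the four English glosses, the numbers in both vector tables, the use of cosine distance, and the DIANA-style splitting rule are hypothetical and are not taken from the study, which does not specify these details in the abstract.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def avg_dist(item, group, vectors):
    """Average distance from item to the other members of group."""
    others = [g for g in group if g != item]
    if not others:
        return 0.0
    return sum(cosine_distance(vectors[item], vectors[o]) for o in others) / len(others)

def diameter(cluster, vectors):
    """Largest pairwise distance inside a cluster (0.0 for singletons)."""
    return max((cosine_distance(vectors[a], vectors[b])
                for a in cluster for b in cluster if a != b), default=0.0)

def diana_split(cluster, vectors):
    """Split one cluster in two with a DIANA-style splinter group."""
    rest = list(cluster)
    # Seed the splinter group with the item that is least like the others.
    seed = max(rest, key=lambda w: avg_dist(w, rest, vectors))
    rest.remove(seed)
    splinter = [seed]
    moved = True
    while moved and len(rest) > 1:
        moved = False
        best, best_gain = None, 0.0
        for w in rest:
            # Positive gain: w is closer to the splinter group than to the rest.
            gain = avg_dist(w, rest, vectors) - avg_dist(w, splinter + [w], vectors)
            if gain > best_gain:
                best, best_gain = w, gain
        if best is not None:
            rest.remove(best)
            splinter.append(best)
            moved = True
    return splinter, rest

def divisive_clustering(vectors, k):
    """Start from one cluster and keep splitting the widest one until k remain."""
    clusters = [sorted(vectors)]
    while len(clusters) < k:
        widest = max(clusters, key=lambda c: diameter(c, vectors))
        if len(widest) < 2:
            break
        clusters.remove(widest)
        clusters.extend(diana_split(widest, vectors))
    return [sorted(c) for c in clusters]

# Hypothetical raw frequencies of each word in three corpus files.
freq_vectors = {
    "now": [10, 0, 1],
    "today": [9, 1, 0],
    "before": [0, 8, 1],
    "formerly": [1, 9, 0],
}

# Hypothetical association scores of each word with three collocates.
colloc_vectors = {
    "now": [3.0, 0.1, 0.2],
    "today": [0.2, 2.5, 0.1],
    "before": [0.1, 2.8, 0.3],
    "formerly": [0.2, 0.1, 3.1],
}

freq_clusters = divisive_clustering(freq_vectors, 2)
colloc_clusters = divisive_clustering(colloc_vectors, 2)
print("frequency-based:  ", freq_clusters)
print("collocation-based:", colloc_clusters)
```

With these toy inputs the two runs group the same four words differently, which is the kind of contrast the abstract reports. The direction and size of any difference in real data depend on the corpus, the association measure, and the distance function chosen.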
