Since 1986, the authors and their colleagues at the Institute of Computational Linguistics, Peking University (ICL/PKU) have spent 25 years building the Comprehensive Language Knowledge Base (CLKB). The multi-level annotated corpus of Modern Chinese (Mandarin) is one of the important language knowledge bases within CLKB. After a brief overview of CLKB, this paper describes the guiding ideas behind ICL/PKU's development of the multi-level annotated corpus, the results achieved so far, and how the corpus has been applied.