Mandarin Chinese is a highly flexible and context-sensitive language, which makes it difficult to process computationally. Segmentation poses a further problem, because lexical units are not clearly delimited in Chinese sentences. This paper treats segmentation as part of parsing, using logic programming techniques. To handle the considerable freedom with which empty categories occur in Mandarin Chinese, the C-Command and Subjacency Conditions are embedded implicitly in the integrated segmentation-parsing model to decide which constituents have been moved and/or deleted. A grammar formalism is proposed that offers uniform treatment of movement, support for an arbitrary number of movements, automatic detection of grammar errors in advance, and clear declarative semantics. A parser generator translates the grammar rules and generates optimized code. Graph unification supporting multiple-valued, negated and disjunctive features is adopted to express co-occurrence restrictions and information transfer among constituents in this model. Many common linguistic phenomena of Chinese sentences are represented in this environment, including topic-comment structures, ba-constructions, bei-constructions, relative clause constructions, appositive clause constructions and serial verb constructions. The parsing of long Chinese sentences is also addressed.
中文是一種使用非常彈性且前後文相關的語言,因此電腦很難處理中文語句。除此之外,由於中文句子語彙之間並沒有明顯的分隔符號,斷詞爲另一個困難的問題。這篇論文採用邏輯程式的技術,將斷詞視爲剖析的一部分。爲了處理中文空詞高自由度的使用,論文將c-command和subjacency兩項限制條件,放在整合的剖析-斷詞模型中,以決定那些成分被移走且/或刪除。論文也提出一種語法型式化語言,其具有均一處理移位現象及任意個數的移位、預先自動偵測語法錯誤、和清楚的敘述語意等特點。剖析器產生裝置將語法規則轉換成程式碼,並作最佳化。圖形聯倂支持多値、反面、離接等結構,在這個模型中,被採用來表示成分間的共存限制和資訊傳遞。許多常見的語言現象如主題-評論結構、把字句、被字句、關係子句、同位句、遞續結構等,都在這個環境中表示出來。最後,本文也討論中文長句的處理。
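The idea of treating segmentation as part of parsing can be illustrated with a minimal sketch. It is not the paper's logic-programming model: the lexicon, the three-rule grammar, and the names `LEXICON`, `GRAMMAR`, `parse`, `parse_seq` and `segment` are all hypothetical, and Python stands in for a logic programming language. The parser runs directly over the unsegmented character string, so a word boundary is accepted only if it takes part in a complete parse.

```python
from typing import Dict, Iterator, List, Set, Tuple

# Hypothetical toy lexicon: surface string -> possible categories.
LEXICON: Dict[str, Set[str]] = {
    "他": {"N"},      # he
    "研究": {"V"},    # to study
    "研究生": {"N"},  # graduate student
    "生命": {"N"},    # life
    "命": {"N"},      # fate
}

# Hypothetical toy grammar (no left recursion, so the search terminates).
GRAMMAR: Dict[str, List[List[str]]] = {
    "S": [["NP", "VP"]],
    "NP": [["N"]],
    "VP": [["V", "NP"]],
}


def parse(cat: str, text: str, pos: int) -> Iterator[Tuple[int, List[str]]]:
    """Yield (end, words) for every way `cat` can cover text[pos:end]."""
    if cat in GRAMMAR:
        for rhs in GRAMMAR[cat]:
            yield from parse_seq(rhs, text, pos)
    else:
        # Lexical category: try every lexicon entry that starts at `pos`.
        # This is where segmentation happens, inside the parser itself.
        for word, cats in LEXICON.items():
            if cat in cats and text.startswith(word, pos):
                yield pos + len(word), [word]


def parse_seq(rhs: List[str], text: str, pos: int) -> Iterator[Tuple[int, List[str]]]:
    """Yield (end, words) for every way the category sequence `rhs` can start at `pos`."""
    if not rhs:
        yield pos, []
        return
    for mid, left in parse(rhs[0], text, pos):
        for end, right in parse_seq(rhs[1:], text, mid):
            yield end, left + right


def segment(text: str) -> List[List[str]]:
    """Return the segmentations of `text` that parse as a complete sentence."""
    return [words for end, words in parse("S", text, 0) if end == len(text)]


print(segment("他研究生命"))
```

In this toy setting the string 他研究生命 can be cut as 他 / 研究 / 生命 or as 他 / 研究生 / 命, but only the first cut fits the grammar, so it is the only segmentation returned. Backtracking over lexical candidates plays the role that a Prolog engine would play natively.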