

Patent Terminology Classification Applying Relaxation Labeling

Advisor: 蘇豐文 (Von-Wun Soo)


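Intellectual property rights play a key role in the competitiveness of enterprises and nations, and patent documents give those rights effective legal protection. Because the number of patent documents is enormous and patent claims are written in widely varying formats, most patent retrieval, patent analysis, and infringement-avoidance work still relies on experts working manually. Patent analysis in particular still lacks effective automated support.

In previous research, a variety of text classifiers have been widely applied to document classification, where they quickly and effectively assist manual classification. In this study, we apply a text classifier to patent documents to classify their terminology automatically and thereby assist patent classification.

This thesis proposes a method that uses relaxation labeling to classify terminology in patent documents. First, we parse CMP patent documents. Exploiting the characteristics of patent documents, we use natural language processing techniques and regular expressions to extract the terms in the patent claims together with descriptive information about their structure, attributes, and materials, and we then build a taxonomy of CMP vocabulary. Next, we compute from the training data the probabilities of the classes in the taxonomy and the compatibility coefficients between them. The relaxation labeling computation then determines the most suitable class for each term.

We verify the accuracy of the method on patent terminology with two experiments. The experiments show that, provided the semantic content of the patent claims is parsed and extracted effectively, the method achieves a certain level of accuracy. It offers users another way to classify domain-specific vocabulary and supports related tasks such as patent retrieval, identification of similar terms, and construction of domain thesauri.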


Intellectual property (IP) is a powerful tool for national economic growth and a source of competitive advantage for innovative businesses. From a legal standpoint, patents are the instrument that protects IP. Because the number of patent documents keeps growing and claims are written in many different styles, patent analysis tasks such as patent retrieval, synonym identification, and domain thesaurus construction remain largely manual. In this thesis, we propose relaxation labeling as a text classification approach for classifying the terminology in patent documents. We first built a taxonomy of the CMP (chemical-mechanical polishing) domain from the training data. The terms, together with information about their relations, attributes, and materials, were extracted using NLP techniques and regular expressions. The probabilities and compatibility coefficients, which are the parameters of the relaxation labeling model, were estimated from the training data. During relaxation labeling, the probability of each class for each unclassified term is updated iteratively, and the most appropriate class becomes apparent once the model converges. Given the extracted semantic information, the experimental results show that relaxation labeling is adequate for terminology classification and achieves a certain level of accuracy. We believe that relaxation labeling could be useful in patent analysis work.
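
The iterative update described in the abstract can be illustrated with a minimal sketch. It uses the classic relaxation labeling update rule (each class probability is scaled by one plus the support it receives from neighboring terms, then renormalized), which may differ from the exact formulation in the thesis. The class names, the neighbor links, and the compatibility values below are hypothetical and chosen only for illustration; they are not taken from the thesis data.

```python
def relaxation_labeling(probs, neighbors, compat, max_iter=100, tol=1e-6):
    """Iteratively update class probabilities for each term.

    probs:     list of per-term probability lists (one entry per class)
    neighbors: neighbors[i] is the list of term indices related to term i
    compat:    compat[a][b] in [-1, 1], compatibility of class a on a term
               with class b on a neighboring term
    """
    n_classes = len(compat)
    probs = [row[:] for row in probs]  # work on a copy
    for _ in range(max_iter):
        new_probs = []
        for i, row in enumerate(probs):
            nbrs = neighbors[i]
            updated = []
            for a in range(n_classes):
                # Support for class a: average over neighbors of the
                # expected compatibility with the neighbor's current labels.
                support = 0.0
                if nbrs:
                    support = sum(
                        compat[a][b] * probs[j][b]
                        for j in nbrs
                        for b in range(n_classes)
                    ) / len(nbrs)
                updated.append(row[a] * (1.0 + support))
            total = sum(updated)
            # Renormalize; keep the old row if every class was suppressed.
            new_probs.append([v / total for v in updated] if total > 0 else row[:])
        delta = max(
            abs(x - y)
            for old, new in zip(probs, new_probs)
            for x, y in zip(old, new)
        )
        probs = new_probs
        if delta < tol:  # converged
            break
    return probs


# Hypothetical two-class taxonomy and three terms (illustrative only).
classes = ["component", "material"]
initial = [
    [1.0, 0.0],  # term 0: pre-classified as "component"
    [0.0, 1.0],  # term 1: pre-classified as "material"
    [0.5, 0.5],  # term 2: unclassified
]
links = [[2], [], [0]]              # term 2 is related to term 0 only
compat = [[0.5, -0.5], [-0.5, 0.5]]  # same class supports, other class opposes

final = relaxation_labeling(initial, links, compat)
labels = [classes[row.index(max(row))] for row in final]
print(labels)
```

In this toy setup the unclassified term is linked only to a term already labeled "component", so its probability mass shifts toward that class on every iteration until the change falls below `tol`.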




