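Patent rights are exclusive: within the limits set by law, they grant inventors a monopoly over the technical content of inventions they have fully disclosed. The aim is to prevent others from practicing an invention without consent, and also to keep companies from duplicating investment and R&D on inventions that already exist. To know both themselves and their competitors, inventors and business owners must understand their rivals and follow the pulse and trends of invention worldwide. Patent information reveals warning signs about patent rights; through patent retrieval, business owners can track technologies and development directions and thereby reduce the large cost of infringement. This study proposes a hybrid algorithm that combines Honey-Bee Mating Optimization (HBMO) with the Support Vector Machine (SVM) to address the problem of selecting SVM parameters for classification. The method proceeds as follows. First, the CKIP Chinese automatic segmentation system developed by Academia Sinica is used to segment the title and abstract of each patent document into words. Next, term frequency-inverse document frequency (TF-IDF) is computed from document and keyword frequencies to select the important keywords in each patent document. Then the SVM whose parameters are selected by HBMO, the main subject of this thesis, classifies the patent documents, taking the high-dimensional sparse matrix produced by feature extraction as the classifier input. Finally, patent documents in the Chemical Mechanical Polishing (CMP) field are used as a case study to test the effectiveness of the automatic classification system.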
Patent rights are exclusive. Within the limits of the law, inventors can protect their rights and hold a monopoly over the inventions they have disclosed. Others may not use an invention without the inventor's permission. Companies try to avoid investing in research and development on inventions that are already protected by patents. Patent retrieval and categorization technologies are used to surface patent information and so reduce the cost of infringement. In this research, we propose a novel method that integrates Honey-Bee Mating Optimization with Support Vector Machines (SVM) for patent categorization. First, the CKIP method is used to extract phrases from the patent title and summary. Then, we calculate the probability that a specific key phrase contains a certain concept, based on Term Frequency - Inverse Document Frequency (TF-IDF) methods. By combining the frequencies and probabilities of key phrases generated from Honey-Bee Mating Optimization, the proposed method is expected to obtain more representative input values for the SVM model. Finally, this research uses patents on Chemical Mechanical Polishing (CMP) as case examples to illustrate and demonstrate the proposed methodology at work, with superior results.
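The TF-IDF keyword-selection step can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant is used, so this assumes the common form: term frequency normalized by document length, multiplied by log(N / df). The token lists are hypothetical stand-ins for CKIP segmentation output, and the names `tf_idf` and `top_keywords` are illustrative only, not taken from the thesis.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights for pre-segmented documents.

    docs: list of token lists (assumed to be the output of a segmenter).
    Returns one {term: weight} dict per document.
    Assumed variant: tf = count / doc_length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_keywords(doc_weights, k):
    """Return the k highest-weighted terms; ties are broken alphabetically."""
    return sorted(doc_weights, key=lambda t: (-doc_weights[t], t))[:k]


# Hypothetical token lists standing in for segmented patent titles/abstracts.
docs = [
    ["polishing", "pad", "slurry", "wafer"],
    ["slurry", "abrasive", "particle", "wafer"],
    ["pad", "conditioner", "diamond", "wafer"],
]

weights = tf_idf(docs)
keywords = [top_keywords(w, 2) for w in weights]
```

Under this variant, a term that occurs in every document (here `wafer`) receives weight zero, which is why it is never selected as a keyword. The per-document weight dictionaries would then be assembled into the sparse feature matrix passed to the classifier.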