Prior art retrieval is the process of identifying relevant prior art for a given patent (or patent application). It is essential to patentability (or novelty) and invalidity searches, and its effectiveness greatly affects the validity of these searches. In this study, we aim to improve the effectiveness of prior art retrieval by proposing a summary-based prior art retrieval (SPAR) technique. Our rationale is that the sentences in a patent document are not equally important in describing the invention claimed in the patent. Thus, we employ a text summarization approach to develop an automatic patent summarization technique that selects the important sentences in a query patent document, and we then use the resulting patent summary for prior art retrieval. For evaluation purposes, we collect 78,225 patent documents from the United States Patent and Trademark Office (USPTO) website and conduct a series of experiments using a traditional full-text-based prior art retrieval technique as the performance benchmark. Our evaluation results suggest that the proposed SPAR technique significantly outperforms the benchmark technique. Moreover, our evaluation results indicate that the inclusion of feature selection and of our non-prior art selection method in patent summarization learning improves the effectiveness of prior art retrieval.
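To make the summary-then-retrieve idea concrete, the following is a minimal sketch under stated assumptions, not the SPAR implementation. The abstract describes a learned patent summarizer with feature selection; the sketch swaps in a simple unsupervised heuristic (mean inverse document frequency of a sentence's terms) to pick important sentences, and ranks candidates by TF-IDF cosine similarity. All function names (`summarize`, `retrieve`, and the helpers) and the toy documents are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def idf_table(docs):
    """Smoothed inverse document frequency over the candidate collection."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(text, idf):
    """Sparse TF-IDF vector; terms unseen in the collection get weight 0."""
    tf = Counter(tokenize(text))
    return {t: c * idf.get(t, 0.0) for t, c in tf.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def summarize(query_text, idf, k=2):
    """Assumed heuristic (not the paper's learned model): keep the k
    sentences with the highest mean IDF, in their original order."""
    sentences = split_sentences(query_text)

    def score(sentence):
        toks = tokenize(sentence)
        return sum(idf.get(t, 0.0) for t in toks) / len(toks) if toks else 0.0

    order = sorted(range(len(sentences)),
                   key=lambda i: score(sentences[i]), reverse=True)
    return " ".join(sentences[i] for i in sorted(order[:k]))


def retrieve(query_text, corpus, k=2, top_n=3):
    """Rank candidates against the query summary rather than the full text."""
    idf = idf_table(corpus)
    summary_vec = tfidf_vector(summarize(query_text, idf, k), idf)
    scored = [(cosine(summary_vec, tfidf_vector(doc, idf)), i)
              for i, doc in enumerate(corpus)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:top_n]]


if __name__ == "__main__":
    corpus = [
        "A rotor blade assembly for a wind turbine with a pitch control mechanism.",
        "A method for brewing coffee using a pressurized water chamber.",
        "A pitch control mechanism adjusts rotor blade angle in a wind turbine.",
    ]
    query = ("The present invention relates generally to machines. "
             "A wind turbine rotor blade uses a pitch control mechanism "
             "to adjust blade angle. Various embodiments are described herein.")
    print(retrieve(query, corpus, k=1, top_n=2))
```

A full-text baseline in the same sketch would pass the whole query document to `tfidf_vector` instead of its summary, which is the kind of comparison the evaluation above describes.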