  • Conference paper

巧妙運用全文檢索套件加速關連式資料庫內文查詢效能之實作探討

A Clever Use of Full-Text Search Component to Expedite Text Query Performance for Relational Database

Abstract


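Once information systems reach a certain level of maturity, full-text search becomes an indispensable option in composite queries. To accommodate the diversity of data, newer relational databases (RDBs) have introduced special column types, such as long strings (Long VarChar) and large text fields (TEXT, or BYTE fields that can hold text files). In some cases, special data is even stored as files on a file server outside the database. These derived fields enrich how data can be referenced and used, and they make full-text search all the more important in information systems. An RDB maintains and queries data through SQL commands. Because SQL is rigorously structured, every SQL command must pass through multiple layers of validation to guarantee transactional integrity, and the data is then retrieved along the best-performing path. Most SQL optimization strategies, however, do not cover full-text search over these special text fields. Even if an index is forced onto a long string column, it wastes storage space while doing little to improve query performance, and large text data cannot be indexed at all.

A full-text search package, introduced without adding load to the RDB, integrates heterogeneous data sources, processes all text character by character (or term by term), and quickly produces an "index database" that is completely independent of the data sources. Applications then run keyword queries directly against the index database. Full-text search performs far better than an RDB text index, and it even offers more varied composite query options than the RDB does. The full-text search package and the RDB operate independently yet can be tightly linked. During a search the two do not compete for system resources, so even when more than a thousand users issue large numbers of keyword queries at once, the normal operation of the RDB is unaffected. When necessary, passing the primary index value allows the index database to drill through to the RDB and bring up the detail records associated with that value. Used together skillfully, the two give users a clearly noticeable gain in query performance, improved stability of data processing, and more comprehensive query functionality, while avoiding the risks of offline data processing.

In addition to introducing the basic architecture, platform setup, and deployment model of the full-text search package, this paper uses two systems actually built and deployed at National Chengchi University, namely course search for course selection and official document search, to illustrate how the package is applied and the concrete results achieved.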
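The flow the abstract describes (build an index that lives outside the RDB, answer keyword queries from that index alone, then pass primary keys back to the RDB only when detail rows are needed) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `courses` table, the sample rows, and the functions `tokenize`, `build_index`, `search`, and `drill_through` are hypothetical, an in-memory SQLite database stands in for the production RDB, and a plain dictionary stands in for the full-text search package. It is not the implementation used in the paper.

```python
import re
import sqlite3
from collections import defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase word tokens. A real package would
    # index character by character or by dictionary terms instead.
    return re.findall(r"\w+", text.lower())


def build_index(conn):
    # One pass over the RDB builds an inverted index (term -> primary keys)
    # that is held entirely outside the database.
    index = defaultdict(set)
    rows = conn.execute("SELECT course_id, description FROM courses")
    for course_id, description in rows:
        for term in tokenize(description):
            index[term].add(course_id)
    return index


def search(index, query):
    # Keyword lookup touches only the index, never the RDB.
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())  # AND semantics across terms
    return result


def drill_through(conn, keys):
    # Only now is the RDB queried, by primary key, for the detail rows.
    if not keys:
        return []
    placeholders = ",".join("?" * len(keys))
    sql = (
        "SELECT course_id, title FROM courses "
        f"WHERE course_id IN ({placeholders}) ORDER BY course_id"
    )
    return conn.execute(sql, sorted(keys)).fetchall()


# Assumed sample data; SQLite stands in for the production RDB.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE courses ("
    "course_id INTEGER PRIMARY KEY, title TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO courses VALUES (?, ?, ?)",
    [
        (1, "Database Systems", "Relational database design and SQL query tuning"),
        (2, "Information Retrieval", "Full-text search, inverted index design, ranking"),
        (3, "Data Warehousing", "Database design for analytics and drill-through reports"),
    ],
)

index = build_index(conn)
hits = search(index, "database design")
rows = drill_through(conn, hits)
print(rows)
```

In this sketch the keyword search never issues SQL; the RDB is touched once when the index is built and again only for the primary-key lookup, which is the separation of load the abstract attributes to the index database.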

Parallel Abstract


Full-text search in information systems is now well developed and has become an indispensable option in composite queries against a database system. The data may be held in text-based fields such as LongVarChar or TEXT, or even in external files outside the database. When the search is issued via an SQL command, however, one usually needs an appropriate indexing scheme, especially for long strings, to ensure acceptable performance. In this work, we introduce the use of a full-text search component alongside a relational database in an information system. The search component creates an independent index database, off-line, from various sources of text-based data, and this index can be searched directly with keywords. Full-text search of this kind is not only more efficient but also provides functionality that is not available in a traditional relational database. Such a system allows thousands of people to query simultaneously, independently of the relational database. The query results are designed to allow drilling through to the relational database when necessary to retrieve relevant information in more detail. When the two types of databases are integrated appropriately, users can readily observe the improvement in efficiency and in the stability of information processing. In this paper, we introduce the basic architecture and the construction process of the full-text search component that we have adopted. We also use the information system at National Chengchi University as an example to demonstrate the usefulness of this kind of system in applications such as course selection inquiries and official document inquiries.

Parallel Keywords

RDB textsearch Tornado informix IRMS
