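An Information Retrieval System for Rapidly Exploring Proprietary Experimental Data by Searching and Mining Biomedical Terms in Related Public Literature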

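In the early stages of biomedical research, finding relevant scientific literature and mapping researchers' own proprietary biomedical experimental data to that literature is an important task. This study proposes a search system for fast retrieval over a large body of biomedical literature. The public literature database used is PubMed, the platform maintained by the U.S. National Library of Medicine. A text-mining technique, named entity recognition, is used to extract protein names; the rule-based method proposed in this study then normalizes those protein names to identifiers, which are used to link to the experimental data database. The goal is to let users view, within a short time and in graphical form, their proprietary experimental data alongside the relevant scientific literature, so that they can gauge the importance and relevance of the experimental data within the retrieved documents and go on to discover researchable topics or more valuable information.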
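The normalize-then-link step described in the abstract can be illustrated with a minimal sketch. This is not the rule set proposed in the study, which the abstract does not spell out. The lexicon, the placeholder identifiers (`PROT:0001`, ...), the toy measurements, and the function names `normalize`, `to_identifier`, and `link_mentions` are all assumptions made for illustration, and the input mentions are assumed to come from some upstream named entity recognition step.

```python
import re

# Hypothetical lexicon: normalized surface form -> identifier.
# The identifiers are placeholders, not real database accessions.
LEXICON = {
    "tp53": "PROT:0001",
    "p53": "PROT:0001",
    "brca1": "PROT:0002",
}

# Hypothetical private experiment table keyed by identifier (made-up values).
EXPERIMENTS = {
    "PROT:0001": [0.8, 1.2, 2.5],
    "PROT:0002": [1.1, 0.9, 1.0],
}


def normalize(mention):
    """Apply simple illustrative rules to a protein-name mention."""
    key = mention.lower()
    # Rule 1: drop generic head words such as "protein" or "gene".
    key = re.sub(r"\b(protein|gene)\b", "", key)
    # Rule 2: drop hyphens, spaces, and any other non-alphanumeric characters.
    key = re.sub(r"[^a-z0-9]", "", key)
    return key


def to_identifier(mention):
    """Return the identifier for a mention, or None if no rule matches."""
    return LEXICON.get(normalize(mention))


def link_mentions(mentions):
    """Map extracted mentions to the private data rows they identify."""
    linked = {}
    for mention in mentions:
        pid = to_identifier(mention)
        if pid is not None and pid in EXPERIMENTS:
            linked[pid] = EXPERIMENTS[pid]
    return linked
```

In this sketch, spelling variants such as "BRCA-1 protein" and "brca1" collapse to the same key before lookup, so several mentions in a retrieved document resolve to a single row of private data that could then be plotted.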

Keywords

Information retrieval; text mining; bioinformatics

Parallel Abstract

In the early stages of biomedical research, mapping researchers' proprietary experimental data to the public research literature is an important task. In this paper, we propose a search engine that efficiently retrieves large-scale biomedical literature collected from PubMed. We apply a named entity recognition tool, a text-mining technique, to extract protein names from the literature. The protein names are then normalized to IDs that can be linked to the researchers' proprietary experiment databases, and web techniques are used to automatically plot charts of the relevant proprietary data. Through these processes, researchers can efficiently see how their proprietary data relate to public papers, which can also help them identify further research topics.

Parallel Keywords

Information retrieval; text mining; bioinformatics


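Cited By

陳熙 (2015). 文獻與私有生醫實驗資料連結之關聯度排序演算法 [A relevance-ranking algorithm for linking literature to private biomedical experimental data] (Master's thesis, National Taiwan University). Airiti Library. https://doi.org/10.6342/NTU.2015.02102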
