Using Information Extraction Methods to Find Gene-Disease Relationships in the Biomedical Literature

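Biotechnology is advancing rapidly. As research has moved from gene sequences to gene function, our understanding of human biology has steadily improved. The invention of the microarray, together with maturing technology and falling prices, now gives us the opportunity to observe the full picture of gene activity in the human body, and a flood of gene-related information has followed. Advances in genetic technology have also had a major impact on clinicians: the shift from studying a single gene to studying hundreds, thousands, or even tens of thousands of genes has changed both the quality and the quantity of the information they face. MEDLINE is the most important literature database for clinicians and biomedical researchers, and its more than ten million articles provide a wealth of valuable information. However, tens of thousands of articles and tens of thousands of genes pose a serious challenge to human cognitive capacity. We therefore urgently need information extraction techniques to help digest this vast body of biological literature. For clinical staff, the relationship between genes and diseases is the most pressing information to obtain, so this study applies information extraction techniques to find gene-disease relationships in the literature. The study focuses on how to build probabilistic models of genes and diseases; we built two such models and compare their strengths and weaknesses.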

Keywords

gene; disease; probability

Parallel Abstract

With the development of biomolecular technology, more and more information is being derived from genome research. In addition, the introduction of the microarray allows genome-wide patterns of gene expression to be studied, giving scientists the opportunity to study the function of genes. At the same time, functional genomic research has had a great impact on clinicians, who usually study a single gene or study disease at the biochemical level. Traditionally, MEDLINE has been the major resource for clinicians' research. Recently, the explosive growth of genomics-related research has become too complex for clinicians to follow. For example, a single disease may be associated with more than ten thousand articles and hundreds of genes. It is almost impossible for clinicians to digest this knowledge. There is therefore an urgent need for computational tools that help clinicians observe gene-disease relationships. In this research, we focus on constructing probabilistic models of the gene-disease relationship. By using two models to represent the knowledge in a biomedical literature database, we compare the two models in terms of system performance and precision.

Parallel Keywords

gene; disease; probability

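Cited By

楊正銘 (2004). An auxiliary system for disease classification using text mining techniques: the case of admission and discharge medical record summaries [Master's thesis, Taipei Medical University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0007-1704200714571381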
