Haplotype Inference for Diploid Organisms: An Approach Based on Distance-Based Phylogenetic Trees

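Haplotypes play an important role in human genetics and are closely tied to the study of genetic diseases. One way to obtain a genetic map of the human haploid chromosome set is to infer haplotype information computationally from available genotype data. This problem is called haplotype inference for diploid organisms. The key difficulty, however, remains finding haplotype information that actually meets biological requirements. Many problem models have been proposed so far, such as HIPP, MPPH, and MRHC. These models interpret haplotype inference from different angles, yet they share one feature: all of them analyze character-based data. In contrast, this paper proposes a problem model that analyzes distance relationships, which we name DMPH. We prove that DMPH is APX-hard and propose a solution method for the model that combines the neighbor-joining method with a genetic algorithm. Finally, we use simulation experiments to verify the soundness of the DMPH model and to examine the performance of its computational method.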

Keywords

haplotype; genotype; haplotype inference; character-based data; distance-based data; APX-hard; neighbor-joining method; genetic algorithm

English Abstract

Haplotypes play an important role in human genetics, especially in disease association studies. To develop a full haplotype map of the human genome despite current technological limitations, one approach is to infer haplotype data computationally from genotype data. This problem is known as haplotype inference. Nevertheless, finding the haplotypes that are biologically authentic remains the key difficulty, and perhaps the bottleneck. Many models have been proposed so far, such as HIPP, MPPH, and MRHC. They approach the haplotype inference problem from different viewpoints, but all of them work with character-based data. In contrast, this paper proposes a model, named DMPH, that is formulated in a distance-based manner. We prove that the DMPH problem is APX-hard and provide a solution method that combines the neighbor-joining method with a genetic algorithm. Finally, we present experimental results that test and verify both the proposed model and the method.
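To make the inference problem concrete, the following minimal sketch enumerates every haplotype pair that is consistent with a single genotype. It is an illustration only and not the DMPH method: the 0/1/2 site encoding, the function names `resolving_pairs` and `hamming`, and the use of Hamming distance as an example of a distance between haplotypes are all assumptions made for this sketch.

```python
from itertools import product


def resolving_pairs(genotype):
    """Return all unordered haplotype pairs consistent with a genotype.

    Assumed encoding (not taken from the paper): 0 and 1 mark homozygous
    sites, 2 marks a heterozygous site.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 2]
    if not het_sites:
        # No ambiguity: both haplotypes equal the genotype itself.
        h = tuple(genotype)
        return [(h, h)]

    pairs = []
    # Fix the first heterozygous site of h1 to 0 so that each unordered
    # pair is generated exactly once.
    for bits in product((0, 1), repeat=len(het_sites) - 1):
        h1 = list(genotype)
        h2 = list(genotype)
        for site, b in zip(het_sites, (0,) + bits):
            h1[site] = b
            h2[site] = 1 - b
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


def hamming(a, b):
    """Number of sites at which two haplotypes differ (example distance)."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    for h1, h2 in resolving_pairs([0, 2, 1, 2]):
        print(h1, h2, hamming(h1, h2))
```

A genotype with k heterozygous sites has 2^(k-1) candidate pairs under this encoding, which is why a model is needed to choose among them; a character-based model would score the haplotype sequences directly, whereas a distance-based model would work from pairwise distances such as the one computed here.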

English Keywords

haplotypes; genotype; haplotype inference; HIPP; MPPH; MRHC; character-based data; distance-based; APX-hard; neighbor-joining; genetic algorithm