Triplex-forming oligonucleotides (TFOs) have been used extensively in gene therapy in recent years. A TFO is a short segment of at least thirteen consecutive purines (A or G) or at least thirteen consecutive pyrimidines (C or T) in a DNA sequence. Such segments are not easy to observe in a DNA sequence, so determining the probability of finding a TFO in a given DNA sequence has attracted considerable research attention. Several methods for evaluating this probability have been proposed in past years, but most of them lack accuracy. In this paper we propose a coin-tossing model to describe the occurrence of a TFO in a randomly selected DNA sequence. The expectation and variance of the number of base pairs required to observe a TFO are calculated exactly. The coin-tossing model has the useful property of being approximately memoryless. Hence, the number of tosses needed until a specific pattern appears has an approximately exponential distribution. Consequently, the probability of finding a TFO in a given DNA sequence can be approximated using an exponential distribution. Simulations are also carried out to confirm this result.
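The coin-tossing view can be illustrated with a minimal sketch. It assumes, beyond what is stated above, that bases are independent and that each base is a purine or a pyrimidine with probability 1/2 (a fair coin). Under that assumption, the expected number of tosses until k identical outcomes occur in a row is the standard value 2^k - 1, and the tail probability of the waiting time can be compared with exp(-n/mean). The function names are hypothetical and the code is not the paper's own derivation or simulation.

```python
import math
import random
from statistics import mean


def expected_wait(k: int) -> int:
    """Expected number of fair-coin tosses until k identical outcomes in a row.

    Standard result for a fair coin (an assumption of this sketch): 2**k - 1.
    """
    return 2 ** k - 1


def simulate_wait(k: int, rng: random.Random) -> int:
    """Toss a fair coin until a run of k identical outcomes; return the toss count.

    One toss stands for one base: True = purine, False = pyrimidine.
    """
    tosses = 0
    run = 0
    prev = None
    while run < k:
        cur = rng.random() < 0.5
        tosses += 1
        # Extend the current run if the outcome repeats, otherwise restart it.
        run = run + 1 if cur == prev else 1
        prev = cur
    return tosses


def empirical_tail(samples, n: int) -> float:
    """Fraction of simulated waiting times that exceed n."""
    return sum(s > n for s in samples) / len(samples)


def exponential_tail(n: int, k: int) -> float:
    """Exponential approximation to P(waiting time > n), using the exact mean."""
    return math.exp(-n / expected_wait(k))


if __name__ == "__main__":
    k = 13  # minimum TFO length used in the text
    rng = random.Random(0)
    samples = [simulate_wait(k, rng) for _ in range(200)]
    n = expected_wait(k)
    print("exact mean:", expected_wait(k))
    print("simulated mean:", mean(samples))
    print("empirical P(T > n):", empirical_tail(samples, n))
    print("exponential approximation:", exponential_tail(n, k))
```

With a small number of samples the simulated mean fluctuates noticeably, because the standard deviation of the waiting time is of the same order as its mean; increasing the sample count tightens the comparison at the cost of run time.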