組合群試問題的研究

所謂傳統的群式問題(classical group testing problem)，是要從含有正克隆(clones)及負克隆的群體中識別出正的克隆。其所使用的工具是群試驗(group tests)，而如何減少群試驗的使用量是主要被重視的問題。應用當中也經常衍生出其它型態的克隆，比如抑制型克隆(inhibitor)。它能攪擾正克隆的特性，使其不能發揮正常的功用，因此一個含有正克隆的群試驗可能無法顯現正克隆的存在，若它同時也含有抑制克隆。此外，在NDA篩選的環境中，有些特定的克隆能組合出具有相當特性的複合體，我們稱它為克隆複合體(complex)；因此，相對於克隆模型，複合模型探討的是正複合體的識別。逐步演算法(sequential algorithm)及非調整型演算法(nonadaptive algorithm)是兩個普遍的群試演算法。前者當中的試驗是逐一進行的，且下一個試驗可依據之前進行過的試驗的結果而去設計；後者當中所含的試驗是同時進行的，也就是只依據問題給定的訊息及假設去設計所有的試驗，且使其能達到識別所有正元素的能力。一般而言，逐步演算法所涉及的試驗量比非調整型演算法的來得少，但其完成所有試驗所需的時間比非調整型的來得多。群試抑制模型(the inhibitor model)指的是含有抑制型元素的群試模型；而群試複合模型(the complex model)是指所探討的問題是建立在複合體上。在群試研究的文獻中，這兩個模型已各別有完善的發展。在此論文中，我們提出抑制元素和複合體共存的群試環境(the inhibitor complex model)，並專攻於非調整型演算法的設計。我們採用新提出的概念「覆蓋性(inclusiveness)」去設計演算法。「分離性(disjunctness)」是另一個常被使用的概念，我們證明一個結合此兩種概念的設計能顯著地改善譯解試驗結果的時間。除了探討正複合體的識別，我們進一步從事複合體分類的工作，也就是設計演算法去區分所有的複合體(正的、負的及抑制型的)。(d,r;z]-分離性群試設計((d,r;z]-disjunct pooling design)是一個發展良好的演算法設計工具，我們證明此工具足以用來處理複合體的分類工作而且也結合了快速的譯解程序。總結下來，(H,d;z]-分離性、(d,r;z]-分離性和(h,r;y]-覆蓋性是三種主要用來設計非調型演算法的工具。我們在此也討論它們的建構方法及所需試驗量的下界。最後，我們探討推廣化的傳統群試問題--圖形重建問題。群試問題可被視為一種搜尋式問題的範例，且通常會在正元素的總量上做一個假設。而圖形的重建是另一種搜尋式的問題，其主要工作是要識別出隱藏的圖形，而已知條件是它是眾多可能圖形中的其中一個。其上用來作為識別的工具是一種類似於群試驗是詢問：一個詢問給的訊息是一個點的子集合是否包含隱藏圖形上的某個邊的所有的點。圖形重建的問題已有一些結果，比如隱藏的圖形是一個漢米爾頓圈(Hamiltonian cycle)、配對圖(matching)、星圖(star)或局部完全圖(clique)。針對這些問題，我們利用一些策略去改進這些結果，例如仿射平面法(the affine plane method)及配對結構法(maximal matching method)。

關鍵字

群試理論；群試設計；抑制及複合體群試模型；非調整型演算法； DNA序列重組；圖的重建及搜尋

並列摘要

In the classical group testing problem, there is a set N of n clones, each of which is either positive or negative. The main task of the problem is to identify all positive ones by group tests, and in identifying all positive clones, minimizing the number of group tests is the main issue. Motivated by applications, many studies have introduced a third type of clones called “inhibitors” whose effect is in a sense to obscure the positive clones in pools. Furthermore, in many applications, a subset of clones (rather than a single clone), called a complex, can induce a positive effect. There are two general types of group testing algorithms: sequential and nonadaptive. A sequential algorithm conducts the tests one by one where the outcomes of all previous tests can be treated as a reference to the later one, while a nonadaptive algorithm specifies all tests in advance and thus all tests can be conducted simultaneously. Generally, sequential algorithms require fewer number of tests than nonadaptive ones, but performing all tests in a sequential algorithm spends more time than performing all tests in a nonadaptive one. The group testing model which takes inhibitors (respectively complexes) into consideration is referred to as an inhibitor model (respectively a complex model). These two models have been well studied in the group testing liter ature. In this thesis, we first study group testing problems in a new pooling design environment by allowing the coexistence of inhibitors and complexes and devote our attention to nonadaptive algorithms. To identify positive items, we attach a novel property “inclusiveness” to a design. This property and a well-studied property “disjunctness” lead to a significant improvement in the decoding procedure. In addition to the identification problem where only positive items are identified, we also attempt to classify all items. We prove that the well-studied “(d, r; z]-disjunct matrices” are sufficient for the classification problems and associated with fast decoding procedures. In the identification and classification problems, (H : d; z)-disjunct, (d, r;z]-disjunct, and (d, r; z]-disjunct and (h, r;y]-inclusive with z > y are three main properties of matrices that are employed as nonadaptive pooling designs. We study their constructions and the lower bounds on the number of rows (tests). Finally, we study the graph reconstruction problem which is a generalization of the classical combinatorial group testing problem. A group testing problem is a search paradigm where it is usually assumed that there are at most d positive items among given items. A graph reconstruction problem is to reconstruct a hidden graph G from a given family of graphs by asking queries of the form “Whether a set of vertices induces an edge of G”. Reconstruction problems on families of Hamiltonian cycles, matchings, stars and cliques on n vertices have been studied where algorithms of using at most 2nlgn, (1+o(1))(nlgn), 2n and 2n queries were proposed, respectively. We exploit some strategies such as affine plane method to improve them to (1+o(1))(nlgn), (1+o(1))(1/2nlgn), n+2lgn and n+lgn, respectively.

並列關鍵字

Group Testing ； Pooling Design ； Inhibitor Complex Model ； Nonadaptive Algorithm ； Contig Sequencing ； Graph Search

參考文獻

[1] N. Alon and V. Asodi, Learning a hidden subgraph, SIAM J. Discrete

hidden matching, SIAM J. Comput. 33 (2004) 487-501.

[3] I. Anderson, Combinatorial Designs: Construction Methods, Ellis Hor-

[4] D. Angluin and J. Chen, Learning a hidden hypergraph, J. Mach. Learn.

[5] D. Angluin and J. Chen, Learning a hidden graph using O(log n) queries

國際替代計量

組合群試問題的研究

全文下載

主題瀏覽