Metagenomics experiments often infer the composition of microbial communities by sequencing 16S and 18S rRNA. Taxonomic assignment is a fundamental step in such studies. The accuracy and other metrics that previous studies used to measure the performance of existing taxonomic assignment methods have two major problems: they are based on sequence counts, and they measure error in a binary way. These problems make the evaluation results misleading and less informative. In this study, we investigate the adverse effects of the two problems and then propose new performance metrics, Average Taxonomy Distance (ATD) and ATD_by_Taxa, together with the ATD plot, to address them. By comparing the evaluation results under the old metrics with those under our new metrics, we find that the new metrics give more informative, comparable, and robust results across three test data sets.
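The two problems can be illustrated with a small sketch. This abstract does not state the formulas for ATD or ATD_by_Taxa, so the code below is not the definition of either metric; it only assumes, for illustration, that the distance between a true and a predicted lineage is the number of ranks below their deepest shared rank. All function names, the three-rank lineages, and the toy data are hypothetical.

```python
from collections import defaultdict


def taxonomy_distance(true_lineage, predicted_lineage):
    """Ranks of the true lineage left unmatched after the shared prefix.

    Assumed distance for illustration only: 0 means the full lineage
    matches, len(true_lineage) means even the top rank is wrong.
    """
    shared = 0
    for true_taxon, predicted_taxon in zip(true_lineage, predicted_lineage):
        if true_taxon != predicted_taxon:
            break
        shared += 1
    return len(true_lineage) - shared


def binary_error_rate(pairs):
    """Fraction of sequences whose predicted lineage is not an exact match."""
    return sum(1 for true, pred in pairs if true != pred) / len(pairs)


def mean_distance_per_sequence(pairs):
    """Graded error averaged over sequences (still weighted by sequence counts)."""
    return sum(taxonomy_distance(true, pred) for true, pred in pairs) / len(pairs)


def mean_distance_per_taxon(pairs):
    """Graded error averaged within each true taxon, then across taxa."""
    by_taxon = defaultdict(list)
    for true, pred in pairs:
        by_taxon[true].append(taxonomy_distance(true, pred))
    per_taxon = [sum(d) / len(d) for d in by_taxon.values()]
    return sum(per_taxon) / len(per_taxon)


# Toy data: one abundant taxon that is always assigned correctly and one
# rare taxon that is always assigned wrongly, once nearly and once badly.
A = ("Bacteria", "Firmicutes", "Bacilli")
B = ("Bacteria", "Proteobacteria", "Gammaproteobacteria")
B_NEAR = ("Bacteria", "Proteobacteria", "Alphaproteobacteria")
B_FAR = ("Archaea", "Euryarchaeota", "Methanobacteria")

pairs = [(A, A)] * 8 + [(B, B_NEAR), (B, B_FAR)]

print(binary_error_rate(pairs))           # near miss and far miss count the same
print(mean_distance_per_sequence(pairs))  # graded, but dominated by taxon A
print(mean_distance_per_taxon(pairs))     # each taxon contributes equally
```

In this toy case the binary error treats the near miss and the far miss identically, and both sequence-weighted numbers are dominated by the abundant taxon, whereas the per-taxon average exposes that the rare taxon is never assigned correctly.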