#Editor information
熊荣川
明湖实验室
xiongrongchuan@126.com
http://blog.sciencenet.cn/u/Bearjazz
Phylogenomics increasingly involves the screening of thousands of phylogenetic trees using specialised sorting algorithms that assign each tree a classification based on features of interest, e.g., strongly supported monophyletic relationships among the taxa in question (i.e., the “target” taxa). Here, phylogenetic trees in flat files (e.g., Newick format) are sorted (i.e., classified) based on text-pattern matching. This principle is not to be confused with the tree sort process, common in computer science, of rearranging binary data elements in an ordered structure (Knuth, 1971). Currently available utilities, e.g., PhyloSort (Moustafa & Bhattacharya, 2008) and SICLE (DeBlasio & Wisecaver, 2013), screen a set of phylogenetic trees for the presence of clades that unite a set of user-defined target taxa (as indicated in tip labels, i.e., leaves, on the tree) based on clade support that exceeds a defined threshold, and sort these trees accordingly; SICLE specifically identifies all nearest neighbours (sister clades) of a single user-defined target. However, these tools do not consider the proportion of non-target leaves and overall taxon composition in a tree during the sorting process. Moreover, tools implemented in a graphical user interface, e.g., PhyloSort (Moustafa & Bhattacharya, 2008), do not allow for automation of multiple analyses, thus limiting scalability.
| 基因组系统学越来越多地涉及使用专门的算法筛选成千上万的系统发生树,这些分类算法根据感兴趣的特征(例如,强烈支持单系关系的 “目标”类群)为系统发生树分配一个类别。在这里,纯文本文件(如newick格式)中的系统发生树根据文本模式匹配进行排序(即分类)。这一原则不应与计算机科学中常见的以有序结构重新排列二进制数据元素的树排序过程相混淆(Knuth, 1971)。目前可用的实用程序,例如,PhyloSort(Moustafa & Bhattacharya, 2008)和SICLE(DeBlasio & Wisecaver, 2013),根据超过定义阈值的支系支持率,筛选出一组系统进化树,以确定是否存在一组用户定义的目标分类群(系统发育树末梢单元),并对其进行相应排序。SICLE(DeBlasio & Wisecaver, 2013)还能够专门识别由用户定义的单个目标的所有最近邻居(姐妹群)。但是,这些工具在排序过程中不考虑树中非目标末梢的比例和总体分类单元组成。此外,在图形用户界面中实现的工具(例如,PhyloSort)(Moustafa & Bhattacharya, 2008)不允许实现自动化复合分析,从而限制了可扩展性(规模性)。 |
Stephens T G, Bhattacharya D, Ragan M A, et al. PhySortR: A fast, flexible tool for sorting phylogenetic trees in R[J]. PeerJ, 2016, 4(5):e2038. |
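
The screening step described in the passage above (look for a clade that unites only target taxa and whose support exceeds a threshold, then sort the tree accordingly) can be illustrated with a small Python sketch. This is not the code of PhyloSort, SICLE, or PhySortR; it is a minimal illustration under several assumptions: trees are rooted Newick strings with unquoted labels, clade support is stored as the internal node label, and target taxa are recognised by a regular expression on the tip label. The function names, the `Alga_` prefix, and the threshold of 90 are all hypothetical.

```python
import re

# Assumption: labels are unquoted and contain none of ( ) , : ;
_LABEL = re.compile(r"[^(),:;]*")
_LENGTH = re.compile(r":[^(),;]*")


def parse_newick(newick):
    """Parse a simple Newick string into nested {'label', 'children'} dicts."""
    s = newick.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if pos < len(s) and s[pos] == "(":
            pos += 1  # consume "("
            while True:
                children.append(parse_node())
                if pos >= len(s):
                    raise ValueError("unbalanced parentheses")
                if s[pos] == ",":
                    pos += 1
                elif s[pos] == ")":
                    pos += 1
                    break
                else:
                    raise ValueError(f"unexpected character at {pos}")
        label = _LABEL.match(s, pos).group(0)
        pos += len(label)
        length = _LENGTH.match(s, pos)
        if length:  # skip the branch length, it is not needed here
            pos += len(length.group(0))
        return {"label": label.strip(), "children": children}

    return parse_node()


def leaves(node):
    """Return the tip labels below a node."""
    if not node["children"]:
        return [node["label"]]
    return [tip for child in node["children"] for tip in leaves(child)]


def has_supported_target_clade(newick, target_pattern, min_support=90.0):
    """True if some clade holds only target tips (at least two) with
    support >= min_support. Unlabelled clades are treated as unsupported."""
    target = re.compile(target_pattern)

    def visit(node):
        if not node["children"]:
            return False
        tips = leaves(node)
        if len(tips) >= 2 and all(target.search(t) for t in tips):
            try:
                if float(node["label"]) >= min_support:
                    return True
            except ValueError:
                pass  # no numeric support value on this node
        return any(visit(child) for child in node["children"])

    return visit(parse_newick(newick))


def sort_trees(trees, target_pattern, min_support=90.0):
    """Classify named Newick strings into two bins."""
    bins = {"target_clade": [], "other": []}
    for name, newick in trees.items():
        hit = has_supported_target_clade(newick, target_pattern, min_support)
        bins["target_clade" if hit else "other"].append(name)
    return bins


EXAMPLE_TREES = {
    "t1": "((Alga_sp1:0.1,Alga_sp2:0.2)95:0.3,(Fungus_sp1:0.1,Plant_sp1:0.2)60:0.1);",
    "t2": "((Alga_sp1:0.1,Fungus_sp1:0.2)95:0.3,(Alga_sp2:0.1,Plant_sp1:0.2)99:0.1);",
    "t3": "((Alga_sp1:0.1,Alga_sp2:0.2)40:0.3,Plant_sp1:0.5);",
}
```

Calling `sort_trees(EXAMPLE_TREES, r"^Alga_")` places `t1` in the `target_clade` bin and the other two trees in `other`: in `t2` the target tips are not united in one clade, and in `t3` the uniting clade has support below the threshold. Like the tools discussed above, this sketch ignores the proportion of non-target leaves and the overall taxon composition of each tree, and it treats the tree as rooted, which real unrooted gene trees would not justify without further handling.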