Shared by the Gravedigger of the Dismal Science http://blog.sciencenet.cn/u/Bearjazz


Daily Translation 20190424

2019-4-24 07:06 | Personal category: Translations | System category: Research notes | Newick format, screening, specific clades


#Editor information

Xiong Rongchuan

Minghu Laboratory

xiongrongchuan@126.com

http://blog.sciencenet.cn/u/Bearjazz

Phylogenomics increasingly involves the screening of thousands of phylogenetic trees using specialised sorting algorithms that assign phylogenetic trees a classification based on features of interest, e.g., strongly supported monophyletic relationships of taxa in question (i.e., the “target” taxa). Here, phylogenetic trees in flat files (e.g., Newick format) are sorted (i.e., classified) based on text-pattern matching. This principle is not to be confused with the tree sort process, common in computer science, of rearranging binary data elements in an ordered structure (Knuth, 1971). Currently available utilities, e.g., PhyloSort (Moustafa & Bhattacharya, 2008) and SICLE (DeBlasio & Wisecaver, 2013) screen a set of phylogenetic trees for the presence of clades that unite a set of user-defined target taxa (as indicated in tip labels, i.e., leaves, on the tree) based on clade support that exceeds a defined threshold, and sort these trees accordingly; SICLE (DeBlasio & Wisecaver, 2013) specifically identifies all nearest neighbours (sister clades) of a single user-defined target. However, these tools do not consider the proportion of non-target leaves and overall taxon composition in a tree during the sorting process. Moreover, tools implemented in a graphical user interface e.g., PhyloSort (Moustafa & Bhattacharya, 2008) do not allow for automation of multiple analyses, thus limiting scalability.

 

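Phylogenomics increasingly relies on specialised algorithms to screen thousands of phylogenetic trees, assigning each tree a class according to features of interest, for example strongly supported monophyly of the "target" taxa. Here, phylogenetic trees stored in plain-text files (e.g., Newick format) are sorted (i.e., classified) by text-pattern matching. This should not be confused with the tree sort procedure common in computer science, which rearranges binary data elements in an ordered structure (Knuth, 1971). Currently available utilities such as PhyloSort (Moustafa & Bhattacharya, 2008) and SICLE (DeBlasio & Wisecaver, 2013) screen a set of phylogenetic trees for clades that unite a set of user-defined target taxa (identified by tip labels, i.e., the leaves of the tree), require clade support above a defined threshold, and sort the trees accordingly. SICLE (DeBlasio & Wisecaver, 2013) can additionally identify all nearest neighbours (sister clades) of a single user-defined target. However, these tools do not take into account the proportion of non-target leaves or the overall taxon composition of a tree during sorting. In addition, tools implemented in a graphical user interface, e.g., PhyloSort (Moustafa & Bhattacharya, 2008), do not allow multiple analyses to be automated, which limits scalability.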

Stephens T G, Bhattacharya D, Ragan M A, et al. PhySortR: A fast, flexible tool for sorting phylogenetic trees in R[J]. PeerJ, 2016, 4(5):e2038.
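
To make the idea of sorting Newick trees by text-pattern matching concrete, here is a minimal Python sketch. It is not the algorithm used by PhyloSort, SICLE, or PhySortR. It assumes that support values are stored as internal node labels, that tip labels contain no quotes, parentheses, or whitespace, and that each tree is treated as rooted exactly as written. The function names (`parse_newick`, `has_supported_clade`, `sort_trees`), the tip labels, and the threshold of 90 are all hypothetical and chosen only for illustration.

```python
import re

def parse_newick(text):
    """Parse a simple Newick string into nested dicts {'label', 'children'}."""
    text = re.sub(r"\s+", "", text)          # assumption: labels hold no whitespace
    tokens = re.findall(r"[(),;]|[^(),;]+", text)
    pos = 0

    def node():
        nonlocal pos
        children = []
        if tokens[pos] == "(":
            pos += 1                          # consume "("
            children.append(node())
            while tokens[pos] == ",":
                pos += 1                      # consume ","
                children.append(node())
            pos += 1                          # consume ")"
        label = ""
        if pos < len(tokens) and tokens[pos] not in {"(", ")", ",", ";"}:
            label = tokens[pos].split(":")[0]  # drop the branch length
            pos += 1
        return {"label": label, "children": children}

    return node()

def leaves(n):
    """Return the set of tip labels below node n."""
    if not n["children"]:
        return {n["label"]}
    return set().union(*(leaves(c) for c in n["children"]))

def has_supported_clade(newick, is_target, min_support):
    """True if some internal node holds only target tips (at least two)
    and carries a numeric support label >= min_support."""
    stack = [parse_newick(newick)]
    while stack:
        n = stack.pop()
        stack.extend(n["children"])
        if not n["children"]:
            continue                          # a tip, not a clade
        try:
            support = float(n["label"])
        except ValueError:
            continue                          # no support value on this node
        tips = leaves(n)
        if len(tips) >= 2 and support >= min_support and all(is_target(t) for t in tips):
            return True
    return False

def sort_trees(trees, target_pattern, min_support):
    """Classify named Newick strings by whether they contain a supported target clade."""
    target = re.compile(target_pattern)
    result = {"match": [], "no_match": []}
    for name, nwk in trees.items():
        ok = has_supported_clade(nwk, lambda t: bool(target.search(t)), min_support)
        result["match" if ok else "no_match"].append(name)
    return result

# Hypothetical example data: tips starting with "Rhodo_" are the targets.
trees = {
    "gene1": "((Rhodo_a:0.1,Rhodo_b:0.2)95:0.05,(Viridi_a:0.1,Viridi_b:0.3)80:0.1);",
    "gene2": "((Rhodo_a:0.1,Viridi_a:0.2)99:0.05,(Rhodo_b:0.1,Viridi_b:0.3)60:0.1);",
    "gene3": "((Rhodo_a:0.1,Rhodo_b:0.2)40:0.05,Viridi_a:0.3);",
}
print(sort_trees(trees, r"^Rhodo_", 90))
# prints {'match': ['gene1'], 'no_match': ['gene2', 'gene3']}
```

Because the sketch only accepts clades made up entirely of target tips, it ignores the proportion of non-target leaves, which is the limitation of existing tools that the passage points out. Running it from a script rather than a graphical interface is what would let many such analyses be batched.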

 



https://blog.sciencenet.cn/blog-508298-1175167.html
