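Phylogenetic trees are among the most important tools for addressing questions in ecology and evolution. The difficulty in building large phylogenies lies in the volume and structure of publicly available DNA sequence data. Genomic data and DNA barcode data are two distinct data types, and at present they are not easily integrated for phylogenetic reconstruction. The partition between the two data types is also frequently overlooked by researchers. Using a systematic bioinformatics approach, Dr. Douglas Chesters reconstructed the first species-level phylogeny of the insects (Class Insecta). The work built separate data matrices for nuclear transcriptomes, mitochondrial genomes and DNA barcodes, and then reconstructed the tree by hierarchical analysis down to the species level. The resulting tree currently covers 760 families, 13,865 genera and 49,358 species. The basal part of this species-level insect tree is similar to previously published work on sections of the tree. The work also evaluated the utility of the insect species tree by comparing several DNA-based taxonomic assignment methods. The result: DNA-based taxonomic assignment is more accurate when a large tree of life is used as the reference. The paper reports a solution to the technical difficulty of integrating different types of sequence data into a supermatrix. This will greatly facilitate research on insect species diversification and allow the diversity of studied communities to be described better using DNA.
This work continues and extends earlier work on species delimitation in large databases, and related content was presented as an invited talk at the first National Biosystematics Forum. Dr. Douglas Chesters is currently an associate professor in the research group on the evolution of functional insect groups at the Institute of Zoology, Chinese Academy of Sciences. He independently leads one General Program project of the National Natural Science Foundation of China and one project of its fund for foreign young scientists, and is supported by the Chinese Academy of Sciences "International Talent Program".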
Open-source software: A Linux implementation of the protocol described here is made freely available under the GNU General Public License at https://sourceforge.net/projects/sophi/, and in supplementary material.
Paper: Construction of a Species-Level Tree-of-Life for the Insects and Utility in Taxo.pdf
ACCEPTED
Construction of a Species-Level Tree-of-Life for the Insects and Utility in Taxonomic Profiling
Syst Biol (2016) syw099.
DOI: https://doi.org/10.1093/sysbio/syw099
While comprehensive phylogenies have proven an invaluable tool in ecology and evolution, their construction is made increasingly challenging both by the scale and structure of publicly available sequences. The distinct partition between gene-rich (genomic) and species-rich (DNA barcode) data is a feature of data that has been largely overlooked, yet presents a key obstacle to scaling supermatrix analysis.
I present a phyloinformatics framework for draft construction of a species-level phylogeny of insects (Class Insecta). Matrix-building requires separately optimized pipelines for nuclear transcriptomic, mitochondrial genomic, and species-rich markers, whereas tree-building requires hierarchical inference in order to capture species-breadth while retaining deep-level resolution. The phylogeny of insects contains 49358 species, 13865 genera, 760 families, and 31 orders. Deep-level splits largely reflected previous findings for sections of the tree that are data rich or unambiguous, such as inter-ordinal Endopterygota and Dictyoptera, the recently evolved and relatively homogeneous Lepidoptera, Hymenoptera, Brachycera (Diptera) and Cucujiformia (Coleoptera). However, analysis of bias, matrix construction and gene-tree variation suggests confidence in some relationships (such as in Polyneoptera) is less than has been indicated by the matrix bootstrap method. To assess the utility of the insect tree as a tool in query profiling, several tree-based taxonomic assignment methods are compared. Using mined test datasets of known species membership, a tendency is observed for greater accuracy of species-level assignments where using a fixed, comprehensive tree-of-life in contrast to methods generating smaller de novo reference trees.
Described herein is a solution to the discrepancy in the way data is fit into supermatrices. The resulting tree facilitates wider studies of insect diversification and application of advanced descriptions of diversity in community studies, amongst other presumed applications.
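For readers unfamiliar with the term, the basic idea of a supermatrix can be sketched in a few lines of Python. This is only an illustration of generic concatenation, not the pipeline published in the paper or in the sophi software: the function name `build_supermatrix`, the locus names, the taxon names and the toy sequences are all hypothetical. Each locus is assumed to be already aligned, and taxa lacking a locus are padded with a missing-data character, which is where the gene-rich versus species-rich imbalance shows up as large blocks of `?`.

```python
from typing import Dict, List, Tuple

Alignment = Dict[str, str]  # taxon name -> aligned sequence for one locus


def build_supermatrix(
    loci: Dict[str, Alignment], missing: str = "?"
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Concatenate per-locus alignments into one matrix (illustrative only).

    Returns the concatenated sequences per taxon and a list of
    (locus, start, end) partition ranges, 1-based and inclusive.
    """
    # Union of all taxa seen in any locus, in a stable order.
    taxa = sorted({taxon for aln in loci.values() for taxon in aln})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1
    for name, aln in loci.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"locus {name!r} is empty or not aligned")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa without this locus receive a block of missing data.
            rows[taxon].append(aln.get(taxon, missing * length))
        partitions.append((name, start, start + length - 1))
        start += length
    matrix = {taxon: "".join(blocks) for taxon, blocks in rows.items()}
    return matrix, partitions


# Hypothetical toy data: one barcode locus sampled for both species,
# one nuclear locus sampled for only one of them.
toy_loci = {
    "COI": {"sp1": "ACGT", "sp2": "ACGA"},
    "nuc1": {"sp1": "GG"},
}
toy_matrix, toy_partitions = build_supermatrix(toy_loci)
```

In this toy case `sp2` has no sequence for `nuc1`, so its row ends in missing-data characters. With tens of thousands of barcode-only species and a few hundred gene-rich taxa, most of such a matrix would be missing data, which is the structural problem the abstract refers to.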
Keywords: Data Integration, Data Mining, Insects, Phylogenomics, Phyloinformatics, Tree of Life