zhuchaodong's personal blog http://blog.sciencenet.cn/u/zhuchaodong



2023-5-22 11:45 | Personal category: Paper summaries | System category: Paper exchange

Phylogeny-based assignment of functional traits to DNA barcodes

Original article by Douglas Chesters, 昆虫分类区系专委会

In 2003, DNA barcoding was proposed as a means to profile species-level diversity (e.g. MOTU) of invertebrates, and to assign taxonomic names. Since there are many easy-to-use software tools for building phylogenies from molecular sequences, it also became common to report phylogenies for sets of DNA barcodes. Thus, three types of diversity (species, taxonomic and phylogenetic) became standard measures for studies of alpha and beta diversity in invertebrates. During the same period in plant sciences, functional traits moved center stage as a way to move beyond simple structural measures of diversity and to understand the mechanisms behind community structure. Oddly, there has been very little intersection of these two developments, except in microbiology, where tools and databases were developed for assigning metabolic trait profiles to 16S metabarcodes. Tingting Xie et al. address this gap with two major aims (fig. 1). First, they propose approaches for assigning traits to DNA barcodes. Second, they propose a practical pipeline for incorporating traits into DNA barcode based profiling.

To achieve the first aim, the authors developed three approaches to DNA-based assignment of traits from references to queries. Each was built around existing tools for DNA barcoding, phylogenetics and trait modelling. The first method they called ‘Phylogenetic Assignment’. Here, a phylogeny is constructed that incorporates reference members with known traits, the traits are modelled on the phylogeny, and the model is used to predict the states of queries. The second method they called ‘Blast Neighborhood’; it is equivalent to an established distance-based approach to DNA-based taxonomic assignment. For Blast Neighborhood, two Blast searches are conducted: the first sets the distance threshold for the neighborhood, which is the distance from the query to the nearest reference, and the second retrieves all references within this threshold. The states present in this neighborhood are then assigned to the query. Finally, as a third method, the authors conduct a simple Blast top-hit search and assign the states of the top hit to the query. The three approaches are assessed using a ‘Leave-1-Out’ strategy, which is often used for assessing DNA barcoding accuracy.
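The two distance-based methods can be illustrated with a small Python sketch. This is not the authors' code: it assumes the distances from one query to each reference have already been computed (for example from Blast results), and the function names, the `tolerance` parameter and the toy reference ids are all hypothetical, introduced only for illustration.

```python
from typing import Dict, Set


def top_hit_states(distances: Dict[str, float], ref_states: Dict[str, str]) -> Set[str]:
    """Top hit: return the state of the single nearest reference.

    Ties in distance are broken by reference id (an arbitrary choice made here).
    """
    nearest = min(distances, key=lambda ref: (distances[ref], ref))
    return {ref_states[nearest]}


def neighborhood_states(
    distances: Dict[str, float],
    ref_states: Dict[str, str],
    tolerance: float = 0.0,  # assumption: optional slack added to the threshold
) -> Set[str]:
    """Neighborhood: return every state found among references within the threshold.

    The threshold is the distance from the query to its nearest reference.
    """
    threshold = min(distances.values()) + tolerance
    return {ref_states[ref] for ref, dist in distances.items() if dist <= threshold}


# Hypothetical query: two equally close references disagree on sociality.
distances = {"refA": 0.02, "refB": 0.02, "refC": 0.10}
ref_states = {"refA": "solitary", "refB": "social", "refC": "social"}

print(sorted(top_hit_states(distances, ref_states)))       # ['solitary']
print(sorted(neighborhood_states(distances, ref_states)))  # ['social', 'solitary']
```

In this toy case the top hit commits to one state, while the neighborhood reports both states present among the nearest references, which makes the ambiguity visible.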

The three methods required a high-resolution dataset on which comparative tests could be conducted: a test dataset comprising DNA barcodes and traits at the specimen level. To make this dataset, thousands of bees were collected at several sites across China and returned to the lab. The authors measured several morphometric traits from these specimens, in particular body length, inter-tegular distance, head width, wing length and hair length. COI DNA barcodes were then sequenced from the same specimens. This gave a dataset with the resolution necessary to account for intraspecific variation in the traits.

In comparative tests on the specimen-level dataset, the authors report that the rate of successful trait assignment is determined primarily by the genetic distance between the query and the nearest reference. However, the three approaches did not behave identically (fig. 3). Phylogenetic Assignment in particular showed advantageous features: it rarely returned a significant state assignment where success was unlikely, that is, where the distance from query to reference was large. In other words, it has a much lower false positive rate than the other two methods.
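The link between assignment success and distance to the nearest reference can be sketched with a toy Leave-1-Out loop. The sketch below is an illustration under stated assumptions, not the procedure used in the paper: it uses a simple top-hit rule, an uncorrected p-distance on aligned sequences, and invented ten-base sequences and states; all function names are hypothetical.

```python
from typing import List, Optional, Tuple

Record = Tuple[str, str]  # (aligned sequence, trait state)


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def leave_one_out(records: List[Record]) -> List[Tuple[float, bool]]:
    """Treat each record in turn as the query and assign the state of its top hit.

    Returns (distance to nearest reference, whether the assigned state was correct).
    """
    results = []
    for i, (seq, state) in enumerate(records):
        references = [rec for j, rec in enumerate(records) if j != i]
        nearest_seq, nearest_state = min(references, key=lambda rec: p_distance(seq, rec[0]))
        results.append((p_distance(seq, nearest_seq), nearest_state == state))
    return results


def success_rate(outcomes: List[bool]) -> Optional[float]:
    """Fraction of correct assignments, or None when there are no outcomes."""
    return sum(outcomes) / len(outcomes) if outcomes else None


# Invented toy data: four queries with a close relative, one without.
records = [
    ("AAAAAAAAAA", "small"),
    ("AAAAAAAAAT", "small"),
    ("CCCCCCCCCC", "large"),
    ("CCCCCCCCCG", "large"),
    ("GGGGGTTTTT", "large"),  # distant from every other record
]

results = leave_one_out(records)
cutoff = 0.5  # arbitrary split between "near" and "far" queries
near = success_rate([ok for dist, ok in results if dist <= cutoff])
far = success_rate([ok for dist, ok in results if dist > cutoff])
print(near, far)  # 1.0 0.0
```

Here the top-hit rule still returns a state for the distant query, and that state is wrong. This is the kind of assignment that, according to the text above, Phylogenetic Assignment rarely returns as significant.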

With an accurate approach to trait assignment established (Phylogenetic Assignment), the second major aim of the study was to develop a practical, near-term pipeline for incorporating functional traits into DNA-based profiling. For this, the authors propose a species-level reference framework built from published trait records, public DNA barcodes and an integrative phylogeny, chosen because these types of data are widely available. The authors collated thousands of records for a wide variety of functional traits, including morphometric and life history traits. They also mined public DNA barcode data and used a phyloinformatics pipeline to integrate them with backbone phylogenetic information, giving a species-level phylogeny. Using this trait + DNA barcode + phylogeny reference framework (fig. 3), the authors then measured the rates at which the many traits were assigned to a query dataset of Chinese bee DNA barcodes, which had been delineated into MOTUs. The rate of trait assignment varied greatly among trait types. The highest rates of state assignment were observed for conserved life history traits such as sociality, nest location and parasitism, whereas labile morphometric traits were assigned at a much lower rate (fig. 4).

Xie et al. (2023) propose to unite two key themes in ecology that to date have been unlinked: functional traits and DNA barcoding. By leveraging the high throughput in community profiling afforded by DNA barcoding and the high power of functional traits for explaining community composition, this union opens powerful new capabilities for understanding the structure and function of biodiversity patterns.

Tingting Xie, Michael C. Orr, Dan Zhang, Rafael R. Ferrari, Yi Li, Xiuwei Liu, Zeqing Niu, Mingqiang Wang, Qingsong Zhou, Jiasheng Hao, Chaodong Zhu, Douglas Chesters. (2023) Phylogeny-based assignment of functional traits to DNA barcodes outperforms distance-based, in a comparison of approaches. Molecular Ecology Resources.





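With a highly accurate trait assignment method established (the phylogeny-based method), the study went on to propose a practical scheme for incorporating functional traits into DNA-based analysis. To this end, the study built a species-level reference framework from published trait records, public DNA barcodes and an integrated phylogeny. The authors collected thousands of trait records, covering functional traits such as morphology and life history. Mined public DNA barcode data were then merged with a backbone tree through a phyloinformatics pipeline to build a species-level phylogenetic tree. Finally, the combined trait, DNA barcode and phylogeny reference framework was used to assign traits to a dataset of Chinese bee DNA barcodes (fig. 3) and to evaluate the accuracy of the assignments. Accuracy differed considerably among trait types: conserved life history traits, such as sociality, nest location and parasitism, were assigned more accurately than variable morphological traits (fig. 4).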










