TickingClock's personal blog http://blog.sciencenet.cn/u/TickingClock


PNAS: AnchorWave, a new tool for fine-scale alignment of complex genomes

3022 reads | 2022-7-20 15:56 | Personal category: Daily Abstracts | System category: Paper Exchange

AnchorWave: Sensitive alignment of genomes with high sequence diversity, extensive structural polymorphism, and whole-genome duplication

First author: Baoxing Song

First affiliation: Cornell University, USA

Corresponding author: Edward S. Buckler


 Abstract 

Background: Millions of species are currently being sequenced, and their genomes are being compared. Many of them have more complex genomes than model systems and raise novel challenges for genome alignment.


Problem: Widely used local alignment strategies often produce limited or incongruous results when applied to genomes with dispersed repeats, long indels, and highly diverse sequences. Moreover, alignment using many-to-many or reciprocal best hit approaches conflicts with well-studied patterns between species with different rounds of whole-genome duplication.


Software development: Here, we introduce Anchored Wavefront alignment (AnchorWave), which performs whole-genome duplication–informed collinear anchor identification between genomes and performs base pair–resolved global alignment for collinear blocks using a two-piece affine gap cost strategy.


Software testing: This strategy enables AnchorWave to precisely identify multikilobase indels generated by transposable element (TE) presence/absence variants (PAVs). When aligning two maize genomes, AnchorWave successfully recalled 87% of previously reported TE PAVs. By contrast, other genome alignment tools showed low power for TE PAV recall. AnchorWave precisely aligns up to three times more of the genome as position matches or indels than the closest competitive approach when comparing diverse genomes. Moreover, AnchorWave recalls transcription factor–binding sites at a rate of 1.05- to 74.85-fold higher than other tools with significantly lower false-positive alignments.


Conclusion: AnchorWave complements available genome alignment tools by showing obvious improvement when applied to genomes with dispersed repeats, active TEs, high sequence diversity, and whole-genome duplication variation.
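
To make the "two-piece affine gap cost" idea in the abstract concrete, here is a minimal Python sketch. It is not AnchorWave's code, and the penalty values (`open1`, `ext1`, `open2`, `ext2`) are illustrative assumptions rather than the tool's defaults. The point is only that taking the cheaper of two affine pieces keeps short gaps expensive per base while letting very long gaps, such as multikilobase TE insertions, grow slowly in cost.

```python
def single_affine_gap_cost(length, gap_open=4, gap_extend=2):
    """Classic affine penalty: one opening cost plus a per-base extension cost."""
    if length <= 0:
        return 0
    return gap_open + gap_extend * length


def two_piece_affine_gap_cost(length, open1=4, ext1=2, open2=80, ext2=1):
    """Two-piece affine penalty: the cheaper of two affine functions.

    Piece 1 (low open, high extend) wins for short gaps.
    Piece 2 (high open, low extend) wins for long gaps.
    All four parameter values are hypothetical, chosen for illustration.
    """
    if length <= 0:
        return 0
    short_piece = open1 + ext1 * length
    long_piece = open2 + ext2 * length
    return min(short_piece, long_piece)


if __name__ == "__main__":
    for gap_length in (1, 10, 76, 1000, 5000):
        print(
            gap_length,
            single_affine_gap_cost(gap_length),
            two_piece_affine_gap_cost(gap_length),
        )
```

With these assumed values the two pieces cross at a gap length of 76 bases; beyond that, a long indel costs roughly half of what a single affine model would charge, which makes one long gap preferable to many scattered mismatches and short gaps.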




 Summary 

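Millions of species are currently being sequenced, and there is a pressing need to compare their genomes. Many of these species have genomes far more complex than those of model organisms, which poses new challenges for genome alignment. When applied to genomes with dispersed repeats, long insertions or deletions, and highly diverse sequences, most methods based on local alignment tend to give limited or incorrect results. In addition, many-to-many or reciprocal best hit (RBH) alignment approaches produce results that conflict with known patterns of whole-genome duplication (WGD) or polyploidization between species. In this paper, the authors developed a genome alignment tool called AnchorWave. It identifies collinear anchors between genomes in a way that accounts for whole-genome duplication, then aligns the collinear blocks at base-pair resolution using the wavefront alignment algorithm (WFA) and a 2-piece affine gap cost strategy. This strategy precisely identifies multikilobase indels produced by transposable element (TE) presence/absence variants (PAVs). When two maize genomes were aligned, AnchorWave recalled about 87% of the previously reported TE PAVs, whereas other genome alignment tools showed low recall for TE PAVs. When comparing highly diverged genomes, AnchorWave aligned up to three times more of the genome as position matches or indels than the closest competing approach. Moreover, AnchorWave recalled transcription factor-binding sites at a rate 1.05- to 74.85-fold higher than other tools, with markedly fewer false-positive alignments. AnchorWave complements existing genome alignment software, showing clear improvement on genomes with dispersed repeats, active TEs, high sequence diversity, and WGD variation.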
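
As a rough illustration of what "collinear anchors" means, the sketch below picks the highest-scoring set of anchors whose positions increase in both genomes. This is a generic quadratic-time chaining routine written for this post, not AnchorWave's algorithm: the function name `best_collinear_chain` and the `(ref_start, query_start, score)` tuple layout are assumptions, and it ignores inversions and the whole-genome duplication handling that the paper describes.

```python
def best_collinear_chain(anchors):
    """Return the highest-scoring chain of anchors that is collinear.

    anchors: list of (ref_start, query_start, score) tuples (hypothetical layout).
    A chain is collinear when both ref_start and query_start strictly increase.
    """
    if not anchors:
        return []

    ordered = sorted(anchors)  # sort by reference position, then query position
    n = len(ordered)
    best = [anchor[2] for anchor in ordered]  # best chain score ending at each anchor
    prev = [-1] * n  # back-pointers for chain reconstruction

    for i in range(n):
        for j in range(i):
            same_order = (
                ordered[j][0] < ordered[i][0] and ordered[j][1] < ordered[i][1]
            )
            if same_order and best[j] + ordered[i][2] > best[i]:
                best[i] = best[j] + ordered[i][2]
                prev[i] = j

    # Walk back from the best-scoring end point.
    end = max(range(n), key=best.__getitem__)
    chain = []
    while end != -1:
        chain.append(ordered[end])
        end = prev[end]
    chain.reverse()
    return chain


if __name__ == "__main__":
    example = [(100, 100, 1.0), (200, 900, 1.0), (300, 300, 1.0), (400, 400, 1.0)]
    print(best_collinear_chain(example))
```

In the example, the anchor at `(200, 900)` is out of order relative to its neighbors (as a dispersed repeat hit might be), so it is left out of the chain. The regions between consecutive chained anchors are what a base-pair-resolved global aligner would then fill in.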




Edward S. Buckler


Biography:

Bachelor's degree, University of Virginia;

Ph.D., University of Missouri.


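Research interests: developing methods that integrate diverse biological knowledge to design sustainable, energy-efficient crops adapted to a range of environments.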


doi: https://doi.org/10.1073/pnas.2113075119


Journal: PNAS

Published date: December 21, 2021


Cite:
Baoxing Song, Santiago Marco-Sola, Miquel Moreto, Lynn Johnson, Edward S. Buckler, Michelle C. Stitzer. AnchorWave: Sensitive alignment of genomes with high sequence diversity, extensive structural polymorphism, and whole-genome duplication. PNAS, 2021, 119(1): e2113075119. DOI: https://doi.org/10.1073/pnas.2113075119




https://blog.sciencenet.cn/blog-3158122-1348120.html
