mashengwei's personal blog http://blog.sciencenet.cn/u/mashengwei


An original script that makes your primer design both fast and accurate!

10189 views 2018-4-2 08:38 | Category: Research notes | Tags: wheat, primer design


Author of this issue: Rui Wang

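As planned, we continue our bioinformatics-for-beginners series (see the list at the end of this post for earlier installments). This is the sixth episode, and we are starting to bring in some code. The code was developed by Dr. Junli Zhang of Jorge's lab and handles all kinds of tricky primer design problems. I have tested it myself and it works very well; at the very least it is faster and more accurate than PolyMarker!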


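Most of the content below comes from the slides Junli used to teach primer design at a wheat gene cloning workshop, shared here with his permission. Since primer design software and home-made scripts are mostly in English, today's post presents the material in its original English. If you have questions about the basic principles behind today's post, see the previous installment of this series: "Wheat bioinformatics beginner returns (5): series summary and specific primer design".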

Steps to design genome-specific primers

The six steps below are the core of genome-specific primer design, and essentially all software and methods are built on them. We should not only know how to use the tools but also understand why they work.
1. Blast the marker sequence against the pseudomolecule and find all the homeologs and potential paralogs: >90% similarity
2. Extract the sequences for all the homeologs and potential paralogs
3. Multiple Sequence Alignment
4. Find all the variation sites among the homeologs and paralogs
5. Use variation sites or combinations of variation sites that are unique to your targets to design primers
6. Blast all the primers against the pseudomolecule v1.0 with word length 7 to see whether they also hit other chromosomes
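
Steps 4 and 5 can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to equal length (for example by MUSCLE) and reports the columns where the target differs from every other copy. The function name `target_specific_sites` and the 7D sequence are hypothetical, added only for illustration; the 7A and 7B sequences are the ones used in the primer design tips later in this post.

```python
def target_specific_sites(alignment, target):
    """Return 0-based alignment columns where the target differs from all other sequences.

    alignment: dict mapping a copy name to its aligned sequence (equal lengths, '-' for gaps).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    target_seq = alignment[target]
    others = [seq for name, seq in alignment.items() if name != target]
    sites = []
    for i, base in enumerate(target_seq):
        if base == "-":
            continue  # a gap in the target cannot serve as a primer 3' end
        if all(seq[i] != base for seq in others):
            sites.append(i)
    return sites


aligned = {
    "7A": "CGAGCTTGATGACGAAGAAGGAT",
    "7B": "CGAGCTTGATGACGAAGAAGGAC",
    "7D": "CGAGCTTGATGACGAAGAAGGAC",  # hypothetical third homeolog, for illustration only
}
print(target_specific_sites(aligned, "7A"))  # [22]
```

Here only the last column separates 7A from the other two copies, so it is the natural candidate for the 3' end of a 7A-specific primer.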

 

Common practices of PCR primers

·  Length: 18 - 25 nt

· Melting temperature: around 60 °C

· GC clamp: G or C bases within the last five bases at the 3' end help promote specific binding, but more than 3 G/C should be avoided

· No secondary structures

· Avoid template secondary structure or other complex regions, such as retrotransposons

· Amplicon length: KASP and dCAPS are short (<300 bp), other markers usually < 1 kb

· Primer pair Tm difference < 5 °C
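
A few of these rules are easy to check programmatically. The sketch below is only an illustration: the function names are hypothetical, and the melting temperature uses a simple GC-content approximation, Tm = 64.9 + 41 × (GC − 16.4) / N, which is an assumption of this example rather than anything from the slides. Tools such as Primer3 use more detailed thermodynamic models, so treat the number as a rough guide.

```python
def gc_in_3prime(primer, window=5):
    """Count G/C bases within the last `window` bases at the 3' end."""
    return sum(base in "GC" for base in primer.upper()[-window:])


def basic_tm(primer):
    """Rough Tm estimate from GC content (assumed formula, not Primer3's model)."""
    p = primer.upper()
    gc = sum(base in "GC" for base in p)
    return 64.9 + 41.0 * (gc - 16.4) / len(p)


def check_primer(primer):
    """Check length (18-25 nt) and GC clamp (1 to 3 G/C in the last 5 bases)."""
    n_gc = gc_in_3prime(primer)
    return {
        "length_ok": 18 <= len(primer) <= 25,
        "gc_clamp_ok": 1 <= n_gc <= 3,
        "tm_estimate": round(basic_tm(primer), 1),
    }


def pair_tm_ok(forward, reverse, max_diff=5.0):
    """Primer pair Tm difference should be below 5 °C."""
    return abs(basic_tm(forward) - basic_tm(reverse)) < max_diff


print(check_primer("CGAGCTTGATGACGAAGAAGGAT"))
```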


Primer Design Tips

I wonder how many readers know the second tip below. The idea is clever, and everyone who has tried it likes it!
1. Use the unique variation site as the primer 3' end
7A CGAGCTTGATGACGAAGAAGGAT
7B CGAGCTTGATGACGAAGAAGGAC

2. Two variation sites in the first 4 nt from the 3' end: we can introduce 1 mutation at the 3rd nt from the 3' end (may need to use touchdown PCR)
7A     CGAGCTTGATGACGAAGAAGGAT
7B     CGAGCTTGATGACGAAGAAGGAC
Primer CGAGCTTGATGACGAAGAAGAAT
With the introduced G → A change, the primer has 1 mismatch against 7A but 2 mismatches against 7B within the last 4 nt.
Nucleotide substitution principle:
A → C
T → C
G → A
C → T

Validate Primer Location Using Chinese Spring Nullitetrasomic (NT) Lines
Suppose we test primers designed to target 7A:
N7AT7D (7D7B7D): Absent
N7BT7D (7A7D7D): Present
N7DT7B (7A7B7B): Present
Are our primers 7A-specific?
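
The reasoning behind the answer can be written down as a small sketch: a primer pair is specific to one chromosome if the PCR product disappears only in the line that lacks that chromosome. In the example above the band is absent only when 7A is missing, so the primers behave as 7A-specific. The function name and the dictionary layout are hypothetical.

```python
def specific_chromosome(nt_results):
    """Infer which chromosome the primers are specific to.

    nt_results maps the chromosome missing in each NT line to True if a PCR
    product is present in that line, or False if it is absent.
    Returns the chromosome name, or None if the pattern is not chromosome-specific.
    """
    absent_in = [chrom for chrom, present in nt_results.items() if not present]
    return absent_in[0] if len(absent_in) == 1 else None


# N7AT7D: absent, N7BT7D: present, N7DT7B: present
results = {"7A": False, "7B": True, "7D": True}
print(specific_chromosome(results))  # 7A
```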


Common PCR-based genotyping methods for SNP markers

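Of the three marker types below, you are probably familiar with CAPS and KASP, but I wonder how many readers know dCAPS, which uses the same idea as the second tip above.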

1. CAPS (Cleaved Amplified Polymorphic Sequences)

  • One SNP allele creates or removes a naturally occurring restriction site

  • Codominant

2. dCAPS (Derived CAPS)

  • For SNPs that do not create a natural restriction site

  • Uses introduced mismatches in one PCR primer to create a restriction site for one allele

  • Codominant

3. KASP (Kompetitive Allele Specific PCR)

  • A homogeneous, fluorescence-based genotyping variant of PCR

  • Codominant

To help you get familiar with dCAPS, here is an example:

IWB1998: CGAGCTTGATGACGAAGAAGGAGA[T/C]CGGGCAGACCCACGACGT

EcoRV: GAT'ATC

Here is another clever idea: we can add a tail to make the dCAPS primer longer, so that the fragments separate better after digestion:
GAAGGTGACCAAGTTCATGCTCGAGCTTGATGACGAAGAAGGATA
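
The example can be checked in Python. In this sketch the tailed primer from the post is split into an assumed tail (the first 21 nt) and the dCAPS primer proper (the last 24 nt, which differs from the template by one introduced mismatch next to the SNP); that split and the helper names are assumptions made for illustration. The PCR product carries the primer sequence, so the EcoRV site GATATC appears only for the T allele.

```python
ECORV = "GATATC"  # EcoRV cuts GAT'ATC, i.e. 3 bases into the site

flank5 = "CGAGCTTGATGACGAAGAAGGAGA"        # template sequence 5' of the SNP
flank3 = "CGGGCAGACCCACGACGT"              # template sequence 3' of the SNP
dcaps_primer = "CGAGCTTGATGACGAAGAAGGATA"  # assumed: last 24 nt of the tailed primer
tail = "GAAGGTGACCAAGTTCATGCT"             # assumed: first 21 nt of the tailed primer


def amplicon(allele, primer=dcaps_primer):
    """Product sequence from the primer through the SNP into the 3' flank."""
    return primer + allele + flank3


def cut_position(seq):
    """Length of the fragment released 5' of the EcoRV cut, or -1 if there is no site."""
    i = seq.find(ECORV)
    return i + 3 if i >= 0 else -1


for allele in "TC":
    print(allele, ECORV in amplicon(allele))

print(cut_position(amplicon("T")))                       # 23
print(cut_position(amplicon("T", tail + dcaps_primer)))  # 44
```

Without the tail, digestion removes only 23 bp from the T-allele product; with the tail, it removes 44 bp, which is easier to resolve on a gel.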

 

Primer design software
· Primer3 (http://primer3.ut.ee/)
· PolyMarker for KASP in wheat (https://github.com/TGAC/bioruby-polyploid-tools)
· CAPS Designer (https://solgenomics.net/tools/caps_designer/caps_input.pl)
· dCAPS Finder 2.0 (http://biology4.wustl.edu/dcaps)
· indCAPS (http://indcaps.kieber.cloudapps.unc.edu/)
· GSP (Genome Specific Primers) (https://probes.pw.usda.gov/GSP/)

 

SNP Primer Design Pipeline

This is the Python script Junli wrote himself. The source code and accompanying documentation on GitHub are linked below. I strongly recommend downloading and using it; the documentation is very detailed, so it is not hard to learn.

1. A pipeline to design KASP/CAPS/dCAPS primers for SNPs in wheat
2. A Python script which incorporates:

· Muscle: multiple sequence alignment program (http://www.drive5.com/muscle/)

· Primer3: program for designing PCR primers (http://primer3.sourceforge.net/)

· blast+: BLAST the wheat genome (https://blast.ncbi.nlm.nih.gov/Blast.cgi)

3. The tool has a GitHub repository: https://github.com/pinbo/SNP_Primer_Pipeline

How the pipeline works:

1. Blast each SNP sequence against the pseudomolecule and keep hits that have

· >90% similarity and

· >90% of the length of the best hit AND >50 bp

2. Get 500 bp on each side of the SNP for all the hits (the SNP is at position 501)
3. Multiple Sequence Alignment of the homeolog sequences with MUSCLE
4. Find all the variation sites that can distinguish the target from the other homeologs
5. Use these sites as forced 3' ends in Primer3 and design homeolog-specific primers
6. Blast all the primers against the pseudomolecule v1.0 with word length 7 to see whether they also hit other chromosomes

· Criterion for a match: < 2 mismatches in the first 4 bp from the 3' end
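
The two filters in this workflow (step 1 and the match criterion in step 6) can be sketched as follows. This is not the pipeline's actual code: the function names, the tuple layout of a hit, and the example hits are hypothetical, and the sketch assumes the hit list is sorted with the best hit first.

```python
def filter_hits(hits, min_identity=90.0, min_len_fraction=0.9, min_len=50):
    """Keep hits with >90% identity, >90% of the best hit's length, and >50 bp.

    hits: list of (subject, percent_identity, alignment_length), best hit first (assumed).
    """
    best_len = hits[0][2]
    return [
        h for h in hits
        if h[1] > min_identity and h[2] > min_len_fraction * best_len and h[2] > min_len
    ]


def mismatches_3prime(primer, site, window=4):
    """Count mismatches between primer and binding site in the last `window` bases."""
    return sum(a != b for a, b in zip(primer[-window:], site[-window:]))


def primer_matches(primer, site):
    """A site counts as a match if it has < 2 mismatches in the first 4 bp from the 3' end."""
    return mismatches_3prime(primer, site) < 2


# Hypothetical BLAST hits, for illustration only
hits = [("chr7A", 100.0, 1000), ("chr7B", 95.2, 980), ("chr7D", 94.8, 400), ("chr2B", 85.0, 990)]
print([h[0] for h in filter_hits(hits)])  # ['chr7A', 'chr7B']

primer = "CGAGCTTGATGACGAAGAAGAAT"
print(primer_matches(primer, "CGAGCTTGATGACGAAGAAGGAT"))  # True  (7A, 1 mismatch)
print(primer_matches(primer, "CGAGCTTGATGACGAAGAAGGAC"))  # False (7B, 2 mismatches)
```

The second part shows why the introduced mutation from the primer design tips helps: under this criterion the primer still counts as matching 7A but no longer matches 7B.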

 


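Further reading: the wheat bioinformatics series

1. The first post serves as a preface and introduces GrainGenes, a commonly used bioinformatics website.

2. The next three posts mainly cover wheat physical maps and their applications; the overview of wheat genome databases in particular is basic but important knowledge. They also introduce some comparative genomics concepts and applications, covering wild emmer wheat, Aegilops, Arabidopsis, and rice.

3. In the following posts, the editor plans to cover three topics: gene expression, specific primer design, and mutant libraries.

4. Other related posts:

 

Follow the "Wheat Research Alliance" (小麦研究联盟) to keep up with the latest progress in wheat research. For submissions, reposts, collaboration, and announcements, please contact: wheatgenome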




https://blog.sciencenet.cn/blog-1094241-1106888.html
