0-based numbering: the initial element of a sequence is assigned the index 0;
1-based numbering: the initial element of a sequence is assigned the index 1.
BAM and BED data are 0-based, while SAM data are 1-based.
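The relationship between the two conventions can be sketched in a few lines of Python. This is only an illustration: the function names `sam_to_bed` and `bed_to_sam` are hypothetical, and the sketch assumes that the BED end coordinate is exclusive (a half-open interval) and that the read aligns without gaps, so the aligned length equals the read length.

```python
def sam_to_bed(pos_1based: int, length: int) -> tuple[int, int]:
    """Convert a 1-based start position plus a length
    to a 0-based, half-open (start, end) pair."""
    start = pos_1based - 1   # shift the start down by one
    end = start + length     # exclusive end, so end - start == length
    return start, end


def bed_to_sam(start_0based: int, end_0based: int) -> tuple[int, int]:
    """Convert a 0-based, half-open (start, end) pair
    to 1-based, inclusive (first, last) positions."""
    return start_0based + 1, end_0based


# A 4-base feature that begins at the very first base of a chromosome:
print(sam_to_bed(1, 4))   # (0, 4)
print(bed_to_sam(0, 4))   # (1, 4)
```

Only the start coordinate shifts by one; under the half-open assumption the end value is numerically the same in both conventions.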
An example is shown below:
The screenshot comes from a SAM file. The sequence (10.4|823|29663|59.364374575|+|61.9047619048|73.3|1) was mapped to mm10, starting at position 3259346 on chr10. Because SAM is 1-based, position 3259346 on chr10 is the first nucleotide “A” of the sequence.
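However, after I converted this to a BED file using bamToBed, the initial locus of the region changed from 3259346 to 3259345. Because BED is 0-based, start 3259345 and SAM position 3259346 denote the same nucleotide “A”; if the BED coordinates are read as 1-based positions, the “A” is the second locus of the region, not the first.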
To double-check, I extracted the sequence of this region (chr10:3259345-3259372) from the BED file, as shown below. Comparing it with the sequence (AAGGGGCTGGACTTGCATGCCATGGAT) in the SAM file confirms that the sequence indeed starts at the second locus.
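The same check can be reproduced on a small scale. The sketch below does not use mm10: the reference string is made-up toy data, with the read from the post placed at an arbitrary 1-based position (6). It assumes the BED end is exclusive, and it shows that reading the BED numbers as 1-based inclusive positions yields one extra base in front of the read.

```python
read = "AAGGGGCTGGACTTGCATGCCATGGAT"  # read sequence quoted in the post

# In the post, BED end - BED start equals the read length (27).
assert 3259372 - 3259345 == len(read)

# Toy reference (made up): 5 filler bases, the read, then 3 filler bases.
reference = "CCCCC" + read + "TTT"

sam_pos = 6                       # 1-based start, as SAM would report it
bed_start = sam_pos - 1           # 0-based start, as BED would report it
bed_end = bed_start + len(read)   # exclusive end

# Interpreting the BED numbers as 0-based, half-open gives exactly the read.
half_open = reference[bed_start:bed_end]

# Interpreting the same two numbers as 1-based, inclusive positions
# gives one extra base at the front.
closed = reference[bed_start - 1:bed_end]

print(half_open == read)           # True
print(len(closed) - len(read))     # 1
print(closed[1:] == read)          # True: the read starts at the second base
```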