kangyu's personal blog http://blog.sciencenet.cn/u/kangyu


FASTA format conversion with fasta_formatter

Posted 2014-8-28 23:08 | Category: Bioinformatics | System category: Research notes

fasta_formatter -h
usage: fasta_formatter [-h] [-i INFILE] [-o OUTFILE] [-w N] [-t] [-e]
Part of FASTX Toolkit 0.0.13.2 by gordon@cshl.edu

  [-h]         = This helpful help screen.
  [-i INFILE]  = FASTA/Q input file. default is STDIN.
  [-o OUTFILE] = FASTA/Q output file. default is STDOUT.
  [-w N]       = max. sequence line width for output FASTA file.
                 When ZERO (the default), sequence lines will NOT be wrapped -
                 all nucleotides of each sequence will appear on a single
                 line (good for scripting).
  [-t]         = Output tabulated format (instead of FASTA format).
                 Sequence-Identifiers will be on first column,
                 Nucleotides will appear on second column (as single line).
  [-e]         = Output empty sequences (default is to discard them).
                 Empty sequences are ones that have only a sequence identifier,
                 but no actual nucleotides.

Input Example:
  >MY-ID
  AAAAAGGGGG
  CCCCCTTTTT
  AGCTN

Output example with unlimited line width [-w 0]:
  >MY-ID
  AAAAAGGGGGCCCCCTTTTTAGCTN

Output example with max. line width=7 [-w 7]:
  >MY-ID
  AAAAAGG
  GGGCCCC
  CTTTTTA
  GCTN
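
The wrapping rule behind -w can be illustrated with a short Python sketch. This is not the FASTX Toolkit implementation; the names read_fasta and format_fasta are hypothetical, and the sketch assumes that each record's sequence lines are joined first and then cut into chunks of at most N characters, with N = 0 meaning no wrapping.

```python
def read_fasta(lines):
    """Yield (identifier, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def format_fasta(lines, width=0):
    """Return FASTA output lines; width=0 keeps each sequence on one line."""
    out = []
    for name, seq in read_fasta(lines):
        if not seq:
            # Assumption: mirror the documented default of discarding empty sequences.
            continue
        out.append(">" + name)
        if width <= 0:
            out.append(seq)
        else:
            # Cut the joined sequence into consecutive chunks of `width` characters.
            out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return out


example = [">MY-ID", "AAAAAGGGGG", "CCCCCTTTTT", "AGCTN"]
print("\n".join(format_fasta(example, width=0)))
print("\n".join(format_fasta(example, width=7)))
```

Joining before cutting is what makes the output independent of how the input file happened to be wrapped.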

Output example with tabular output [-t]:
  MY-ID    AAAAAGGGGGCCCCCTTTTTAGCTN
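
The tabular layout can be sketched in Python as one line per record, with the identifier in the first column and the joined sequence in the second. The function name to_tabular is hypothetical, and the use of a tab character as the column separator is an assumption of this sketch rather than something stated in the help text.

```python
def to_tabular(lines, sep="\t"):
    """Return one 'identifier<sep>sequence' line per non-empty FASTA record."""
    records = []  # list of [identifier, sequence] pairs, in input order
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:], ""])
        elif records:
            # Sequence lines are appended to the most recent identifier.
            records[-1][1] += line
    # Assumption: empty records are skipped, as in the documented default.
    return [name + sep + seq for name, seq in records if seq]


example = [">MY-ID", "AAAAAGGGGG", "CCCCCTTTTT", "AGCTN"]
for row in to_tabular(example):
    print(row)
```

A two-column layout like this is convenient for line-oriented tools such as awk, sort, or cut, since each record occupies exactly one line.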

example of empty sequence:
(will be discarded unless [-e] is used)
 >REGULAR-SEQUENCE-1
 AAAGGGTTTCCC
 >EMPTY-SEQUENCE
 >REGULAR-SEQUENCE-2
 AAGTAGTAGTAGTAGT
 GTATTTTATAT
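
The effect of -e on the example above can be illustrated with a minimal Python sketch. The names parse_records and filter_records and the keep_empty parameter are hypothetical; the sketch only assumes what the help text states, namely that a record with an identifier but no nucleotides is dropped unless empty sequences are explicitly requested.

```python
def parse_records(lines):
    """Return a list of (identifier, sequence) tuples from FASTA lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:], ""])
        elif records:
            records[-1][1] += line
    return [(name, seq) for name, seq in records]


def filter_records(lines, keep_empty=False):
    """Drop records without nucleotides unless keep_empty is True (like -e)."""
    return [(name, seq) for name, seq in parse_records(lines) if seq or keep_empty]


example = [
    ">REGULAR-SEQUENCE-1",
    "AAAGGGTTTCCC",
    ">EMPTY-SEQUENCE",
    ">REGULAR-SEQUENCE-2",
    "AAGTAGTAGTAGTAGT",
    "GTATTTTATAT",
]
print([name for name, _ in filter_records(example)])
print([name for name, _ in filter_records(example, keep_empty=True)])
```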




https://blog.sciencenet.cn/blog-803390-823110.html
