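Reference-based analysis workflows for metagenomes mainly aim to obtain taxonomic and functional composition quickly, and earlier posts in this series covered that topic.
Today I introduce another tool from the same authors, which produces functional and metabolic profiles in a single step.
HUMAnN2 (The HMP Unified Metabolic Analysis Network 2) is very useful in metagenomic research. The analysis yields not only the species abundance of the microbes but also, accurately and efficiently, their metabolic pathways and functional modules.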
Homepage: http://www.huttenhower.org/humann2
Official tutorial: https://bitbucket.org/biobakery/humann2/wiki/Home (version 2017-12-14)
Chinese translation dated 2018-05-01
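HUMAnN is an efficient tool for profiling the abundance of microbial pathways from metagenomic or metatranscriptomic data. This process is called functional profiling, and it aims to describe the metabolic potential of the community members. It answers the question of what the members of a microbial community could be doing, or are actually doing.
Software features:
[Figure: HUMAnN2 workflow diagram]
If Python and the pip installer are available, humann2 is easy to install.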
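# Install the software
pip install humann2
# Or, optionally, download the source package manually
wget https://files.pythonhosted.org/packages/43/07/ec41577c3c1f9b578875ade8ed549d14fc2944c13cb7504579d542b62a69/humann2-0.11.1.tar.gz
# If the steps above still fail, conda is recommended: faster and easier to use (this assumes the bioconda channel is configured)
conda install humann2
# Test the installation
humann2_test
# For example, I installed with conda into /conda/bin without adding it to PATH, so I call the programs by absolute path
# Download the databases
wd=/conda/bin
$wd/humann2_databases --available
# 5.37 GB
$wd/humann2_databases --download chocophlan full /data/humann2
# 5.87 GB, 11 GB after extraction
$wd/humann2_databases --download uniref uniref90_diamond /data/humann2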
Dependencies
# DIAMOND http://ab.inf.uni-tuebingen.de/software/diamond/
wget http://github.com/bbuchfink/diamond/releases/download/v0.9.21/diamond-linux64.tar.gz
tar xzf diamond-linux64.tar.gz
sudo ln -fs `pwd`/diamond /usr/local/bin/
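The input file is a FASTQ file; the output is a set of quantification tables written to the specified directory.
cd ~/ath/jt.terpene.meta/clean_data/JT-545
# Compressed FASTQ files are accepted, and the output directory is created automatically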
$wd/humann2 --input 25/JT-545_25.rmhost.1.fq.gz --output humann2_25 &
$wd/humann2 --input 26/JT-545_26.rmhost.1.fq.gz --output humann2_26 &
$wd/humann2 --input 27/JT-545_27.rmhost.1.fq.gz --output humann2_27 &
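The three commands above follow a single pattern, so they can be generated in a loop. The following is a minimal sketch under stated assumptions: the helper name `build_humann2_cmds` and the `humann2_<sample>` output naming are hypothetical and not part of HUMAnN2; only the `--input` and `--output` options come from the commands above. The helper only prints the commands (a dry run), so it can be inspected before anything is executed.

```shell
# build_humann2_cmds BIN FASTQ...: print one humann2 command per input file.
# The helper name and the humann2_<sample> output naming are illustrative only.
build_humann2_cmds() (
    humann2_bin=$1
    shift
    for fq in "$@"; do
        sample=$(basename "$fq")   # e.g. JT-545_25.rmhost.1.fq.gz
        sample=${sample%%.*}       # keep the part before the first dot
        printf '%s --input %s --output humann2_%s\n' "$humann2_bin" "$fq" "$sample"
    done
)

# Dry run: print the commands for the three samples used above
build_humann2_cmds /conda/bin/humann2 \
    25/JT-545_25.rmhost.1.fq.gz \
    26/JT-545_26.rmhost.1.fq.gz \
    27/JT-545_27.rmhost.1.fq.gz
```

Piping the printed lines to `sh` would run the samples one after another, which is gentler on memory than starting all of them in the background at once.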
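The output files are written to the output directory, which is created inside the directory where the command was run.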
# Gene Family $SAMPLENAME_Abundance-RPKs
UNMAPPED 187.0
UniRef50_unknown 150.0
UniRef50_unknown|g__Bacteroides.s__Bacteroides_fragilis 150.0
UniRef50_A6L0N6: Conserved protein found in conjugate transposon 67.0
UniRef50_A6L0N6: Conserved protein found in conjugate transposon|g__Bacteroides.s__Bacteroides_fragilis 57.0
UniRef50_A6L0N6: Conserved protein found in conjugate transposon|g__Bacteroides.s__Bacteroides_finegoldii 5.0
UniRef50_A6L0N6: Conserved protein found in conjugate transposon|g__Bacteroides.s__Bacteroides_stercoris 4.0
UniRef50_A6L0N6: Conserved protein found in conjugate transposon|unclassified 1.0
UniRef50_O83668: Fructose-bisphosphate aldolase 60.0
UniRef50_O83668: Fructose-bisphosphate aldolase|g__Bacteroides.s__Bacteroides_vulgatus 31.0
UniRef50_O83668: Fructose-bisphosphate aldolase|g__Bacteroides.s__Bacteroides_thetaiotaomicron 22.0
UniRef50_O83668: Fructose-bisphosphate aldolase|g__Bacteroides.s__Bacteroides_stercoris 7.0
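In the table above, each community total is followed by stratified rows of the form `feature|taxon`, and the stratified values add up to the total (57 + 5 + 4 + 1 = 67 for UniRef50_A6L0N6, and 31 + 22 + 7 = 60 for UniRef50_O83668). The sketch below checks this relationship with awk. It assumes the real file is tab-separated with two columns, and the function name `check_strata` is hypothetical.

```shell
# check_strata [FILE...]: for each feature print the community total and the
# sum of its stratified rows ("feature|taxon"). Assumes tab-separated input.
check_strata() (
    awk -F'\t' -v OFS='\t' '
        /^#/ { next }                              # skip the header line
        {
            p = index($1, "|")
            feature = (p ? substr($1, 1, p - 1) : $1)
            if (p == 0) {                          # community-level row
                if (!(feature in total)) order[++n] = feature
                total[feature] = $2
            } else {                               # per-taxon row
                strata[feature] += $2
            }
        }
        END {
            for (i = 1; i <= n; i++)
                print order[i], total[order[i]] + 0, strata[order[i]] + 0
        }
    ' "$@"
)

# Demo with the UniRef50_A6L0N6 rows shown above
printf '%s\t%s\n' \
    'UniRef50_A6L0N6: Conserved protein found in conjugate transposon' 67.0 \
    'UniRef50_A6L0N6: Conserved protein found in conjugate transposon|g__Bacteroides.s__Bacteroides_fragilis' 57.0 \
    'UniRef50_A6L0N6: Conserved protein found in conjugate transposon|g__Bacteroides.s__Bacteroides_finegoldii' 5.0 \
    'UniRef50_A6L0N6: Conserved protein found in conjugate transposon|g__Bacteroides.s__Bacteroides_stercoris' 4.0 \
    'UniRef50_A6L0N6: Conserved protein found in conjugate transposon|unclassified' 1.0 |
    check_strata
```

Rows without any stratification, such as UNMAPPED, are reported with a stratified sum of 0.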
# Pathway $SAMPLENAME_Abundance
UNMAPPED 140.0
UNINTEGRATED 87.0
UNINTEGRATED|g__Bacteroides.s__Bacteroides_caccae 23.0
UNINTEGRATED|g__Bacteroides.s__Bacteroides_finegoldii 20.0
UNINTEGRATED|unclassified 12.0
PWY0-1301: melibiose degradation 57.5
PWY0-1301: melibiose degradation|g__Bacteroides.s__Bacteroides_caccae 32.5
PWY0-1301: melibiose degradation|g__Bacteroides.s__Bacteroides_finegoldii 4.5
PWY0-1301: melibiose degradation|unclassified 3.0
PWY-5484: glycolysis II (from fructose-6P) 54.7
PWY-5484: glycolysis II (from fructose-6P)|g__Bacteroides.s__Bacteroides_caccae 16.7
PWY-5484: glycolysis II (from fructose-6P)|g__Bacteroides.s__Bacteroides_fi
# Pathway $SAMPLENAME_Coverage
UNMAPPED 1.0
UNINTEGRATED 1.0
UNINTEGRATED|g__Bacteroides.s__Bacteroides_caccae 1.0
UNINTEGRATED|g__Bacteroides.s__Bacteroides_finegoldii 1.0
UNINTEGRATED|unclassified 1.0
PWY0-1301: melibiose degradation 1.0
PWY0-1301: melibiose degradation|g__Bacteroides.s__Bacteroides_caccae 1.0
PWY0-1301: melibiose degradation|g__Bacteroides.s__Bacteroides_finegoldii 1.0
PWY0-1301: melibiose degradation|unclassified 1.0
PWY-5484: glycolysis II (from fructose-6P) 1.0
PWY-5484: glycolysis II (from fructose-6P)|g__Bacteroides.s__Bacteroides_caccae 0.7
PWY-5484: glycolysis II (from fructose-6P)|g__Bacteroides.s__Bacteroides_finegoldii 0.7
PWY-5484: glycolysis II (from fructose-6P)|unclassified 0.3
$OUTPUT_DIR/$SAMPLENAME_pathabundance.tsv
$DIR/$SAMPLENAME_bowtie2_aligned.sam
$DIR/$SAMPLENAME_bowtie2_aligned.tsv
$DIR/$SAMPLENAME_bowtie2_index*
$DIR/$SAMPLENAME_bowtie2_unaligned.fa
$DIR/$SAMPLENAME_custom_chocophlan_database.ffn
$DIR/$SAMPLENAME_metaphlan_bowtie2.txt
$DIR/$SAMPLENAME_metaphlan_bugs_list.tsv
$DIR/$SAMPLENAME_$TRANSLATEDALIGN_aligned.tsv
$DIR/$SAMPLENAME_$TRANSLATEDALIGN_unaligned.fa
$DIR/$SAMPLENAME.log
# Show the current settings
$wd/humann2_config --print
# General form for changing a setting
$wd/humann2_config --update $SECTION $NAME $VALUE
# For example, change the number of threads
$wd/humann2_config --update run_modes threads 12
Basic usage: $ humann2_barplot --input $TABLE.tsv --feature $FEATURE --outfile $FIGURE
$TABLE.tsv = a stratified HUMAnN2 output file
$FEATURE = Feature from the table to plot (defaults to first feature)
$FIGURE = Where to save the figure
Run with -h to see additional command line options
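A single feature can be selected for bar-plot visualization. The --help option lists the related sorting and normalization options.
This step is essential: no matter how many samples we have, each humann2 run produces a single-column result. Multiple samples must be merged into a matrix in this step, which makes downstream statistics and differential comparison easier.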
Basic usage: $ humann2_join_tables --input $INPUT_DIR --output $TABLE
$INPUT_DIR = a directory containing gene/pathway tables (tsv or biom format)
$TABLE = the file to write the new single gene table (biom format if input is biom format)
Optional: --file_name $STR will only join gene tables with $STR in file name
Run with -h to see additional command line options
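To illustrate what the merged matrix looks like, here is a minimal awk sketch of the same idea. It is not a replacement for humann2_join_tables. The function name `join_columns` is hypothetical, the input is assumed to be tab-separated two-column tables with a `#` header line, and filling absent features with 0 is an assumption of this sketch.

```shell
# join_columns FILE...: merge single-sample tables (feature<TAB>value) into one
# matrix with one column per file. Features absent from a file get 0.
join_columns() (
    awk -F'\t' -v OFS='\t' '
        FNR == 1 {
            nfile++
            if ($0 ~ /^#/) { header[nfile] = $2; next }   # sample name from header
        }
        {
            if (!($1 in seen)) { seen[$1] = 1; order[++n] = $1 }
            value[$1, nfile] = $2
        }
        END {
            line = "# Feature"
            for (j = 1; j <= nfile; j++) line = line OFS header[j]
            print line
            for (i = 1; i <= n; i++) {
                line = order[i]
                for (j = 1; j <= nfile; j++)
                    line = line OFS (((order[i], j) in value) ? value[order[i], j] : 0)
                print line
            }
        }
    ' "$@"
)

# Demo with two tiny made-up tables written to a temporary directory
demo_dir=$(mktemp -d)
printf '# Pathway\tS1\nUNMAPPED\t140.0\nPWY-A\t2.0\n' > "$demo_dir/s1.tsv"
printf '# Pathway\tS2\nUNMAPPED\t10.0\nPWY-B\t3.0\n' > "$demo_dir/s2.tsv"
join_columns "$demo_dir/s1.tsv" "$demo_dir/s2.tsv"
rm -r "$demo_dir"
```

Features keep the order in which they first appear, so the first sample determines the leading rows of the matrix.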
--verbose
Option: --threads $CORES (or change the default setting)
--remove-temp-output
--nucleotide-database $DIR
--protein-database $DIR
--taxonomic-profile bugs_list.tsv
--output-basename $NAME
--remove-stratified-output
--pathways unipathway
--output-format biom
--identity-threshold <50.0>
--metaphlan-options="-t rel_ab"
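diamond is not on PATH: download and extract it, and make sure it is added to PATH.
The metaphlan2 database is not found: newer metaphlan2 versions moved the database directory. A permanent fix is to create hard links at the old location.
Change into the metaphlan2 installation directory: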
mkdir db_v20
ln `pwd`/databases/mpa_v20_m200.* db_v20/
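/mnt/bai/public/bin/diamond is a directory. For some reason the system picks this directory up as the program, even though I also installed the program at /usr/local/bin/diamond. The fix is to replace this directory with the program itself.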
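Both diamond problems above come down to what the shell resolves for a tool name. A small check like the following can make that visible before a long run. This is a sketch: the function name `check_tool` is hypothetical, and it relies only on the standard `command -v` lookup and a directory test.

```shell
# check_tool NAME: report where NAME resolves on PATH, and fail if it is
# missing or if the resolved path is a directory rather than a program.
check_tool() (
    tool_path=$(command -v "$1") || { echo "$1: not found on PATH"; exit 1; }
    if [ -d "$tool_path" ]; then
        echo "$1: $tool_path is a directory, not a program"
        exit 1
    fi
    echo "$1: $tool_path"
)

# Check the external tools mentioned in this post; failures are only reported
for tool in diamond humann2; do
    check_tool "$tool" || true
done
```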