This time, I want to implement the following workflow:
Step 1: generate 1.txt, 2.txt and 3.txt.
Step 2: add the string "add a" to each file, writing the results to 1_add_a.txt, 2_add_a.txt and 3_add_a.txt.
Step 3: add "add b", writing the results to 1_add_a_add_b.txt, 2_add_a_add_b.txt and 3_add_a_add_b.txt.
Step 4: add "add c", writing the results to 1_add_a_add_b_add_c.txt, 2_add_a_add_b_add_c.txt and 3_add_a_add_b_add_c.txt.
Step 5: merge 1_add_a_add_b.txt, 2_add_a_add_b.txt, 1_add_a_add_b_add_c.txt, 2_add_a_add_b_add_c.txt and 3_add_a_add_b_add_c.txt into hebing.txt.
(snake_test) [dengfei@localhost ex4]$ ls *txt
1.txt  2.txt  3.txt
(snake_test) [dengfei@localhost ex4]$ cat *txt
this is 1.txt
this is 2.txt
this is 3.txt
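For reference, the three starting files can be created with a one-line shell loop like the one below. The exact command is not shown in the original log, so treat it as a sketch:

    # Not from the original log: one possible way to create the starting files.
    for i in 1 2 3; do echo "this is $i.txt" > $i.txt; done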
The corresponding Snakefile content is as follows:
rule adda:
    input: "{file}.txt"
    output: "{file}_add_a.txt"
    shell: "cat {input} |xargs echo add a >{output}"
Preview the commands first:
snakemake -np {1,2,3}_add_a.txt
Note: the target files {1,2,3}_add_a.txt have to be written out on the command line, otherwise the command will not run.
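If you would rather not type the targets every time, a common Snakemake idiom is to add a first rule (usually called all) whose input lists the final targets via expand(); a bare snakemake call then builds them all. This rule is not part of the original Snakefile, just a sketch:

    # Hypothetical convenience rule, not in the original Snakefile.
    # When placed first, running "snakemake" with no arguments builds all three targets.
    rule all:
        input:
            expand("{file}_add_a.txt", file=["1", "2", "3"])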
(snake_test) [dengfei@localhost ex4]$ snakemake -np {1,2,3}_add_a.txt
Building DAG of jobs...
Job counts:
	count	jobs
	3	adda
	3

[Tue Apr 2 21:09:19 2019]
rule adda:
    input: 3.txt
    output: 3_add_a.txt
    jobid: 2
    wildcards: file=3

cat 3.txt |xargs echo add a >3_add_a.txt

[Tue Apr 2 21:09:19 2019]
rule adda:
    input: 2.txt
    output: 2_add_a.txt
    jobid: 0
    wildcards: file=2

cat 2.txt |xargs echo add a >2_add_a.txt

[Tue Apr 2 21:09:19 2019]
rule adda:
    input: 1.txt
    output: 1_add_a.txt
    jobid: 1
    wildcards: file=1

cat 1.txt |xargs echo add a >1_add_a.txt

Job counts:
	count	jobs
	3	adda
	3
This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.
Run the command:
snakemake {1,2,3}_add_a.txt
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
	count	jobs
	3	adda
	3

[Tue Apr 2 21:11:09 2019]
rule adda:
    input: 3.txt
    output: 3_add_a.txt
    jobid: 0
    wildcards: file=3

[Tue Apr 2 21:11:09 2019]
Finished job 0.
1 of 3 steps (33%) done

[Tue Apr 2 21:11:09 2019]
rule adda:
    input: 1.txt
    output: 1_add_a.txt
    jobid: 1
    wildcards: file=1

[Tue Apr 2 21:11:09 2019]
Finished job 1.
2 of 3 steps (67%) done

[Tue Apr 2 21:11:09 2019]
rule adda:
    input: 2.txt
    output: 2_add_a.txt
    jobid: 2
    wildcards: file=2

[Tue Apr 2 21:11:09 2019]
Finished job 2.
3 of 3 steps (100%) done
Complete log: /home/dengfei/test/snakemake/ex4/.snakemake/log/2019-04-02T211109.153566.snakemake.log
Check the *add_a.txt files:
(snake_test) [dengfei@localhost ex4]$ ls *add_a.txt
1_add_a.txt  2_add_a.txt  3_add_a.txt
(snake_test) [dengfei@localhost ex4]$ cat *add_a.txt
add a this is 1.txt
add a this is 2.txt
add a this is 3.txt
Done.
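One thing worth knowing before adding the next rule: if the same command is run again now, Snakemake sees that the outputs are already up to date and does nothing; the -F/--forceall flag forces every job to run again. This is general Snakemake behaviour rather than something shown in the original log:

    # General Snakemake behaviour (not shown in the original log):
    snakemake {1,2,3}_add_a.txt      # outputs are up to date, so nothing is rebuilt
    snakemake -F {1,2,3}_add_a.txt   # --forceall: rerun all adda jobs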
After adding the addb rule, the corresponding Snakefile content is as follows:
rule adda:
    input: "{file}.txt"
    output: "{file}_add_a.txt"
    shell: "cat {input} |xargs echo add a >{output}"

rule addb:
    input: "{file}_add_a.txt"
    output: "{file}_add_a_add_b.txt"
    shell: "cat {input} | xargs echo add b >{output}"
Preview the commands:
snakemake -np {1,2,3}_add_a_add_b.txt
Then run them for real:
(snake_test) [dengfei@localhost ex4]$ snakemake {1,2,3}_add_a_add_b.txt
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
	count	jobs
	3	addb
	3

[Tue Apr 2 21:13:57 2019]
rule addb:
    input: 2_add_a.txt
    output: 2_add_a_add_b.txt
    jobid: 0
    wildcards: file=2

[Tue Apr 2 21:13:57 2019]
Finished job 0.
1 of 3 steps (33%) done

[Tue Apr 2 21:13:57 2019]
rule addb:
    input: 1_add_a.txt
    output: 1_add_a_add_b.txt
    jobid: 1
    wildcards: file=1

[Tue Apr 2 21:13:57 2019]
Finished job 1.
2 of 3 steps (67%) done

[Tue Apr 2 21:13:57 2019]
rule addb:
    input: 3_add_a.txt
    output: 3_add_a_add_b.txt
    jobid: 2
    wildcards: file=3

[Tue Apr 2 21:13:57 2019]
Finished job 2.
3 of 3 steps (100%) done
Complete log: /home/dengfei/test/snakemake/ex4/.snakemake/log/2019-04-02T211357.666661.snakemake.log
Note that only the three addb jobs are scheduled: the *_add_a.txt files already exist from the previous run, so Snakemake does not rebuild them.
View the workflow DAG.
Command:
snakemake --dag {1,2,3}_add_a_add_b.txt |dot -Tpdf >a.pdf
The generated a.pdf shows the DAG for these jobs.
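dot can render the DAG in other formats too; for example, PNG output works the same way (this variant is not in the original post, and it assumes Graphviz is installed, as it already must be for the command above):

    # Same DAG rendered as PNG instead of PDF:
    snakemake --dag {1,2,3}_add_a_add_b.txt | dot -Tpng > a.png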
After adding the addc rule, the Snakefile content is:
rule adda:
    input: "{file}.txt"
    output: "{file}_add_a.txt"
    shell: "cat {input} |xargs echo add a >{output}"

rule addb:
    input: "{file}_add_a.txt"
    output: "{file}_add_a_add_b.txt"
    shell: "cat {input} | xargs echo add b >{output}"

rule addc:
    input: "{file}_add_a_add_b.txt"
    output: "{file}_add_a_add_b_add_c.txt"
    shell: "cat {input} | xargs echo add c >{output}"
Workflow DAG:
Command:
snakemake --dag {1,2,3}_add_a_add_b_add_c.txt |dot -Tpdf >a1.pdf
Finally, the hebing (merge) rule is added. The full Snakefile is:

rule adda:
    input: "{file}.txt"
    output: "{file}_add_a.txt"
    shell: "cat {input} |xargs echo add a >{output}"

rule addb:
    input: "{file}_add_a.txt"
    output: "{file}_add_a_add_b.txt"
    shell: "cat {input} | xargs echo add b >{output}"

rule addc:
    input: "{file}_add_a_add_b.txt"
    output: "{file}_add_a_add_b_add_c.txt"
    shell: "cat {input} | xargs echo add c >{output}"

rule hebing:
    input:
        a=expand("{file}_add_a_add_b_add_c.txt", file=["1", "2", "3"]),
        b=expand("{file}_add_a_add_b.txt", file=["1", "2"])
    output: "hebing.txt"
    shell: "cat {input.a} {input.b} >{output}"
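The two expand() calls in rule hebing are just shorthand for explicit file lists, so the rule is equivalent to spelling the inputs out by hand, roughly as below (a sketch for illustration, not a second rule to add):

    # Equivalent form of rule hebing with the expand() calls written out:
    rule hebing:
        input:
            a=["1_add_a_add_b_add_c.txt", "2_add_a_add_b_add_c.txt", "3_add_a_add_b_add_c.txt"],
            b=["1_add_a_add_b.txt", "2_add_a_add_b.txt"]
        output: "hebing.txt"
        shell: "cat {input.a} {input.b} >{output}"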
Run the command:
snakemake hebing.txt
Output:
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
	count	jobs
	3	addc
	1	hebing
	4

[Tue Apr 2 21:21:04 2019]
rule addc:
    input: 1_add_a_add_b.txt
    output: 1_add_a_add_b_add_c.txt
    jobid: 1
    wildcards: file=1

[Tue Apr 2 21:21:04 2019]
Finished job 1.
1 of 4 steps (25%) done

[Tue Apr 2 21:21:04 2019]
rule addc:
    input: 3_add_a_add_b.txt
    output: 3_add_a_add_b_add_c.txt
    jobid: 3
    wildcards: file=3

[Tue Apr 2 21:21:04 2019]
Finished job 3.
2 of 4 steps (50%) done

[Tue Apr 2 21:21:04 2019]
rule addc:
    input: 2_add_a_add_b.txt
    output: 2_add_a_add_b_add_c.txt
    jobid: 2
    wildcards: file=2

[Tue Apr 2 21:21:04 2019]
Finished job 2.
3 of 4 steps (75%) done

[Tue Apr 2 21:21:04 2019]
rule hebing:
    input: 1_add_a_add_b_add_c.txt, 2_add_a_add_b_add_c.txt, 3_add_a_add_b_add_c.txt, 1_add_a_add_b.txt, 2_add_a_add_b.txt
    output: hebing.txt
    jobid: 0

[Tue Apr 2 21:21:04 2019]
Finished job 0.
4 of 4 steps (100%) done
Complete log: /home/dengfei/test/snakemake/ex4/.snakemake/log/2019-04-02T212104.719887.snakemake.log
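To check the result, cat the merged file. The original post does not show this output, but given the rules above it should contain the three add_c lines followed by the two add_b lines:

    # Expected content of hebing.txt, inferred from the rules (not copied from the original log):
    $ cat hebing.txt
    add c add b add a this is 1.txt
    add c add b add a this is 2.txt
    add c add b add a this is 3.txt
    add b add a this is 1.txt
    add b add a this is 2.txt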
The workflow DAG for hebing.txt can be drawn the same way with snakemake --dag.