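This section shows how to add significance annotations to box plots (and to bar and line plots), producing figures like these:
「Box plot with t-test: one factor, two levels」
「Box plot with t-test: one factor, three levels」
「Bar plot: one factor, three levels」
「Line plot: one factor, three levels」
「Bar plot: two factors」
「Line plot: two factors」
Take a trial like this: we want to compare plant height between two varieties, and 10 plants were measured per variety. That gives the following data.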
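「Simulated data:」
    # Two groups of 10 plants each; the mean of group B is 10 units higher
    set.seed(123)
    y1 = rnorm(10) + 5
    y2 = rnorm(10) + 15
    dd = data.frame(Group = rep(c("A","B"), each = 10), y = c(y1, y2))
    dd
    str(dd)
    dd$Group = as.factor(dd$Group)  # treat Group as a factor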
「Data:」
    > dd
       Group         y
    1      A  4.439524
    2      A  4.769823
    3      A  6.558708
    4      A  5.070508
    5      A  5.129288
    6      A  6.715065
    7      A  5.460916
    8      A  3.734939
    9      A  4.313147
    10     A  4.554338
    11     B 16.224082
    12     B 15.359814
    13     B 15.400771
    14     B 15.110683
    15     B 14.444159
    16     B 16.786913
    17     B 15.497850
    18     B 13.033383
    19     B 15.701356
    20     B 14.527209
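The plots here are drawn with the ggpubr package:
    library(ggplot2)
    library(ggpubr)
    # Basic box plot
    ggboxplot(dd, x = "Group", y = "y")
    # Color the boxes by group
    ggboxplot(dd, x = "Group", y = "y", color = "Group")
    # Overlay the individual observations as jittered points
    ggboxplot(dd, x = "Group", y = "y", color = "Group", add = "jitter")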
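Calling stat_compare_means() with no arguments uses its default method, the non-parametric Wilcoxon test:
    ggboxplot(dd, x = "Group", y = "y", color = "Group", add = "jitter") +
      stat_compare_means()
To use a t-test instead, set method = "t.test":
    ggboxplot(dd, x = "Group", y = "y", color = "Group", add = "jitter") +
      stat_compare_means(method = "t.test")
To show significance symbols instead of the p-value, also set label = "p.signif":
    ggboxplot(dd, x = "Group", y = "y", color = "Group", add = "jitter") +
      stat_compare_means(method = "t.test", label = "p.signif")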
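To check what these annotations report, both p-values can be reproduced with base R. The following is a minimal sketch, assuming the same simulated two-group data as above. The helper `p_to_symbol` is hypothetical (it is not part of ggpubr); it applies the cutoffs that the ggpubr documentation lists for `label = "p.signif"` (ns: p > 0.05, *: p <= 0.05, **: p <= 0.01, ***: p <= 0.001, ****: p <= 0.0001).

```r
# Rebuild the two-level data so the sketch is self-contained
set.seed(123)
y1 = rnorm(10) + 5
y2 = rnorm(10) + 15
dd = data.frame(Group = factor(rep(c("A", "B"), each = 10)), y = c(y1, y2))

# The same two tests that stat_compare_means() can annotate
p_wilcox = wilcox.test(y ~ Group, data = dd)$p.value  # default method
p_ttest = t.test(y ~ Group, data = dd)$p.value        # method = "t.test" (Welch)

# Hypothetical helper: map a p-value to a significance symbol
p_to_symbol = function(p) {
  cutoffs = c(0, 1e-04, 0.001, 0.01, 0.05, Inf)
  symbols = c("****", "***", "**", "*", "ns")
  as.character(cut(p, breaks = cutoffs, labels = symbols, include.lowest = TRUE))
}

print(c(wilcoxon = p_wilcox, t.test = p_ttest))
print(p_to_symbol(c(p_wilcox, p_ttest)))
```

Because the two simulated groups do not overlap at all, both tests give very small p-values, so the plot label should be the strongest symbol either way.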
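A t-test covers two levels. What about data with three or more levels? One option, used in the code that follows, is a one-way ANOVA for the overall test plus pairwise t-tests between levels.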
「Simulated data:」
    # Three levels for ANOVA; B and C share the same true mean
    set.seed(123)
    y1 = rnorm(10) + 5
    y2 = rnorm(10) + 15
    y3 = rnorm(10) + 15
    dd = data.frame(Group = rep(c("A","B","C"), each = 10), y = c(y1, y2, y3))
    dd
    str(dd)
    dd$Group = as.factor(dd$Group)
「The data:」
    > dd
       Group         y
    1      A  4.439524
    2      A  4.769823
    3      A  6.558708
    4      A  5.070508
    5      A  5.129288
    6      A  6.715065
    7      A  5.460916
    8      A  3.734939
    9      A  4.313147
    10     A  4.554338
    11     B 16.224082
    12     B 15.359814
    13     B 15.400771
    14     B 15.110683
    15     B 14.444159
    16     B 16.786913
    17     B 15.497850
    18     B 13.033383
    19     B 15.701356
    20     B 14.527209
    21     C 13.932176
    22     C 14.782025
    23     C 13.973996
    24     C 14.271109
    25     C 14.374961
    26     C 13.313307
    27     C 15.837787
    28     C 15.153373
    29     C 13.861863
    30     C 16.253815
    # Box plot for the three levels
    p = ggboxplot(dd, x = "Group", y = "y", color = "Group", add = "jitter")
    p
    # Overall (global) test across all three levels: one-way ANOVA
    p + stat_compare_means(method = "anova")
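The global p-value on that plot can be cross-checked with a one-way ANOVA in base R. This is a sketch under the assumption that the plotted value corresponds to the usual F-test of `y ~ Group`; the names `fit`, `anova_table` and `p_anova` are chosen here for illustration.

```r
# Rebuild the three-level data so the sketch is self-contained
set.seed(123)
y1 = rnorm(10) + 5
y2 = rnorm(10) + 15
y3 = rnorm(10) + 15
dd = data.frame(Group = factor(rep(c("A", "B", "C"), each = 10)),
                y = c(y1, y2, y3))

# One-way ANOVA: does mean y differ among the levels of Group?
fit = aov(y ~ Group, data = dd)
anova_table = summary(fit)[[1]]

# p-value of the F-test for the Group term (first row of the table)
p_anova = anova_table[["Pr(>F)"]][1]

print(anova_table)
print(p_anova)
```

A significant global test only says that at least one level differs; it does not say which one, which is why the pairwise comparisons come next.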
    # Pairs of levels to compare
    my_comparisons = list(c("A", "B"), c("A", "C"), c("B", "C"))
    # Brackets labeled with the t-test p-values
    p + stat_compare_means(comparisons = my_comparisons, method = "t.test")
    # Same brackets, labeled with significance symbols
    p + stat_compare_means(comparisons = my_comparisons, label = "p.signif", method = "t.test")
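With three pairwise tests on the same data, multiple testing is worth keeping in mind. The sketch below runs the same pairwise Welch t-tests in base R with a Bonferroni correction. It assumes that the brackets drawn via `comparisons` show unadjusted p-values, which is worth confirming against the documentation of your ggpubr version; the choice of Bonferroni is an illustration, not something the original analysis prescribes.

```r
# Rebuild the three-level data so the sketch is self-contained
set.seed(123)
y1 = rnorm(10) + 5
y2 = rnorm(10) + 15
y3 = rnorm(10) + 15
dd = data.frame(Group = factor(rep(c("A", "B", "C"), each = 10)),
                y = c(y1, y2, y3))

# Pairwise t-tests; pool.sd = FALSE gives a separate Welch test per pair,
# and the p-values are Bonferroni-adjusted for the three comparisons
pw = pairwise.t.test(dd$y, dd$Group,
                     p.adjust.method = "bonferroni",
                     pool.sd = FALSE)

# Lower-triangular matrix of adjusted p-values (rows B, C; columns A, B)
print(pw$p.value)
```

In this simulation A differs clearly from both B and C, while B and C were drawn with the same true mean, so their comparison is expected to come out non-significant.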
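「Simulated data:」
    # Two-factor data: Group1 (A, B, C) crossed with Group2 (X, Y), 10 plants per cell
    set.seed(123)
    y1 = rnorm(10) + 5
    y2 = rnorm(10) + 8
    y3 = rnorm(10) + 7
    y4 = rnorm(10) + 15
    y5 = rnorm(10) + 18
    y6 = rnorm(10) + 17
    # times = 2 spells out the recycling of Group1 that data.frame() otherwise does implicitly
    dd = data.frame(Group1 = rep(c("A","B","C"), each = 10, times = 2),
                    Group2 = rep(c("X","Y"), each = 30),
                    y = c(y1, y2, y3, y4, y5, y6))
    dd
    str(dd)
    dd$Group1 = as.factor(dd$Group1)
    dd$Group2 = as.factor(dd$Group2)
    str(dd)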
「Data preview:」
    > dd
       Group1 Group2         y
    1       A      X  4.439524
    2       A      X  4.769823
    3       A      X  6.558708
    4       A      X  5.070508
    5       A      X  5.129288
    6       A      X  6.715065
    7       A      X  5.460916
    8       A      X  3.734939
    9       A      X  4.313147
    10      A      X  4.554338
    11      B      X  9.224082
    12      B      X  8.359814
    13      B      X  8.400771
    14      B      X  8.110683
    15      B      X  7.444159
    16      B      X  9.786913
    17      B      X  8.497850
    18      B      X  6.033383
    19      B      X  8.701356
    20      B      X  7.527209
    21      C      X  5.932176
    22      C      X  6.782025
    23      C      X  5.973996
    24      C      X  6.271109
    25      C      X  6.374961
    26      C      X  5.313307
    27      C      X  7.837787
    28      C      X  7.153373
    29      C      X  5.861863
    30      C      X  8.253815
    31      A      Y 15.426464
    32      A      Y 14.704929
    33      A      Y 15.895126
    34      A      Y 15.878133
    35      A      Y 15.821581
    36      A      Y 15.688640
    37      A      Y 15.553918
    38      A      Y 14.938088
    39      A      Y 14.694037
    40      A      Y 14.619529
    41      B      Y 17.305293
    42      B      Y 17.792083
    43      B      Y 16.734604
    44      B      Y 20.168956
    45      B      Y 19.207962
    46      B      Y 16.876891
    47      B      Y 17.597115
    48      B      Y 17.533345
    49      B      Y 18.779965
    50      B      Y 17.916631
    51      C      Y 17.253319
    52      C      Y 16.971453
    53      C      Y 16.957130
    54      C      Y 18.368602
    55      C      Y 16.774229
    56      C      Y 18.516471
    57      C      Y 15.451247
    58      C      Y 17.584614
    59      C      Y 17.123854
    60      C      Y 17.215942
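    # Box plot by Group1, with boxes colored (and split) by Group2
    p = ggboxplot(dd, x = "Group1", y = "y", color = "Group2", add = "jitter")
    p
    # Compare X vs Y within each level of Group1, labeled with the t-test p-value
    p + stat_compare_means(aes(group = Group2), method = "t.test")
    # Same comparison, labeled with significance symbols
    p + stat_compare_means(aes(group = Group2), method = "t.test", label = "p.signif")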
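The grouped comparison amounts to one X-versus-Y test inside each level of Group1. A minimal base R sketch of that idea follows, assuming the same simulated two-factor data; `p_by_level` is a name introduced here for illustration.

```r
# Rebuild the two-factor data so the sketch is self-contained
set.seed(123)
y1 = rnorm(10) + 5
y2 = rnorm(10) + 8
y3 = rnorm(10) + 7
y4 = rnorm(10) + 15
y5 = rnorm(10) + 18
y6 = rnorm(10) + 17
dd = data.frame(Group1 = factor(rep(c("A", "B", "C"), each = 10, times = 2)),
                Group2 = factor(rep(c("X", "Y"), each = 30)),
                y = c(y1, y2, y3, y4, y5, y6))

# Each Group1 x Group2 cell should hold 10 observations
print(table(dd$Group1, dd$Group2))

# One Welch t-test (X vs Y) per level of Group1
p_by_level = sapply(split(dd, dd$Group1),
                    function(d) t.test(y ~ Group2, data = d)$p.value)
print(p_by_level)
```

Each element of `p_by_level` should correspond to the label drawn above one Group1 position on the plot.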
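    # Alternative view: one facet per Group1 level, comparing X vs Y inside each facet
    p = ggboxplot(dd, x = "Group2", y = "y", color = "Group1",
                  add = "jitter", facet.by = "Group1")
    p
    p + stat_compare_means(method = "t.test")
    # label.y sets the vertical position of the label
    p + stat_compare_means(method = "t.test", label = "p.signif", label.y = 17)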
Bar plots with standard-error bars used to take a lot of ggplot2 code. ggpubr offers a shorter way.
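    # One factor, three levels: bar plot of the mean with standard-error bars
    p = ggbarplot(dd, x = "Group1", y = "y", add = "mean_se", color = "Group1")
    p
    # Global ANOVA p-value plus pairwise brackets
    # (my_comparisons is the list of A/B/C pairs defined earlier)
    p + stat_compare_means(method = "anova", label.y = 15) +
      stat_compare_means(comparisons = my_comparisons)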
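For reference, the quantities behind `add = "mean_se"` can be computed by hand. The sketch below assumes the usual definition, standard error = standard deviation / sqrt(n), applied per level of Group1; `summary_tab` is a name introduced here for illustration.

```r
# Rebuild the two-factor data so the sketch is self-contained
set.seed(123)
y1 = rnorm(10) + 5
y2 = rnorm(10) + 8
y3 = rnorm(10) + 7
y4 = rnorm(10) + 15
y5 = rnorm(10) + 18
y6 = rnorm(10) + 17
dd = data.frame(Group1 = factor(rep(c("A", "B", "C"), each = 10, times = 2)),
                Group2 = factor(rep(c("X", "Y"), each = 30)),
                y = c(y1, y2, y3, y4, y5, y6))

# Mean and standard error of y for each level of Group1
grp_mean = tapply(dd$y, dd$Group1, mean)
grp_se = tapply(dd$y, dd$Group1, function(v) sd(v) / sqrt(length(v)))

summary_tab = data.frame(Group1 = names(grp_mean),
                         mean = as.numeric(grp_mean),
                         se = as.numeric(grp_se))
print(summary_tab)
```

Note that each Group1 level here pools the X and Y plants, whose means differ by about 10, so the standard errors are much larger than they would be within a single Group1 by Group2 cell.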
    # One factor, three levels: line plot of the mean with standard-error bars
    p = ggline(dd, x = "Group1", y = "y", add = "mean_se")
    p
    p + stat_compare_means(method = "anova", label.y = 15) +
      stat_compare_means(comparisons = my_comparisons)
    # Two factors: dodged bar plot, comparing X vs Y within each level of Group1
    p = ggbarplot(dd, x = "Group1", y = "y", add = "mean_se", color = "Group2",
                  position = position_dodge(0.8))
    p
    p + stat_compare_means(aes(group = Group2), label = "p.signif")
    # Two factors: line plot with standard-error bars, one line per Group2 level
    p = ggline(dd, x = "Group1", y = "y", add = "mean_se", color = "Group2",
               position = position_dodge(0.8))
    p
    p + stat_compare_means(aes(group = Group2), label = "p.signif")