My teaching materials for biological statistics (2016)

Posted 2016-6-3 16:29 | Category: Teaching notes | 5191 views

Here I post my teaching materials for the Biological Statistics course at the University of Chinese Academy of Sciences. There are 13 PDF files in total, one for each three-hour lecture. In 2006, about 40% of the content came from Zar’s book (Zar 1999) and about 10% from Sokal and Rohlf’s book (Sokal & Rohlf 1995), together with the teaching materials of my previous supervisors David Schneider and Matt Litvak and some of my own thoughts. In 2007, I drew a number of points from Quinn and Keough’s book (Quinn & Keough 2002). This book is not as classic as Zar’s or Sokal and Rohlf’s textbooks, yet it goes deeper. In 2011, I gave up SAS and started to use R for teaching. I borrowed a number of examples from Crawley’s R book (Crawley 2012) and Zuur’s series of books (Zuur et al. 2009a; Zuur et al. 2009b; Zuur et al. 2007). Those books are very easy to digest because they provide R code along with well-explained theory. Besides these, Wikipedia and the R help documents are also valuable sources for my teaching. In recent years I have gradually added my own studies to the teaching materials, especially for generalized linear models and machine learning. I have changed about 15% of my teaching materials every year. Now, 11 years on, material from Zar and from Sokal & Rohlf makes up less than 5% of the content. The 13 files have cost me about 1500 hours in total since 2006.

I thank Dianmo Li, David Schneider, and Matt Litvak, who taught me biostatistics and allowed me to use their cases and materials for teaching.

The syllabus and the links for downloading are:

1. History and development of biostatistics (95 slides)

• Introduction
  ◦ The role of statistics in ecological research
  ◦ Best practice in this class
  ◦ Using the textbooks
  ◦ Statistical language R
• Brief history of biostatistics
• Key persons
• Basic concepts
  ◦ Data types
  ◦ Descriptive statistics

1_Introduction to biological statistics.pdf

2. Probability distribution (105 slides)

• Probability theory
  ◦ Axioms and corollaries
  ◦ Permutations and combinations
  ◦ The Monty Hall problem
• Common distributions of random variables
  ◦ Binomial distribution
  ◦ Poisson distribution
  ◦ Negative binomial distribution
  ◦ Uniform distribution
  ◦ Normal distribution
  ◦ Chi-square distribution
  ◦ F distribution

2_Probability distribution.pdf

3. Hypothesis testing 1 (84 slides)

• What is hypothesis testing?
• Standard procedures
• Case studies
  ◦ t test and z test
  ◦ One-tailed and two-tailed situations
  ◦ One-sample and two-sample hypothesis tests
  ◦ Paired test
• Type I and Type II errors

3_Hypothesis testing 1.pdf

4. Hypothesis testing 2 (79 slides)

• Examples
• Chi-square test
• Effect size
• Power of test
• Sample size
• Philosophy of hypothesis testing

4_Hypothesis testing 2.pdf

5. Analysis of variance (ANOVA) 1 (81 slides)

• Rationale of ANOVA
• Generic recipe of the general linear model
• One-way ANOVA
• Randomized block design
• Two-way ANOVA


6. Analysis of variance (ANOVA) 2 (79 slides)

• Three-way ANOVA
• Latin Square Design
• Hierarchical (nested) ANOVA
• Split Plot Design
• Repeated measures ANOVA
• Mixed effects models


7. Simple linear regression and correlation (89 slides)

• Rationale of simple linear regression
• Least squares
• Regression coefficient (slope) and intercept
• Significance of a regression
• Assumptions of regression analysis
• Applications of simple linear regression
• Rationale of simple linear correlation
• Coefficient of correlation
• Power and sample size in correlation

7_Simple linear regression and correlation.pdf

8. Analysis of covariance (ANCOVA) (84 slides)

• Rationale of analysis of covariance
• Assumptions of analysis of covariance
• Comparison with ANOVA and regression
• Mixed effect model for ANCOVA
• Case studies
• Coding conventions in R


9. Data transformation and nonparametric statistics (85 slides)

• Data transformation
  ◦ Logarithmic transformation
  ◦ Square root transformation
  ◦ Arcsine transformation
  ◦ Reciprocal transformation
  ◦ Square transformation
  ◦ Box-Cox transformation
• Rationale of nonparametric statistics
• Sign test
• Wilcoxon signed rank test
• Wilcoxon rank sum test
• Kruskal-Wallis test
• Friedman’s test
• Bootstrapping

9_Data transformation and nonparametric statistics.pdf

10. Multivariate analysis 1 (98 slides)

• Multiple regression
  ◦ Linear regression
  ◦ Non-linear regression
  ◦ Evaluating multiple regression models
• Multiple correlation
• Partial correlation
• Contribution, fraction, partial R²
• Canonical correlation analysis

10_Multivariate analysis 1.pdf

11. Multivariate analysis 2 (158 slides)

• Cluster analysis
• Discriminant analysis
• Principal component analysis (PCA)
• Factor analysis (FA)
• Correspondence analysis (CA)
• Redundancy analysis (RDA)
• Canonical correspondence analysis (CCA)
• Principal coordinate analysis (PCoA) or multidimensional scaling (MDS)
• Non-metric multidimensional scaling (NMDS)

11_Multivariate analysis 2.pdf

12. Generalized linear model (89 slides)

• Rationale of generalized linear model
• Logistic regression
  ◦ Assumptions
  ◦ Biological meaning of the coefficients
  ◦ Marginal effect of independent variables
  ◦ Goodness of fit
• Maximum likelihood estimation (MLE)
  ◦ MLE for mean of the normal distribution
  ◦ MLE for variance of the normal distribution
  ◦ A few likelihood functions
• Structure of generalized linear model
  ◦ Random component
  ◦ Systematic component
  ◦ Link function
• Poisson regression
• Negative binomial regression

12_Generalized linear models.pdf

13. Some advanced models (81 slides)

• Generalized linear model (GLM)
• Generalized additive model (GAM)
• Multivariate Adaptive Regression Splines (MARS)
• Mixture discriminant analysis (MDA)
• Classification and regression tree
• Generalized Boosting Models (GBM)
• Artificial neural networks (ANN)
• Random Forest
• Genetic Algorithm for Rule Set Production (GARP)
• Maximum entropy method (MaxEnt)
• Bayesian method
• Hierarchical modeling

13_Advanced models.pdf


Crawley, M. J. 2007. The R book. John Wiley & Sons Ltd.

Crawley, M. J. 2012. The R book. Second Edition. John Wiley & Sons Ltd.

Faraway, J. J. 2004. Linear models with R. CRC Press.

Quinn, G. P., and M. J. Keough. 2002. Experimental design and data analysis for biologists. Cambridge University Press.

Sokal, R. R., and F. J. Rohlf. 1995. Biometry. Third Edition. W. H. Freeman and Company, New York.

Zar, J. H. 1999. Biostatistical analysis. Pearson.

Zuur, A., E. N. Ieno, and E. Meesters. 2009a. A beginner's guide to R. Springer.

Zuur, A., E. N. Ieno, N. Walker, A. A. Saveliev, and G. M. Smith. 2009b. Mixed effects models and extensions in ecology with R. Springer.

Zuur, A. F., E. N. Ieno, and G. M. Smith. 2007. Analysing ecological data. Springer.

