Babyboom

Data Read and Separate

To read “txt” files, I use the R function read.table().

read.table('../datasets/babyboom.dat.txt',header = FALSE,
           dec = ',',na.strings = 'NA') -> babyboom_data

For dataset babyboom, variable descriptions are as follows:

V1: Time of birth recorded on the 24-hour clock
V2: Sex of the child (1 = girl, 2 = boy)
V3: Birth weight in grams
V4: Number of minutes after midnight of each birth

I use the function subset() to divide the babyboom data into two subsets: one contains the girls and the other contains the boys.

baby_girls = subset(babyboom_data,V2==1,select = c(V1,V2,V3,V4))
baby_boys = subset(babyboom_data,V2==2,select = c(V1,V2,V3,V4))

The table below shows the first six rows of baby_boys.

knitr::kable(
  head(baby_boys, 6)
)

	V1	V2	V3	V4
3	118	2	3554	78
4	155	2	3838	115
5	257	2	3625	177
8	422	2	2846	262
9	431	2	3166	271
10	708	2	3520	428

One-Sample Test for Normality

For all babies

Kolmogorov-Smirnov D Test

The K-S test requires the tested distribution to be continuous, which means there must be no duplicate values (ties) in the sample. Our sample (all babies) does not satisfy this condition, so I use the R function jitter() to add a small amount of noise to the sample.

ks.test(jitter(babyboom_data$V3),'pnorm',mean(babyboom_data$V3),sd(babyboom_data$V3))

## 
##  One-sample Kolmogorov-Smirnov test
## 
## data:  jitter(babyboom_data$V3)
## D = 0.18328, p-value = 0.09131
## alternative hypothesis: two-sided
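The effect of ties and of jitter() can be seen on a small simulated sample. The sketch below uses made-up weights rounded to the nearest 100 g (not the babyboom data); the names w and w_jit are only illustrative.

```r
# Simulated weights rounded to the nearest 100 g, so ties are practically
# certain (illustrative data, not the babyboom sample).
set.seed(1)
w <- round(rnorm(44, mean = 3300, sd = 500), -2)
has_ties <- anyDuplicated(w) > 0

# jitter() adds a small amount of uniform noise, which breaks the ties.
w_jit <- jitter(w)
ties_after <- anyDuplicated(w_jit) > 0

# The one-sample K-S test is then run on the jittered values.
res <- ks.test(w_jit, "pnorm", mean(w), sd(w))
print(c(has_ties = has_ties, ties_after = ties_after))
print(res$p.value)
```

Because jitter() is random, the test statistic changes slightly from run to run unless a seed is set.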

Lilliefors Test

library(nortest)
lillie.test(babyboom_data$V3)

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  babyboom_data$V3
## D = 0.18336, p-value = 0.0007395

Anderson–Darling Test

ad.test(babyboom_data$V3)

## 
##  Anderson-Darling normality test
## 
## data:  babyboom_data$V3
## A = 1.7168, p-value = 0.0001788

Cramer–von Mises Test

library(nortest)
cvm.test(babyboom_data$V3)

## 
##  Cramer-von Mises normality test
## 
## data:  babyboom_data$V3
## W = 0.31125, p-value = 0.0002256

Shapiro–Wilk Test

shapiro.test(babyboom_data$V3)

## 
##  Shapiro-Wilk normality test
## 
## data:  babyboom_data$V3
## W = 0.89872, p-value = 0.0009944

Shapiro–Francia Test

library(nortest)
sf.test(babyboom_data$V3)

## 
##  Shapiro-Francia normality test
## 
## data:  babyboom_data$V3
## W = 0.89701, p-value = 0.001519

Pearson chi-square test

library(nortest)
pearson.test(babyboom_data$V3)

## 
##  Pearson chi-square normality test
## 
## data:  babyboom_data$V3
## P = 20.091, p-value = 0.005377

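According to the results above, only the K-S D test gives a different result. The reason is that the mean and standard deviation passed to it are estimated from the sample instead of being known population values, which makes the plain K-S test conservative; the Lilliefors test corrects for this. I conclude that the weight of all babies is not normally distributed.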
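The gap between the plain K-S test and the Lilliefors test can be checked by simulation. The following is a minimal Monte Carlo sketch, assuming a sample size of 44 (26 boys plus 18 girls) and an arbitrary 2000 replications; it compares the 5% critical value of D when the parameters are estimated with the usual large-sample approximation for known parameters.

```r
# Monte Carlo sketch: distribution of the K-S statistic under normality when
# the mean and sd are estimated from each sample (the Lilliefors setting).
set.seed(42)
n <- 44        # sample size, matching the babyboom data
n_sim <- 2000  # number of simulated samples (arbitrary choice)

ks_stat_estimated <- function(x) {
  ks.test(x, "pnorm", mean(x), sd(x))$statistic
}

d_null <- replicate(n_sim, ks_stat_estimated(rnorm(n)))

# 5% critical value with estimated parameters ...
crit_estimated <- unname(quantile(d_null, 0.95))
# ... versus the approximate 5% critical value of the classical K-S test,
# which assumes the parameters are known in advance.
crit_classical <- 1.36 / sqrt(n)

print(c(estimated = crit_estimated, classical = crit_classical))
```

If the observed D of about 0.183 lies between the two critical values, that would account for the K-S test not rejecting while the Lilliefors test does.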

For Boys

shapiro.test(baby_boys$V3)

## 
##  Shapiro-Wilk normality test
## 
## data:  baby_boys$V3
## W = 0.94747, p-value = 0.2022

Since the p-value is larger than \(0.05\), we cannot reject the null hypothesis that the weights of baby boys are normally distributed.

For Girls

shapiro.test(baby_girls$V3)

## 
##  Shapiro-Wilk normality test
## 
## data:  baby_girls$V3
## W = 0.87028, p-value = 0.01798

Since the p-value is smaller than \(0.05\), we reject the null hypothesis that the weights of baby girls are normally distributed.

Test the hypothesis that the mean weight of girls is the same as the mean weight of boys.

One of our samples is not normally distributed, but the other one is. So I use two nonparametric tests, the Wilcoxon rank sum test and the K-S test, to compare the location and distribution of the two samples.

Wilcoxon Rank sum test

wilcox.test(baby_boys$V3,baby_girls$V3,alternative = "two.sided", exact = FALSE)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  baby_boys$V3 and baby_girls$V3
## W = 273.5, p-value = 0.3519
## alternative hypothesis: true location shift is not equal to 0

Since the p-value is larger than the significance level \(0.05\), we cannot reject the null hypothesis that the weights of baby boys and baby girls have the same location. The data are therefore consistent with the two samples coming from the same distribution and having equal mean weights.

Two Sample Kolmogorov-Smirnov Tests

ks.test(baby_boys$V3,baby_girls$V3,alternative = "two.sided", exact = FALSE)

## Warning in ks.test(baby_boys$V3, baby_girls$V3, alternative = "two.sided", : p-
## value will be approximate in the presence of ties

## 
##  Two-sample Kolmogorov-Smirnov test
## 
## data:  baby_boys$V3 and baby_girls$V3
## D = 0.23932, p-value = 0.5762
## alternative hypothesis: two-sided

Similarly to the Wilcoxon rank sum test, the two-sample Kolmogorov-Smirnov test gives the same conclusion.

Student's t-test

Finally, I also apply Student's t-test (by default, t.test() runs the Welch variant for unequal variances); the conclusion is the same.

t.test(baby_boys$V3,baby_girls$V3,alternative = "two.sided")

## 
##  Welch Two Sample t-test
## 
## data:  baby_boys$V3 and baby_girls$V3
## t = 1.4211, df = 27.631, p-value = 0.1665
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -107.4273  593.1538
## sample estimates:
## mean of x mean of y 
##  3375.308  3132.444

Test the hypothesis that the variance of the weight of girls is the same as that of boys.

The two samples are not both normally distributed, and they have different sizes. The F-test and Bartlett's test are sensitive to departures from normality, so they are not appropriate here; I use Levene's test to test the homogeneity of variances.

library(carData)
library(car)
leveneTest(babyboom_data$V3,babyboom_data$V2)

## Warning in leveneTest.default(babyboom_data$V3, babyboom_data$V2):
## babyboom_data$V2 coerced to factor.

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  1  1.8154 0.1851
##       42

P-value is larger than \(0.05\), so we cannot reject the null hypothesis that the variance of the weight of girls is the same as the weight of boys.
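Levene's test with center = median can be written in base R as a one-way ANOVA on the absolute deviations from each group's median. The sketch below is only an illustration of that idea on simulated data; levene_median is a hypothetical helper name, not a function from car.

```r
# Levene's test with center = median (Brown-Forsythe variant), written as a
# one-way ANOVA on absolute deviations from each group's median.
levene_median <- function(y, group) {
  group <- as.factor(group)
  abs_dev <- abs(y - ave(y, group, FUN = median))
  anova(lm(abs_dev ~ group))
}

# Illustrative simulated data (not the babyboom sample): two groups of
# different sizes with different spreads.
set.seed(7)
y <- c(rnorm(26, mean = 3400, sd = 400), rnorm(18, mean = 3100, sd = 600))
g <- rep(c("boy", "girl"), times = c(26, 18))

res <- levene_median(y, g)
print(res)
```

With 26 and 18 observations the table has 1 and 42 degrees of freedom, the same layout as the leveneTest output above.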

For comparison, I also ran the F-test on the same null hypothesis and reached the same conclusion.

var.test(baby_boys$V3,baby_girls$V3,alternative = "two.sided")

## 
##  F test to compare two variances
## 
## data:  baby_boys$V3 and baby_girls$V3
## F = 0.45933, num df = 25, denom df = 17, p-value = 0.07526
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.1802395 1.0839460
## sample estimates:
## ratio of variances 
##          0.4593257

One-Sample Tests for Exponentiality

One-sample Kolmogorov–Smirnov test

library(exptest)
ks.exp.test(babyboom_data$V4,nrepl = 2000)

## 
##  Kolmogorov-Smirnov test for exponentiality
## 
## data:  babyboom_data$V4
## KSn = 0.23477, p-value = 5e-04

ks.test(babyboom_data$V4,"pexp")

## 
##  One-sample Kolmogorov-Smirnov test
## 
## data:  babyboom_data$V4
## D = 0.99326, p-value < 2.2e-16
## alternative hypothesis: two-sided

#nrepl is the number of replications in Monte Carlo simulation.
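Note that ks.test(x, "pexp") compares the data with an exponential distribution of rate 1, because that is the default rate of pexp(). A minimal sketch on simulated waiting times (made-up data, not the babyboom sample) shows how much the result depends on passing a rate estimated from the sample.

```r
# Simulated waiting times in minutes (illustrative, not the babyboom data).
set.seed(3)
x <- rexp(44, rate = 1 / 30)   # true mean of 30 minutes

# Default rate = 1: compares x with an exponential of mean 1.
p_default <- ks.test(x, "pexp")$p.value

# Rate estimated from the sample: compares x with an exponential whose
# mean equals the sample mean.
p_fitted <- ks.test(x, "pexp", rate = 1 / mean(x))$p.value

print(c(default_rate = p_default, fitted_rate = p_fitted))
```

Since the rate is estimated from the same data, the second p-value is only approximate; Monte Carlo tests such as ks.exp.test are one way to address that.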

Cramer–von Mises test

library(exptest)
cvm.exp.test(babyboom_data$V4,nrepl = 2000)

## 
##  Cramer-von Mises test for exponentiality
## 
## data:  babyboom_data$V4
## Wn = 0.75668, p-value = 1

Atkinson test

library(exptest)
atkinson.exp.test(babyboom_data$V4,nrepl = 2000)

## 
##  Atkinson test for exponentiality
## 
## data:  babyboom_data$V4
## T = 0.016934, p-value = 0.001615

Lorenz test

library(exptest)
lorenz.exp.test(babyboom_data$V4,nrepl = 2000)

## 
##  Lorenz test for exponentiality
## 
## data:  babyboom_data$V4
## L = 0.27801, p-value = 0.4086

Shapiro-Wilk test for exponentiality:

library(exptest)
shapiro.exp.test(babyboom_data$V4,nrepl = 2000)

## 
##  Shapiro-Wilk test for exponentiality
## 
## data:  babyboom_data$V4
## W = 0.084434, p-value = 1

# n: from 3 to 100

Kimber-Michael test for exponentiality:

library(exptest)
kimber.exp.test(babyboom_data$V4,nrepl = 2000)

## 
##  Kimber-Michael test for exponentiality
## 
## data:  babyboom_data$V4
## D = 0.19583, p-value < 2.2e-16

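The different methods give different results. According to the p-values, three of the tests (Kolmogorov-Smirnov, Atkinson, Kimber-Michael) give values smaller than \(0.05\), but the others (Cramer-von Mises, Lorenz, Shapiro-Wilk) give values larger than \(0.05\). So I cannot be sure whether the sample follows an exponential distribution.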

Test the hypothesis that the number of births per hour follows a Poisson distribution

First, I build the frequency table of births per hour, then I apply a goodness-of-fit test to it.

hist(babyboom_data$V4,breaks = seq(0,1440,by=60),plot = FALSE)->b
b

## $breaks
##  [1]    0   60  120  180  240  300  360  420  480  540  600  660  720  780  840
## [16]  900  960 1020 1080 1140 1200 1260 1320 1380 1440
## 
## $counts
##  [1] 1 3 1 0 4 0 0 2 2 1 3 1 2 1 4 1 2 1 3 4 3 2 1 2
## 
## $density
##  [1] 0.0003787879 0.0011363636 0.0003787879 0.0000000000 0.0015151515
##  [6] 0.0000000000 0.0000000000 0.0007575758 0.0007575758 0.0003787879
## [11] 0.0011363636 0.0003787879 0.0007575758 0.0003787879 0.0015151515
## [16] 0.0003787879 0.0007575758 0.0003787879 0.0011363636 0.0015151515
## [21] 0.0011363636 0.0007575758 0.0003787879 0.0007575758
## 
## $mids
##  [1]   30   90  150  210  270  330  390  450  510  570  630  690  750  810  870
## [16]  930  990 1050 1110 1170 1230 1290 1350 1410
## 
## $xname
## [1] "babyboom_data$V4"
## 
## $equidist
## [1] TRUE
## 
## attr(,"class")
## [1] "histogram"

#build the frequency matrix
as.data.frame(table(b$counts))->fre_matrix
fre_matrix

##   Var1 Freq
## 1    0    3
## 2    1    8
## 3    2    6
## 4    3    4
## 5    4    3

library(grid)
library(vcd)
goodfit(fre_matrix$Freq,type = "poisson","MinChisq")->gf
#plot(gf,main="Count data vs Poisson distribution")
summary(gf)

## Warning in summary.goodfit(gf): Chi-squared approximation may be incorrect

## 
##   Goodness-of-fit test for poisson distribution
## 
##              X^2 df  P(> X^2)
## Pearson 4.555953  7 0.7139696

The p-value is larger than \(0.05\), so I cannot reject the null hypothesis that the number of births per hour follows a Poisson distribution.
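The same goodness-of-fit idea can be worked through by hand in base R. The sketch below starts from the 24 hourly counts printed above, estimates the Poisson rate by the sample mean (maximum likelihood, which is an assumption that differs from the MinChisq method used above), and pools 4 or more births into one class. It is only an illustration: goodfit() is documented as accepting a vector of counts, so passing b$counts rather than the frequency column may be closer to the intent.

```r
# Hourly birth counts, copied from the $counts output above.
births_per_hour <- c(1, 3, 1, 0, 4, 0, 0, 2, 2, 1, 3, 1,
                     2, 1, 4, 1, 2, 1, 3, 4, 3, 2, 1, 2)

lambda_hat <- mean(births_per_hour)   # ML estimate of the Poisson rate

# Observed number of hours with 0, 1, 2, 3 and 4 or more births.
observed <- as.vector(table(factor(pmin(births_per_hour, 4), levels = 0:4)))

# Expected numbers of hours under Poisson(lambda_hat); the last class
# collects the whole upper tail so the probabilities sum to 1.
probs <- c(dpois(0:3, lambda_hat), ppois(3, lambda_hat, lower.tail = FALSE))
expected <- length(births_per_hour) * probs

# Pearson statistic; df = classes - 1 - number of estimated parameters.
x2 <- sum((observed - expected)^2 / expected)
df <- length(observed) - 1 - 1
p_value <- pchisq(x2, df, lower.tail = FALSE)

print(data.frame(births = c(0:3, "4+"), observed, expected = round(expected, 2)))
print(c(X2 = x2, df = df, p.value = p_value))
```

Several expected counts are below 5, so the chi-squared approximation is rough here, in line with the warning printed by summary(gf).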

Euroweight

Read Data

To read “txt” files, I use the R function read.table().

read.table('../datasets/euroweight.dat.txt',header = FALSE,
           dec = '.',na.strings = 'NA') -> euroweight_data

For dataset euroweight, variable descriptions are as follows:

V1: ID - this is the case number
V2: weight - weight of the euro coin in grams
V3: batch - number of the package

One-Sample Tests for Normality

For whole sample

shapiro.test(euroweight_data$V2)

## 
##  Shapiro-Wilk normality test
## 
## data:  euroweight_data$V2
## W = 0.97547, p-value < 2.2e-16

The p-value for the whole sample is smaller than \(0.05\), so we reject the null hypothesis of normality.

For each group

I wrote a function that tests normality for each group of a sample. The results are shown below.

source('../shapiro.test.mulity.R')
shapiro.test.multi(euroweight_data,"V2","V3")

## Loading required package: magrittr

##             No        Group         W      p.value       norm.test
## 1            1            1 0.9955066 6.830017e-01            Norm
## 2            2            2 0.9909001 1.218770e-01            Norm
## 3            3            3 0.8634321 4.089445e-14 Other_situation
## 4            4            4 0.9955047 6.826586e-01            Norm
## 5            5            5 0.9910340 1.289928e-01            Norm
## 6            6            6 0.9840595 6.756499e-03 Other_situation
## 7            7            7 0.9907008 1.119834e-01            Norm
## 8            8            8 0.9367201 6.827698e-09 Other_situation
## 9 Test Method: Shapiro-Wilk        NA           NA            <NA>
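The source of shapiro.test.multi() is not shown here. A rough sketch of how such a per-group helper could look in base R is given below; shapiro_by_group and its arguments are hypothetical names, and the real function in shapiro.test.mulity.R may work differently.

```r
# Hypothetical stand-in for shapiro.test.multi(); the real function lives in
# shapiro.test.mulity.R and may differ.
shapiro_by_group <- function(data, value_col, group_col, alpha = 0.05) {
  groups <- split(data[[value_col]], data[[group_col]])
  tests <- lapply(groups, shapiro.test)
  p <- vapply(tests, function(t) t$p.value, numeric(1))
  data.frame(
    Group = names(groups),
    W = vapply(tests, function(t) unname(t$statistic), numeric(1)),
    p.value = p,
    norm.test = ifelse(p > alpha, "Norm", "Other_situation"),
    row.names = NULL
  )
}

# Illustrative data: one normal group and one clearly skewed group.
set.seed(11)
demo <- data.frame(
  V2 = c(rnorm(100, mean = 7.5, sd = 0.03), rexp(100, rate = 1)),
  V3 = rep(1:2, each = 100)
)
res <- shapiro_by_group(demo, "V2", "V3")
print(res)
```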

Test the hypothesis that the mean of the weight of coins is the same in different packages

As the table above shows, not all groups in the sample are normally distributed. So I use nonparametric tests: the Kruskal-Wallis test and pairwise.wilcox.test.

Kruskal-Wallis test

kruskal.test(euroweight_data$V2~euroweight_data$V3)

## 
##  Kruskal-Wallis rank sum test
## 
## data:  euroweight_data$V2 by euroweight_data$V3
## Kruskal-Wallis chi-squared = 97.5, df = 7, p-value < 2.2e-16

According to the result, we reject the null hypothesis that the weight has the same location (mean) in every package.

Pairwise Wilcox test

pairwise.wilcox.test(euroweight_data$V2,euroweight_data$V3)

## 
##  Pairwise comparisons using Wilcoxon rank sum test 
## 
## data:  euroweight_data$V2 and euroweight_data$V3 
## 
##   1       2       3       4       5       6       7      
## 2 1.00000 -       -       -       -       -       -      
## 3 0.04297 0.00025 -       -       -       -       -      
## 4 0.00141 0.10329 7.8e-12 -       -       -       -      
## 5 0.00108 0.10329 2.6e-12 1.00000 -       -       -      
## 6 0.76768 0.04297 1.00000 2.9e-07 1.7e-07 -       -      
## 7 1.00000 1.00000 0.00012 0.10329 0.10202 0.04297 -      
## 8 1.00000 0.10329 0.73578 1.4e-06 7.1e-07 1.00000 0.10202
## 
## P value adjustment method: holm

Pairwise T test

pairwise.t.test(euroweight_data$V2,euroweight_data$V3)

## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  euroweight_data$V2 and euroweight_data$V3 
## 
##   1       2       3       4       5       6       7      
## 2 1.00000 -       -       -       -       -       -      
## 3 0.01455 0.00014 -       -       -       -       -      
## 4 0.00285 0.11938 3.2e-11 -       -       -       -      
## 5 0.00203 0.10225 1.7e-11 1.00000 -       -       -      
## 6 1.00000 0.11938 0.47138 3.9e-06 2.4e-06 -       -      
## 7 1.00000 1.00000 0.00017 0.11019 0.09317 0.11942 -      
## 8 1.00000 0.32960 0.18828 4.6e-05 3.0e-05 1.00000 0.33590
## 
## P value adjustment method: holm

As we can see, several groups in the sample are not normally distributed. Comparing these results with the pairwise Wilcoxon test, the differences appear in the pairs that involve groups 3, 6 and 8.
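Both pairwise tables report Holm-adjusted p-values, which is why values such as 1.00000 and 0.10329 repeat. The adjustment itself can be reproduced by hand; the sketch below uses three made-up p-values and compares the manual result with p.adjust().

```r
# Three illustrative raw p-values (made up for the example).
p_raw <- c(0.01, 0.04, 0.03)

# Holm: sort ascending, multiply the i-th smallest by (m - i + 1),
# then enforce monotonicity with a running maximum and cap at 1.
m <- length(p_raw)
ord <- order(p_raw)
p_sorted_adj <- pmin(1, cummax((m - seq_len(m) + 1) * p_raw[ord]))
p_manual <- numeric(m)
p_manual[ord] <- p_sorted_adj

p_builtin <- p.adjust(p_raw, method = "holm")
print(rbind(raw = p_raw, manual = p_manual, builtin = p_builtin))
# adjusted values: 0.03 0.06 0.06
```

The running maximum is what makes neighbouring adjusted p-values coincide.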

Iris

Read Data

To read “txt” files, I use the R function read.table().

read.table('../datasets/iris.txt',header = FALSE,
           dec = '.',na.strings = 'NA',sep = ",") -> iris_data

For dataset iris, variable descriptions are as follows:

V1: sepal length in cm
V2: sepal width in cm
V3: petal length in cm
V4: petal width in cm
V5: class

Test the normality of length of flowers grouping them by the type of iris

Similar to the previous example, I use the function shapiro.test.multi for the normality test.

source('../shapiro.test.mulity.R')
shapiro.test.multi(iris_data,"V1","V5")

##             No           Group         W   p.value norm.test
## 1            1     Iris-setosa 0.9776985 0.4595132      Norm
## 2            2 Iris-versicolor 0.9778357 0.4647370      Norm
## 3            3  Iris-virginica 0.9711794 0.2583147      Norm
## 4 Test Method:    Shapiro-Wilk        NA        NA      <NA>

As shown in the table above, the sepal length in each group is consistent with a normal distribution.

Test the hypotheses about similarity of distributions of characteristics of flowers of different types

For sepal length

pairwise.wilcox.test(iris_data$V1,iris_data$V5)

## 
##  Pairwise comparisons using Wilcoxon rank sum test 
## 
## data:  iris_data$V1 and iris_data$V5 
## 
##                 Iris-setosa Iris-versicolor
## Iris-versicolor 1.7e-13     -              
## Iris-virginica  < 2e-16     5.9e-07        
## 
## P value adjustment method: holm

The results above show that no pair of samples comes from the same distribution.

For petal length

pairwise.wilcox.test(iris_data$V3,iris_data$V5)

## 
##  Pairwise comparisons using Wilcoxon rank sum test 
## 
## data:  iris_data$V3 and iris_data$V5 
## 
##                 Iris-setosa Iris-versicolor
## Iris-versicolor <2e-16      -              
## Iris-virginica  <2e-16      <2e-16         
## 
## P value adjustment method: holm

The result is the same as for sepal length.

Test the hypotheses that the means and variances of the characteristics of flowers of different types are equal

For sepal length

Means

pairwise.t.test(iris_data$V1,iris_data$V5)

## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  iris_data$V1 and iris_data$V5 
## 
##                 Iris-setosa Iris-versicolor
## Iris-versicolor 1.8e-15     -              
## Iris-virginica  < 2e-16     2.8e-09        
## 
## P value adjustment method: holm

The result above shows that the means of sepal length of flowers in different groups are not equal.

Variances

bartlett.test(iris_data$V1,iris_data$V5)

## 
##  Bartlett test of homogeneity of variances
## 
## data:  iris_data$V1 and iris_data$V5
## Bartlett's K-squared = 16.006, df = 2, p-value = 0.0003345

The result shows that the variances of sepal length of flowers in different groups are not homogeneous.

For sepal width

Means

pairwise.t.test(iris_data$V2,iris_data$V5)

## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  iris_data$V2 and iris_data$V5 
## 
##                 Iris-setosa Iris-versicolor
## Iris-versicolor < 2e-16     -              
## Iris-virginica  2.1e-09     0.0032         
## 
## P value adjustment method: holm

The result above shows that the means of sepal width of flowers in different groups are not equal.

Variances

bartlett.test(iris_data$V2,iris_data$V5)

## 
##  Bartlett test of homogeneity of variances
## 
## data:  iris_data$V2 and iris_data$V5
## Bartlett's K-squared = 2.2158, df = 2, p-value = 0.3302

The result shows no evidence against homogeneity of the variances of sepal width of flowers in different groups.

For petal length

Means

pairwise.t.test(iris_data$V3,iris_data$V5)

## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  iris_data$V3 and iris_data$V5 
## 
##                 Iris-setosa Iris-versicolor
## Iris-versicolor <2e-16      -              
## Iris-virginica  <2e-16      <2e-16         
## 
## P value adjustment method: holm

The result above shows that the means of petal length of flowers in different groups are not equal.

Variances

bartlett.test(iris_data$V3,iris_data$V5)

## 
##  Bartlett test of homogeneity of variances
## 
## data:  iris_data$V3 and iris_data$V5
## Bartlett's K-squared = 55.494, df = 2, p-value = 8.905e-13

The result shows that the variances of petal length of flowers in different groups are not homogeneous.

For petal width

Means

pairwise.t.test(iris_data$V4,iris_data$V5)

## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  iris_data$V4 and iris_data$V5 
## 
##                 Iris-setosa Iris-versicolor
## Iris-versicolor <2e-16      -              
## Iris-virginica  <2e-16      <2e-16         
## 
## P value adjustment method: holm

The result above shows that the means of petal width of flowers in different groups are not equal.

Variances

bartlett.test(iris_data$V4,iris_data$V5)

## 
##  Bartlett test of homogeneity of variances
## 
## data:  iris_data$V4 and iris_data$V5
## Bartlett's K-squared = 37.996, df = 2, p-value = 5.615e-09

The result shows that the variances of petal width of flowers in different groups are not homogeneous.
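The four pairs of tests above follow the same pattern, so they can be run in a loop. The sketch below is an assumption-laden illustration: it uses R's built-in iris data frame as a stand-in for iris_data, with named columns instead of V1 to V5, and the built-in data may differ slightly from the text file read above.

```r
# R's built-in iris data frame is used here as a stand-in for iris_data;
# it names the columns instead of using V1..V5.
measurements <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Bartlett test of equal variances across species, one p-value per column.
bartlett_p <- vapply(measurements, function(col) {
  bartlett.test(iris[[col]], iris$Species)$p.value
}, numeric(1))

# Largest Holm-adjusted pairwise t-test p-value per column: if even the
# largest is below 0.05, every pair of species differs in mean.
pairwise_max_p <- vapply(measurements, function(col) {
  max(pairwise.t.test(iris[[col]], iris$Species)$p.value, na.rm = TRUE)
}, numeric(1))

print(data.frame(bartlett_p, pairwise_max_p))
```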

Height

Read Data

To read “xlsx” files, I use the R function read_excel() from the package readxl.

library(readxl)
read_excel('../datasets/height.xlsx',sheet = 1) -> height_data

For dataset height, variable descriptions are as follows:

HtFt: height of football players
HtBk: height of basketball players

Test the normality of heights of football and basketball players

For football players

shapiro.test(height_data$HtFt)

## 
##  Shapiro-Wilk normality test
## 
## data:  height_data$HtFt
## W = 0.93655, p-value = 0.01609

Since the p-value is smaller than \(0.05\), we reject the null hypothesis that the heights of football players are normally distributed.

For basketball players

shapiro.test(height_data$HtBk)

## 
##  Shapiro-Wilk normality test
## 
## data:  height_data$HtBk
## W = 0.96839, p-value = 0.3197

Since the p-value is larger than \(0.05\), we cannot reject the null hypothesis that the heights of basketball players are normally distributed.

Test the equality of means and variances of the heights of football and basketball players.

Means

wilcox.test(jitter(height_data$HtFt),jitter(height_data$HtBk))

## 
##  Wilcoxon rank sum test
## 
## data:  jitter(height_data$HtFt) and jitter(height_data$HtBk)
## W = 531, p-value = 0.0009971
## alternative hypothesis: true location shift is not equal to 0

ks.test(jitter(height_data$HtFt),jitter(height_data$HtBk))

## 
##  Two-sample Kolmogorov-Smirnov test
## 
## data:  jitter(height_data$HtFt) and jitter(height_data$HtBk)
## D = 0.33889, p-value = 0.01137
## alternative hypothesis: two-sided

The results show that the means of the heights of football and basketball players are not equal.

Variances

library(carData)
library(car)
leveneTest(height_data$HtFt,height_data$HtBk)

## Warning in leveneTest.default(height_data$HtFt, height_data$HtBk):
## height_data$HtBk coerced to factor.

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group 14  0.5718 0.8617
##       25

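The p-value is larger than \(0.05\), so we cannot reject the null hypothesis that the variances of the heights of football and basketball players are equal. Note, however, that this call treats HtBk as the grouping factor (hence the 14 group degrees of freedom) rather than comparing the two columns, so the result should be read with caution.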
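A two-group comparison of variances needs the heights in one column and the sport in another. The sketch below shows one way to reshape the data with stack(); the numbers are made up (the contents of height.xlsx are not reproduced here), and heights_wide and heights_long are illustrative names.

```r
# Illustrative heights (made-up numbers, not the contents of height.xlsx).
set.seed(5)
heights_wide <- data.frame(
  HtFt = round(rnorm(40, mean = 72, sd = 3)),
  HtBk = round(rnorm(40, mean = 76, sd = 3))
)

# Stack the two columns: one column of values, one factor naming the sport.
heights_long <- stack(heights_wide)
names(heights_long) <- c("height", "sport")

# With this layout the grouping factor has two levels, so the test has
# 1 group degree of freedom. The call only runs if car is installed.
if (requireNamespace("car", quietly = TRUE)) {
  print(car::leveneTest(height ~ sport, data = heights_long))
}
```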

Test if the distributions of the heights of football and basketball players are the same.

wilcox.test(jitter(height_data$HtFt),jitter(height_data$HtBk))

## 
##  Wilcoxon rank sum test
## 
## data:  jitter(height_data$HtFt) and jitter(height_data$HtBk)
## W = 512, p-value = 0.0005228
## alternative hypothesis: true location shift is not equal to 0

ks.test(jitter(height_data$HtFt),jitter(height_data$HtBk))

## 
##  Two-sample Kolmogorov-Smirnov test
## 
## data:  jitter(height_data$HtFt) and jitter(height_data$HtBk)
## D = 0.35, p-value = 0.008027
## alternative hypothesis: two-sided

According to the results, we reject the null hypothesis that the distributions of the heights of football and basketball players are the same.
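The same two calls give slightly different statistics in this section and the previous one (for example W = 531 and W = 512) because jitter() adds fresh random noise on every call. A minimal sketch, using made-up tied heights and a hypothetical helper run_test, shows that fixing the seed makes the result reproducible.

```r
# Illustrative tied heights (made-up numbers, not the contents of height.xlsx).
ft <- c(70, 71, 71, 72, 73, 73, 74, 75)
bk <- c(73, 74, 75, 75, 76, 77, 78, 78)

run_test <- function(seed) {
  set.seed(seed)                       # fix the noise that jitter() adds
  wilcox.test(jitter(ft), jitter(bk))$statistic
}

same_seed <- c(run_test(2019), run_test(2019))   # identical by construction
print(same_seed)
```

An alternative that avoids random noise altogether is to keep the raw data and pass exact = FALSE, as in the babyboom section.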

Surgery

Read Data

To read “xlsx” files, I use the R function read_excel() from the package readxl.

library(readxl)
read_excel('../datasets/surgery.xlsx',sheet = 1,col_names = TRUE) -> surgery_data
na.omit(surgery_data)->surgery_data_omitNA

Binomial Test

temp = surgery_data_omitNA$`B V right`<surgery_data_omitNA$`A V right` & 
  surgery_data_omitNA$`B V left`<surgery_data_omitNA$`A V left`
as.data.frame(table(temp))

##    temp Freq
## 1 FALSE   18
## 2  TRUE   69

binom.test(69,87,0.7)

## 
##  Exact binomial test
## 
## data:  69 and 87
## number of successes = 69, number of trials = 87, p-value = 0.06129
## alternative hypothesis: true probability of success is not equal to 0.7
## 95 percent confidence interval:
##  0.6928684 0.8725251
## sample estimates:
## probability of success 
##              0.7931034

P-value is larger than \(0.05\), so we cannot reject the null hypothesis that the operation is successful with probability 0.7.
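The two-sided p-value of the exact binomial test can be reproduced from the binomial probabilities directly. The sketch below sums the probabilities of all outcomes that are no more likely than the observed 69 successes out of 87 under a success probability of 0.7, and compares the result with binom.test().

```r
n <- 87; k <- 69; p0 <- 0.7

# Probability of every possible number of successes under H0.
probs <- dbinom(0:n, size = n, prob = p0)

# Two-sided p-value: total probability of outcomes no more likely than the
# observed one (a small relative tolerance guards against rounding).
p_manual <- sum(probs[probs <= dbinom(k, n, p0) * (1 + 1e-7)])

p_builtin <- binom.test(k, n, p0)$p.value
print(c(manual = p_manual, builtin = p_builtin))
```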

Acknowledgements

Thanks to knitr, designed by Xie (2015).

References

Xie, Yihui. 2015. Dynamic Documents with R and Knitr. 2nd ed. Boca Raton, Florida: Chapman and Hall/CRC. http://yihui.name/knitr/.

Lecture 2 - Tests - Zhao Chi - 19.M09

Zhao Chi
