Training Course on Capacity Building of NARS Scientists in Advance Analytical Techniques

Blogs

25 November 2019

Introduction
Regression Analysis
Simple Linear Regression
- Example
- Example
- Example
Multiple Linear Regression
- Example
Polynomial Regression Analysis
- Example
Analysis of Variance (ANOVA)
- Example
- Example
Analysis of Covariance (ANCOVA)
Correlation Analysis
Simple Correlation Analysis
- Example
Partial Correlation Analysis
- Example
Multiple Correlation Analysis
- Example
Completely Randomized Design (CRD)
- Example
Randomized Complete Block Design (RCBD)
- Example
Latin Square Design
- Example
Factorial Experiment under RCBD
Stability Analysis

Introduction

R is a free, open-source programming language and software environment for statistical computing, bioinformatics, visualization and general computing. R provides a wide variety of statistical and graphical techniques, and is highly extensible. The latest version of R can be obtained from https://cran.r-project.org/bin/windows/base/.

R R for Windows (32/64 bit)

RStudio RStudio

Manual An Introduction to R

Ref Card R Reference Card

New York Times New York Times

Nature Article Nature Article

Regression Analysis

Quantifying dependency of a normal response on quantitative explanatory variable(s)

alt text

Simple Linear Regression

Quantifying dependency of a normal response on a quantitative explanatory variable

Example

Fertilizer (Kg/acre)	Production (Bushels/acre)
100	70
200	70
400	80
500	100

# Fertilizer (Kg/acre)
# Production (Bushels/acre)
Fertilizer <- c(100, 200, 400, 500)
Production <- c( 70,  70,  80, 100)
df1        <- data.frame(Fertilizer, Production)
df1

  Fertilizer Production
1        100         70
2        200         70
3        400         80
4        500        100

library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p1 <- 
  ggplot(data = df1, mapping = aes(x = Fertilizer, y = Production)) + 
  geom_point() +
  labs(x = "Fertilizer", y = "Production") +
  theme_bw()
print(p1)

fm1 <- lm(formula = Production ~ Fertilizer, data = df1)
summary(fm1)


Call:
lm(formula = Production ~ Fertilizer, data = df1)

Residuals:
 1  2  3  4 
 4 -3 -7  6 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept) 59.00000    7.95299   7.419   0.0177 *
Fertilizer   0.07000    0.02345   2.985   0.0963 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 7.416 on 2 degrees of freedom
Multiple R-squared:  0.8167,    Adjusted R-squared:  0.725 
F-statistic: 8.909 on 1 and 2 DF,  p-value: 0.0963

anova(fm1)

Analysis of Variance Table

Response: Production
           Df Sum Sq Mean Sq F value Pr(>F)  
Fertilizer  1    490     490  8.9091 0.0963 .
Residuals   2    110      55                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Example

Fertilizer	Yield
0.3	10
0.6	15
0.9	30
1.2	35
1.5	25
1.8	30
2.1	50
2.4	45

Fert  <- c(0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1, 2.4)
Yield <- c(10, 15, 30, 35, 25, 30, 50, 45)
df2   <- data.frame(Fert, Yield)
df2

  Fert Yield
1  0.3    10
2  0.6    15
3  0.9    30
4  1.2    35
5  1.5    25
6  1.8    30
7  2.1    50
8  2.4    45

library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p2 <- 
  ggplot(data = df2, mapping = aes(x = Fert, y = Yield)) + 
  geom_point() +
  labs(x = "Fertilizer", y = "Yield") +
  theme_bw()
print(p2)

fm2 <- lm(formula = Yield ~ Fert, data = df2)
summary(fm2)


Call:
lm(formula = Yield ~ Fert, data = df2)

Residuals:
   Min     1Q Median     3Q    Max 
-7.441 -4.018 -2.441  7.351  7.798 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)    8.036      5.504   1.460   0.1946   
Fert          16.270      3.633   4.478   0.0042 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 7.064 on 6 degrees of freedom
Multiple R-squared:  0.7697,    Adjusted R-squared:  0.7313 
F-statistic: 20.05 on 1 and 6 DF,  p-value: 0.004202

anova(fm2)

Analysis of Variance Table

Response: Yield
          Df Sum Sq Mean Sq F value   Pr(>F)   
Fert       1 1000.6  1000.6  20.052 0.004202 **
Residuals  6  299.4    49.9                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Example

Weekly Income ($)	Weekly Expenditures ($)
80	70
100	65
120	90
140	95
160	110
180	115
200	120
220	140
240	155
260	150

# Weekly Income ($) of a Family
# Weekly Expenditures ($) of a Family
Income       <- seq(from = 80, to = 260, by = 20)
Expenditures <- c(70, 65, 90, 95, 110, 115, 120, 140, 155, 150)
df3          <- data.frame(Income, Expenditures)
df3

   Income Expenditures
1      80           70
2     100           65
3     120           90
4     140           95
5     160          110
6     180          115
7     200          120
8     220          140
9     240          155
10    260          150

library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p3 <- 
  ggplot(data = df3, mapping = aes(x = Income, y = Expenditures)) + 
  geom_point() +
  scale_x_continuous(labels = scales::dollar) +
  scale_y_continuous(labels = scales::dollar) +
  labs(x = "Weekly income, $", y = "Weekly consumption expenditures, $") +
  theme_bw()
print(p3)

fm3 <- lm(formula = Expenditures ~ Income, data = df3)
summary(fm3)


Call:
lm(formula = Expenditures ~ Income, data = df3)

Residuals:
    Min      1Q  Median      3Q     Max 
-10.364  -4.977   1.409   4.364   8.364 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 24.45455    6.41382   3.813  0.00514 ** 
Income       0.50909    0.03574  14.243 5.75e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.493 on 8 degrees of freedom
Multiple R-squared:  0.9621,    Adjusted R-squared:  0.9573 
F-statistic: 202.9 on 1 and 8 DF,  p-value: 5.753e-07

anova(fm3)

Analysis of Variance Table

Response: Expenditures
          Df Sum Sq Mean Sq F value    Pr(>F)    
Income     1 8552.7  8552.7  202.87 5.753e-07 ***
Residuals  8  337.3    42.2                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multiple Linear Regression

Quantifying dependency of a normal response on two or more quantitative explanatory variables

Example

Fertilizer (Kg)	Rainfall (mm)	Yield (Kg)
100	10	40
200	20	50
300	10	50
400	30	70
500	20	65
600	20	65
700	30	80

# Fertilizer (Kg)
# Rainfall (mm)
# Yield (Kg)
Fertilizer <- seq(from = 100, to = 700, by = 100)
Rainfall   <- c(10, 20, 10, 30, 20, 20, 30)
Yield      <- c(40, 50, 50, 70, 65, 65, 80)
df4        <- data.frame(Fertilizer, Rainfall, Yield)
df4

  Fertilizer Rainfall Yield
1        100       10    40
2        200       20    50
3        300       10    50
4        400       30    70
5        500       20    65
6        600       20    65
7        700       30    80

library(car)
# if (!require("car")) install.packages("car")
car::scatter3d(formula = Yield ~ Fertilizer + Rainfall, data = df4)

fm4 <- lm(formula = Yield ~ Fertilizer + Rainfall, data = df4)
summary(fm4)


Call:
lm(formula = Yield ~ Fertilizer + Rainfall, data = df4)

Residuals:
      1       2       3       4       5       6       7 
-0.2381 -2.3810  2.1429  1.6667  1.1905 -2.6190  0.2381 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 28.095238   2.491482  11.277 0.000352 ***
Fertilizer   0.038095   0.005832   6.532 0.002838 ** 
Rainfall     0.833333   0.154303   5.401 0.005690 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.315 on 4 degrees of freedom
Multiple R-squared:  0.9814,    Adjusted R-squared:  0.972 
F-statistic: 105.3 on 2 and 4 DF,  p-value: 0.0003472

anova(fm4)

Analysis of Variance Table

Response: Yield
           Df Sum Sq Mean Sq F value    Pr(>F)    
Fertilizer  1 972.32  972.32 181.500 0.0001756 ***
Rainfall    1 156.25  156.25  29.167 0.0056899 ** 
Residuals   4  21.43    5.36                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

library(lm.beta)
# if (!require("lm.beta")) install.packages("lm.beta")
lm.beta(fm4)


Call:
lm(formula = Yield ~ Fertilizer + Rainfall, data = df4)

Standardized Coefficients::
(Intercept)  Fertilizer    Rainfall 
  0.0000000   0.5944301   0.4914732

Polynomial Regression Analysis

Quantifying non-linear dependency of a normal response on quantitative explanatory variable(s)

Example

An experiment was conducted to evaluate the effects of different levels of nitrogen. Three levels of nitrogen: 0, 10 and 20 grams per plot were used in the experiment. Each treatment was replicated twice and data is given below:

Nitrogen	Yield
0	5
0	7
10	15
10	17
20	9
20	11

Nitrogen <- c(0, 0, 10, 10, 20, 20)
Yield    <- c(5, 7, 15, 17,  9, 11)
df5      <- data.frame(Nitrogen, Yield)
df5

  Nitrogen Yield
1        0     5
2        0     7
3       10    15
4       10    17
5       20     9
6       20    11

library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p5 <- 
  ggplot(data = df5, mapping = aes(x = Nitrogen, y = Yield)) + 
  geom_point() +
  labs(x = "Nitrogen", y = "Yield") +
  theme_bw()
print(p5)

fm5 <- lm(formula = Yield ~ Nitrogen + I(Nitrogen^2), data = df5)
# fm5 <- lm(formula = Yield ~ poly(x = Nitrogen, degree = 2, raw = TRUE), data = df5)
summary(fm5)


Call:
lm(formula = Yield ~ Nitrogen + I(Nitrogen^2), data = df5)

Residuals:
 1  2  3  4  5  6 
-1  1 -1  1 -1  1 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)   
(Intercept)    6.00000    1.00000   6.000  0.00927 **
Nitrogen       1.80000    0.25495   7.060  0.00584 **
I(Nitrogen^2) -0.08000    0.01225  -6.532  0.00729 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.414 on 3 degrees of freedom
Multiple R-squared:  0.9441,    Adjusted R-squared:  0.9068 
F-statistic: 25.33 on 2 and 3 DF,  p-value: 0.01322

anova(fm5)

Analysis of Variance Table

Response: Yield
              Df Sum Sq Mean Sq F value   Pr(>F)   
Nitrogen       1 16.000  16.000   8.000 0.066276 . 
I(Nitrogen^2)  1 85.333  85.333  42.667 0.007292 **
Residuals      3  6.000   2.000                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Analysis of Variance (ANOVA)

Comparing means of Normal dependent variable for levels of different factor(s)

alt text

Example

Yield	Variety
5	V1
6	V1
7	V1
15	V2
16	V2
17	V2

Yield   <- c(5, 6, 7, 15, 16, 17)
Variety <- gl(n = 2, k = 3, length = 2*3, labels = c("V1", "V2"))
df6     <- data.frame(Yield, Variety)
df6

  Yield Variety
1     5      V1
2     6      V1
3     7      V1
4    15      V2
5    16      V2
6    17      V2

library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p6 <- 
  ggplot(data = df6, mapping = aes(x = Variety, y = Yield)) + 
  geom_point() +
  geom_boxplot() +
  labs(x = "Variety", y = "Yield") +
  theme_bw()
print(p6)

fm6 <- lm(formula = Yield ~ Variety, data = df6)
summary(fm6)


Call:
lm(formula = Yield ~ Variety, data = df6)

Residuals:
         1          2          3          4          5          6 
-1.000e+00 -4.774e-15  1.000e+00 -1.000e+00 -1.110e-16  1.000e+00 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   6.0000     0.5774   10.39 0.000484 ***
VarietyV2    10.0000     0.8165   12.25 0.000255 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1 on 4 degrees of freedom
Multiple R-squared:  0.974, Adjusted R-squared:  0.9675 
F-statistic:   150 on 1 and 4 DF,  p-value: 0.0002552

anova(fm6)

Analysis of Variance Table

Response: Yield
          Df Sum Sq Mean Sq F value    Pr(>F)    
Variety    1    150     150     150 0.0002552 ***
Residuals  4      4       1                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Example

Yield	Variety
5	V1
7	V1
15	V2
17	V2
17	V3
19	V3

Yield   <- c(5, 7, 15, 17, 17, 19)
Variety <- gl(n = 3, k = 2, length = 3*2, labels = c("V1", "V2", "V3"))
df7     <- data.frame(Yield, Variety)
df7

  Yield Variety
1     5      V1
2     7      V1
3    15      V2
4    17      V2
5    17      V3
6    19      V3

library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p7 <- 
  ggplot(data = df7, mapping = aes(x = Variety, y = Yield)) + 
  geom_point() +
  geom_boxplot() +
  labs(x = "Variety", y = "Yield") +
  theme_bw()
print(p7)

fm7 <- lm(formula = Yield ~ Variety, data = df7)
summary(fm7)


Call:
lm(formula = Yield ~ Variety, data = df7)

Residuals:
 1  2  3  4  5  6 
-1  1 -1  1 -1  1 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)    6.000      1.000   6.000  0.00927 **
VarietyV2     10.000      1.414   7.071  0.00582 **
VarietyV3     12.000      1.414   8.485  0.00344 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.414 on 3 degrees of freedom
Multiple R-squared:  0.965, Adjusted R-squared:  0.9416 
F-statistic: 41.33 on 2 and 3 DF,  p-value: 0.006553

anova(fm7)

Analysis of Variance Table

Response: Yield
          Df Sum Sq Mean Sq F value   Pr(>F)   
Variety    2 165.33  82.667  41.333 0.006553 **
Residuals  3   6.00   2.000                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Analysis of Covariance (ANCOVA)

Quantifying dependency of a normal response on quantitative explanatory variable(s)

Comparing means of Normal dependent variable for levels of different factor(s)

alt text

Example

Yield	Fert	Variety
51	80	V1
52	80	V1
53	90	V1
54	90	V1
56	100	V1
57	100	V1
55	80	V2
56	80	V2
58	90	V2
59	90	V2
62	100	V2
63	100	V2

Yield    <- c(51, 52, 53, 54, 56, 57, 55, 56, 58, 59, 62, 63)
Fert     <- rep(x = c(80, 90, 100), each = 2)
Variety  <- gl(n = 2, k = 6, length = 2*6, labels = c("V1", "V2"), ordered = FALSE)
df8         <- data.frame(Yield, Fert, Variety)
df8

   Yield Fert Variety
1     51   80      V1
2     52   80      V1
3     53   90      V1
4     54   90      V1
5     56  100      V1
6     57  100      V1
7     55   80      V2
8     56   80      V2
9     58   90      V2
10    59   90      V2
11    62  100      V2
12    63  100      V2

Same intercepts but different slopes

fm8 <- lm(formula = Yield ~ Fert + Variety, data = df8)
summary(fm8)


Call:
lm(formula = Yield ~ Fert + Variety, data = df8)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.8333 -0.8333  0.1667  0.1667  1.1667 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 26.83333    2.54557   10.54 2.30e-06 ***
Fert         0.30000    0.02805   10.69 2.04e-06 ***
VarietyV2    5.00000    0.45812   10.91 1.72e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.7935 on 9 degrees of freedom
Multiple R-squared:  0.9629,    Adjusted R-squared:  0.9546 
F-statistic: 116.7 on 2 and 9 DF,  p-value: 3.657e-07

anova(fm8)

Analysis of Variance Table

Response: Yield
          Df Sum Sq Mean Sq F value    Pr(>F)    
Fert       1 72.000   72.00  114.35 2.042e-06 ***
Variety    1 75.000   75.00  119.12 1.720e-06 ***
Residuals  9  5.667    0.63                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p8 <- 
  ggplot(data = df8, mapping = aes(x = Fert, y = Yield, color = Variety)) + 
  geom_point() +
  geom_line(mapping = aes(y = predict(fm8))) +
  labs(x = "Fert", y = "Yield") +
  theme_bw()
print(p8)

Different intercepts and different slopes

fm9 <- lm(formula = Yield ~ Fert * Variety, data = df8)
summary(fm9)


Call:
lm(formula = Yield ~ Fert * Variety, data = df8)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.83333 -0.33333 -0.08333  0.66667  0.66667 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)    31.33333    3.05903  10.243 7.09e-06 ***
Fert            0.25000    0.03385   7.385 7.73e-05 ***
VarietyV2      -4.00000    4.32612  -0.925   0.3822    
Fert:VarietyV2  0.10000    0.04787   2.089   0.0701 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.677 on 8 degrees of freedom
Multiple R-squared:  0.976, Adjusted R-squared:  0.967 
F-statistic: 108.4 on 3 and 8 DF,  p-value: 8.11e-07

anova(fm9)

Analysis of Variance Table

Response: Yield
             Df Sum Sq Mean Sq  F value    Pr(>F)    
Fert          1 72.000  72.000 157.0909 1.538e-06 ***
Variety       1 75.000  75.000 163.6364 1.315e-06 ***
Fert:Variety  1  2.000   2.000   4.3636   0.07013 .  
Residuals     8  3.667   0.458                       
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p9 <- 
  ggplot(data = df8, mapping = aes(x = Fert, y = Yield, color = Variety)) + 
  geom_point() +
  geom_line(mapping = aes(y = predict(fm9))) +
  labs(x = "Fert", y = "Yield") +
  theme_bw()
print(p9)

Correlation Analysis

Linear Relationship between Quantitative Variables

Simple Correlation Analysis

Linear Relationship between Two Quantitative Variables

$\left(X_{1},X_{2}\right)$

Example

Sparrow Wing length (cm)	Sparrow Tail length (cm)
10.4	7.4
10.8	7.6
11.1	7.9
10.2	7.2
10.3	7.4
10.2	7.1
10.7	7.4
10.5	7.2
10.8	7.8
11.2	7.7
10.6	7.8
11.4	8.3

WingLength  <- c(10.4, 10.8, 11.1, 10.2, 10.3, 10.2, 10.7, 10.5, 10.8, 11.2, 10.6, 11.4)
TailLength  <- c(7.4, 7.6, 7.9, 7.2, 7.4, 7.1, 7.4, 7.2, 7.8, 7.7, 7.8, 8.3)
df10        <- data.frame(WingLength, TailLength)
df10

   WingLength TailLength
1        10.4        7.4
2        10.8        7.6
3        11.1        7.9
4        10.2        7.2
5        10.3        7.4
6        10.2        7.1
7        10.7        7.4
8        10.5        7.2
9        10.8        7.8
10       11.2        7.7
11       10.6        7.8
12       11.4        8.3

library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p10 <- 
  ggplot(data = df10, mapping = aes(x= WingLength, y = TailLength)) + 
  geom_point() +
  labs(x = "Sparrow Wing Length (cm)", y = "Sparrow Tail Length (cm)") +
  theme_bw()
print(p10)

cor(df10$WingLength, df10$TailLength)

[1] 0.8703546

cor.test(df10$WingLength, df10$TailLength)


    Pearson's product-moment correlation

data:  df10$WingLength and df10$TailLength
t = 5.5893, df = 10, p-value = 0.0002311
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.5923111 0.9631599
sample estimates:
      cor 
0.8703546

Partial Correlation Analysis

Linear Relationship between Quantitative Variables while holding/keeping all other constants

$\left(X_{1},X_{2}\right)|X_{3}$

Example

Leaf Area (cm^2)	Leaf Moisture (%)	Total Shoot Length (cm)
72	80	307
174	75	529
116	81	632
78	83	527
134	79	442
95	81	525
113	80	481
98	81	710
148	74	422
42	78	345

LeafArea    <- c(72, 174, 116, 78, 134, 95, 113, 98, 148, 42)
LeafMoist   <- c(75, 81, 83, 79, 81, 80, 81, 74, 78, 58)
TotShLength <- c(307, 529, 632, 527, 442, 525, 481, 710, 422, 345)
df11        <- data.frame(LeafArea, LeafMoist, TotShLength)
df11

   LeafArea LeafMoist TotShLength
1        72        75         307
2       174        81         529
3       116        83         632
4        78        79         527
5       134        81         442
6        95        80         525
7       113        81         481
8        98        74         710
9       148        78         422
10       42        58         345

library(ppcor)
# if (!require("ppcor")) install.packages("ppcor")
pcor(df11)

$estimate
                LeafArea LeafMoist  TotShLength
LeafArea     1.000000000 0.6509096 -0.007012236
LeafMoist    0.650909618 1.0000000  0.324334659
TotShLength -0.007012236 0.3243347  1.000000000

$p.value
              LeafArea  LeafMoist TotShLength
LeafArea    0.00000000 0.05760462   0.9857154
LeafMoist   0.05760462 0.00000000   0.3944839
TotShLength 0.98571537 0.39448385   0.0000000

$statistic
               LeafArea LeafMoist TotShLength
LeafArea     0.00000000  2.268502 -0.01855309
LeafMoist    2.26850175  0.000000  0.90714704
TotShLength -0.01855309  0.907147  0.00000000

$n
[1] 10

$gp
[1] 1

$method
[1] "pearson"

Multiple Correlation Analysis

Linear Relationship between a Quantitative Variable and set of other Quantitative Variables

$\left(X_{1},\left[X_{2},X_{3}\right]\right)$

Example

Leaf Area (cm^2)	Leaf Moisture (%)	Total Shoot Length (cm)
72	80	307
174	75	529
116	81	632
78	83	527
134	79	442
95	81	525
113	80	481
98	81	710
148	74	422
42	78	345

fm12 <- lm(formula = TotShLength ~ LeafArea + LeafMoist, data = df11)
sqrt(summary(fm12)$r.squared)

[1] 0.421277

Completely Randomized Design (CRD)

Used when experimental material is homogeneous

Example

The following table shows some of the results of an experiment on the effects of applications of sulphur in reducing scab disease of potatoes. The object in applying sulphur is to increase the acidity of the soil, since scab does not thrive in very acid soil. In addition to untreated plots which serve as a control, 3 amounts of dressing were compared—300, 600, and 900 lb. per acre. Both a fall and a spring application of each amount was tested, so that in all there were seven distinct treatments. The sulphur was spread by hand on the surface of the soil, and then diced into a depth of about 4 inches. The quantity to be analyzed is the “scab index”. That is roughly speaking, the percentage of the surface area of the potato that is infected with scab. It is obtained by examining 100 potatoes at random from each plot, grading each potato on a scale from 0 to 100% infected, and taking the average.

ScabIndex <-
      c(
        12, 10,  24, 29, 
        30, 18,  32, 26, 
         9,  9,  16,  4,
        30,  7,  21,  9,
        16, 10,  18, 18,
        18, 24,  12, 19,
        10,  4,   4,  5,
        17,  7,  16, 17
      )
Trt  <- factor(
          x = rep(x = 1:7, times = c(8, 4, 4, 4, 4, 4, 4))
        , labels = c("O", "F3", "S3", "F6", "S6", "F9", "S9")
        )

df13 <- data.frame(Trt, ScabIndex)
df13

   Trt ScabIndex
1    O        12
2    O        10
3    O        24
4    O        29
5    O        30
6    O        18
7    O        32
8    O        26
9   F3         9
10  F3         9
11  F3        16
12  F3         4
13  S3        30
14  S3         7
15  S3        21
16  S3         9
17  F6        16
18  F6        10
19  F6        18
20  F6        18
21  S6        18
22  S6        24
23  S6        12
24  S6        19
25  F9        10
26  F9         4
27  F9         4
28  F9         5
29  S9        17
30  S9         7
31  S9        16
32  S9        17

library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p13 <- 
  ggplot(data = df13, mapping = aes(x = Trt, y = ScabIndex)) + 
  geom_point() +
  geom_boxplot() +
  labs(x = "Treatment", y = "Scab Index") +
  theme_bw()
print(p13)

fm13 <- lm(formula = ScabIndex ~ Trt, data = df13)
summary(fm13)


Call:
lm(formula = ScabIndex ~ Trt, data = df13)

Residuals:
    Min      1Q  Median      3Q     Max 
-12.625  -4.844   0.625   3.594  13.250 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   22.625      2.369   9.549 8.08e-10 ***
TrtF3        -13.125      4.104  -3.198 0.003734 ** 
TrtS3         -5.875      4.104  -1.432 0.164666    
TrtF6         -7.125      4.104  -1.736 0.094858 .  
TrtS6         -4.375      4.104  -1.066 0.296601    
TrtF9        -16.875      4.104  -4.112 0.000372 ***
TrtS9         -8.375      4.104  -2.041 0.051977 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.702 on 25 degrees of freedom
Multiple R-squared:  0.4641,    Adjusted R-squared:  0.3355 
F-statistic: 3.608 on 6 and 25 DF,  p-value: 0.01026

anova(fm13)

Analysis of Variance Table

Response: ScabIndex
          Df  Sum Sq Mean Sq F value  Pr(>F)  
Trt        6  972.34 162.057  3.6081 0.01026 *
Residuals 25 1122.88  44.915                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Randomized Complete Block Design (RCBD)

Used when experimental material is heterogenous in one direction

Example

Yield : Yield of barley, SoilType : Soil Type, and Trt : 5 sources and a control

Yield <- c(
             32.1,  35.6,   41.9,   35.4
          ,  30.1,  31.5,   37.1,   30.8
          ,  25.4,  27.4,   33.8,   31.1
          ,  24.1,  33.0,   35.6,   31.4
          ,  26.1,  31.0,   33.8,   31.9
          ,  23.2,  24.8,   26.7,   26.7
          )
SoilType <- gl(n = 4, k = 6, length = 4*6
               , labels = c("I", "II", "III", "IV"), ordered = FALSE)
Trt      <- gl(n = 6, k = 1, length = 4*6
               , labels = c("(NH4)2SO4", "NH4NO3", "CO(NH2)2", "Ca(NO3)2", "NaNO3", "Control"), ordered = FALSE)

df14 <- data.frame(Yield, SoilType, Trt)
df14

   Yield SoilType       Trt
1   32.1        I (NH4)2SO4
2   35.6        I    NH4NO3
3   41.9        I  CO(NH2)2
4   35.4        I  Ca(NO3)2
5   30.1        I     NaNO3
6   31.5        I   Control
7   37.1       II (NH4)2SO4
8   30.8       II    NH4NO3
9   25.4       II  CO(NH2)2
10  27.4       II  Ca(NO3)2
11  33.8       II     NaNO3
12  31.1       II   Control
13  24.1      III (NH4)2SO4
14  33.0      III    NH4NO3
15  35.6      III  CO(NH2)2
16  31.4      III  Ca(NO3)2
17  26.1      III     NaNO3
18  31.0      III   Control
19  33.8       IV (NH4)2SO4
20  31.9       IV    NH4NO3
21  23.2       IV  CO(NH2)2
22  24.8       IV  Ca(NO3)2
23  26.7       IV     NaNO3
24  26.7       IV   Control

library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p14 <- 
  ggplot(data = df14, mapping = aes(x = Trt, y = Yield)) + 
  geom_point() +
  geom_boxplot() +
  labs(x = "Treatment", y = "Yield") +
  theme_bw()
print(p14)

fm14 <- lm(formula = Yield ~ SoilType + Trt, data = df14)
summary(fm14)


Call:
lm(formula = Yield ~ SoilType + Trt, data = df14)

Residuals:
    Min      1Q  Median      3Q     Max 
-7.0208 -2.4229  0.0792  2.1354  6.7958 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   35.354      2.826  12.511 2.44e-09 ***
SoilTypeII    -3.500      2.664  -1.314   0.2087    
SoilTypeIII   -4.233      2.664  -1.589   0.1329    
SoilTypeIV    -6.583      2.664  -2.471   0.0259 *  
TrtNH4NO3      1.050      3.263   0.322   0.7521    
TrtCO(NH2)2   -0.250      3.263  -0.077   0.9399    
TrtCa(NO3)2   -2.025      3.263  -0.621   0.5442    
TrtNaNO3      -2.600      3.263  -0.797   0.4380    
TrtControl    -1.700      3.263  -0.521   0.6100    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.615 on 15 degrees of freedom
Multiple R-squared:  0.3512,    Adjusted R-squared:  0.005215 
F-statistic: 1.015 on 8 and 15 DF,  p-value: 0.4653

anova(fm14)

Analysis of Variance Table

Response: Yield
          Df Sum Sq Mean Sq F value Pr(>F)
SoilType   3 133.62  44.539  2.0915 0.1443
Trt        5  39.31   7.862  0.3692 0.8618
Residuals 15 319.43  21.295

Latin Square Design

Used when experimental material is heterogenous in two perpendicular directions

Example

The following table shows the field layout and yield of a 5×5 Latin square experiment on the effect of spacing on yield of millet plants. Five levels of spacing were used. The data on yield (grams/plot) was recorded and is given below:

Yield <- c(
           257, 230, 279, 287, 202,
           245, 283, 245, 280, 260,
           182, 252, 280, 246, 250,
           203, 204, 227, 193, 259,
           231, 271, 266, 334, 338
           )
Row <- factor(x = rep(x = 1:5, each = 5))
Col <- factor(x = rep(x = 1:5, times = 5))
Trt <- c(
          "B", "E", "A", "C", "D"
        , "D", "A", "E", "B", "C"
        , "E", "B", "C", "D", "A"
        , "A", "C", "D", "E", "B"
        , "C", "D", "B", "A", "E"
         )

df15 <- data.frame(Yield, Row, Col, Trt)
df15

   Yield Row Col Trt
1    257   1   1   B
2    230   1   2   E
3    279   1   3   A
4    287   1   4   C
5    202   1   5   D
6    245   2   1   D
7    283   2   2   A
8    245   2   3   E
9    280   2   4   B
10   260   2   5   C
11   182   3   1   E
12   252   3   2   B
13   280   3   3   C
14   246   3   4   D
15   250   3   5   A
16   203   4   1   A
17   204   4   2   C
18   227   4   3   D
19   193   4   4   E
20   259   4   5   B
21   231   5   1   C
22   271   5   2   D
23   266   5   3   B
24   334   5   4   A
25   338   5   5   E

library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p15 <- 
  ggplot(data = df15, mapping = aes(x = Trt, y = Yield)) + 
  geom_point() +
  geom_boxplot() +
  labs(x = "Treatment", y = "Yield") +
  theme_bw()
print(p15)

fm15 <- lm(formula = Yield ~ Row + Col + Trt, data = df15)
summary(fm15)


Call:
lm(formula = Yield ~ Row + Col + Trt, data = df15)

Residuals:
   Min     1Q Median     3Q    Max 
-44.68 -12.48   1.12  16.52  54.92 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   240.08      23.43  10.247 2.75e-07 ***
Row2           11.60      20.55   0.565   0.5828    
Row3           -9.00      20.55  -0.438   0.6692    
Row4          -33.80      20.55  -1.645   0.1259    
Row5           37.00      20.55   1.801   0.0969 .  
Col2           24.40      20.55   1.187   0.2580    
Col3           35.80      20.55   1.742   0.1070    
Col4           44.40      20.55   2.161   0.0516 .  
Col5           38.20      20.55   1.859   0.0877 .  
TrtB           -7.00      20.55  -0.341   0.7393    
TrtC          -17.40      20.55  -0.847   0.4137    
TrtD          -31.60      20.55  -1.538   0.1500    
TrtE          -32.20      20.55  -1.567   0.1431    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 32.49 on 12 degrees of freedom
Multiple R-squared:  0.6536,    Adjusted R-squared:  0.3073 
F-statistic: 1.887 on 12 and 12 DF,  p-value: 0.1426

anova(fm15)

Analysis of Variance Table

Response: Yield
          Df  Sum Sq Mean Sq F value  Pr(>F)  
Row        4 13601.4  3400.3  3.2212 0.05164 .
Col        4  6146.2  1536.5  1.4556 0.27581  
Trt        4  4156.6  1039.1  0.9844 0.45229  
Residuals 12 12667.3  1055.6                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Factorial Experiment under RCBD

Three strains of perennial ryegrass were grown as swards at each of the fertilizer levels. The three strains were S23, Kent & a control. The fertilizer levels were heavy & average. The experiment was laid out under randomized complete block design. The mid summer dry matter yields, in units of 10 lb/acre were as follows. Analyze the data & construct a table of means with appropriate standard errors. Draw any graphs you deem fit to explain the results & write a short report on the conclusions to be drawn from this experiment.

DMY <- 
  c(
    299,    318,    284,    279,
    247,    202,    171,    183,
    403,    439,    355,    324,
    222,    170,    192,    176,
    382,    353,    383,    310,
    233,    216,    200,    143
    )
Blk  <- gl(n = 4, k = 1, length = 24, labels = c("I", "II", "III", "IV"))
Vart <- gl(n = 3, k = 8, length = 24, labels = c("S23", "Control", "Kent"))
Manu <- gl(n = 2, k = 4, length = 24, labels = c("Heavy", "Average"))

df1 <- data.frame(DMY, Blk, Vart, Manu)
df1

   DMY Blk    Vart    Manu
1  299   I     S23   Heavy
2  318  II     S23   Heavy
3  284 III     S23   Heavy
4  279  IV     S23   Heavy
5  247   I     S23 Average
6  202  II     S23 Average
7  171 III     S23 Average
8  183  IV     S23 Average
9  403   I Control   Heavy
10 439  II Control   Heavy
11 355 III Control   Heavy
12 324  IV Control   Heavy
13 222   I Control Average
14 170  II Control Average
15 192 III Control Average
16 176  IV Control Average
17 382   I    Kent   Heavy
18 353  II    Kent   Heavy
19 383 III    Kent   Heavy
20 310  IV    Kent   Heavy
21 233   I    Kent Average
22 216  II    Kent Average
23 200 III    Kent Average
24 143  IV    Kent Average

fm1 <- lm(formula = DMY ~ Blk + Vart * Manu, data = df1)
anova(fm1)

Analysis of Variance Table

Response: DMY
          Df Sum Sq Mean Sq  F value    Pr(>F)    
Blk        3  12814    4271   7.1611  0.003294 ** 
Vart       2   6196    3098   5.1935  0.019324 *  
Manu       1 131128  131128 219.8375 2.287e-10 ***
Vart:Manu  2   9590    4795   8.0389  0.004239 ** 
Residuals 15   8947     596                       
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

library(emmeans)
emmip(fm1, Vart ~ Manu)

emmip(fm1, Manu ~ Vart)

emmeans(fm1, ~ Vart)

 Vart    emmean   SE df lower.CL upper.CL
 S23        248 8.63 15      229      266
 Control    285 8.63 15      267      304
 Kent       278 8.63 15      259      296

Results are averaged over the levels of: Blk, Manu 
Confidence level used: 0.95

emmeans(fm1, ~ Manu)

 Manu    emmean   SE df lower.CL upper.CL
 Heavy      344 7.05 15      329      359
 Average    196 7.05 15      181      211

Results are averaged over the levels of: Blk, Vart 
Confidence level used: 0.95

emmeans(fm1, ~ Manu|Vart)

Vart = S23:
 Manu    emmean   SE df lower.CL upper.CL
 Heavy      295 12.2 15      269      321
 Average    201 12.2 15      175      227

Vart = Control:
 Manu    emmean   SE df lower.CL upper.CL
 Heavy      380 12.2 15      354      406
 Average    190 12.2 15      164      216

Vart = Kent:
 Manu    emmean   SE df lower.CL upper.CL
 Heavy      357 12.2 15      331      383
 Average    198 12.2 15      172      224

Results are averaged over the levels of: Blk 
Confidence level used: 0.95

emmeans(fm1, ~ Vart|Manu)

Manu = Heavy:
 Vart    emmean   SE df lower.CL upper.CL
 S23        295 12.2 15      269      321
 Control    380 12.2 15      354      406
 Kent       357 12.2 15      331      383

Manu = Average:
 Vart    emmean   SE df lower.CL upper.CL
 S23        201 12.2 15      175      227
 Control    190 12.2 15      164      216
 Kent       198 12.2 15      172      224

Results are averaged over the levels of: Blk 
Confidence level used: 0.95

emmeans(fm1, pairwise ~ Vart)

$emmeans
 Vart    emmean   SE df lower.CL upper.CL
 S23        248 8.63 15      229      266
 Control    285 8.63 15      267      304
 Kent       278 8.63 15      259      296

Results are averaged over the levels of: Blk, Manu 
Confidence level used: 0.95 

$contrasts
 contrast       estimate   SE df t.ratio p.value
 S23 - Control    -37.25 12.2 15 -3.050  0.0208 
 S23 - Kent       -29.62 12.2 15 -2.426  0.0689 
 Control - Kent     7.62 12.2 15  0.624  0.8092 

Results are averaged over the levels of: Blk, Manu 
P value adjustment: tukey method for comparing a family of 3 estimates

emmeans(fm1, pairwise ~ Manu)

$emmeans
 Manu    emmean   SE df lower.CL upper.CL
 Heavy      344 7.05 15      329      359
 Average    196 7.05 15      181      211

Results are averaged over the levels of: Blk, Vart 
Confidence level used: 0.95 

$contrasts
 contrast        estimate   SE df t.ratio p.value
 Heavy - Average      148 9.97 15 14.827  <.0001 

Results are averaged over the levels of: Blk, Vart

emmeans(fm1, pairwise ~ Vart|Manu)

$emmeans
Manu = Heavy:
 Vart    emmean   SE df lower.CL upper.CL
 S23        295 12.2 15      269      321
 Control    380 12.2 15      354      406
 Kent       357 12.2 15      331      383

Manu = Average:
 Vart    emmean   SE df lower.CL upper.CL
 S23        201 12.2 15      175      227
 Control    190 12.2 15      164      216
 Kent       198 12.2 15      172      224

Results are averaged over the levels of: Blk 
Confidence level used: 0.95 

$contrasts
Manu = Heavy:
 contrast       estimate   SE df t.ratio p.value
 S23 - Control    -85.25 17.3 15 -4.936  0.0005 
 S23 - Kent       -62.00 17.3 15 -3.590  0.0071 
 Control - Kent    23.25 17.3 15  1.346  0.3927 

Manu = Average:
 contrast       estimate   SE df t.ratio p.value
 S23 - Control     10.75 17.3 15  0.622  0.8102 
 S23 - Kent         2.75 17.3 15  0.159  0.9861 
 Control - Kent    -8.00 17.3 15 -0.463  0.8893 

Results are averaged over the levels of: Blk 
P value adjustment: tukey method for comparing a family of 3 estimates

emmeans(fm1, pairwise ~ Manu|Vart)

$emmeans
Vart = S23:
 Manu    emmean   SE df lower.CL upper.CL
 Heavy      295 12.2 15      269      321
 Average    201 12.2 15      175      227

Vart = Control:
 Manu    emmean   SE df lower.CL upper.CL
 Heavy      380 12.2 15      354      406
 Average    190 12.2 15      164      216

Vart = Kent:
 Manu    emmean   SE df lower.CL upper.CL
 Heavy      357 12.2 15      331      383
 Average    198 12.2 15      172      224

Results are averaged over the levels of: Blk 
Confidence level used: 0.95 

$contrasts
Vart = S23:
 contrast        estimate   SE df t.ratio p.value
 Heavy - Average     94.2 17.3 15  5.458  0.0001 

Vart = Control:
 contrast        estimate   SE df t.ratio p.value
 Heavy - Average    190.2 17.3 15 11.016  <.0001 

Vart = Kent:
 contrast        estimate   SE df t.ratio p.value
 Heavy - Average    159.0 17.3 15  9.207  <.0001 

Results are averaged over the levels of: Blk

Stability Analysis

library(stability)
library(help = stability)
data(ge_data)

Individual Analysis of Variance for each Location

Yield.indiv_anova <-
         indiv_anova(
            .data = ge_data
          , .y    = Yield
          , .rep  = Rep
          , .gen  = Gen
          , .env  = Env
          )
Yield.indiv_anova

$`Analysis of Variance for FSD`
            Df   Sum Sq Mean Sq F value Pr(>F)    
Gen         59 45524231  771597   32.62 <2e-16 ***
Residuals   60  1419180   23653                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

$`Analysis of Variance for FSDR`
            Df   Sum Sq Mean Sq F value   Pr(>F)    
Gen         59 14019781  237623   2.724 7.91e-05 ***
Residuals   60  5234860   87248                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

$`Analysis of Variance for Okara`
            Df   Sum Sq Mean Sq F value   Pr(>F)    
Gen         59 26802597  454281   3.538 1.22e-06 ***
Residuals   60  7703642  128394                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

$`Analysis of Variance for Sargodha`
            Df   Sum Sq Mean Sq F value  Pr(>F)    
Gen         59 38381469  650533   4.254 4.3e-08 ***
Residuals   60  9175301  152922                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

$`Analysis of Variance for Gujranwala`
            Df   Sum Sq Mean Sq F value   Pr(>F)    
Gen         59 41863914  709558   4.687 6.55e-09 ***
Residuals   60  9082956  151383                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

$`Analysis of Variance for KotNaina`
            Df   Sum Sq Mean Sq F value   Pr(>F)    
Gen         59 26692263  452411     4.3 3.51e-08 ***
Residuals   60  6313095  105218                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

$`Analysis of Variance for KSK`
            Df   Sum Sq Mean Sq F value   Pr(>F)    
Gen         59 17890629  303231   3.472 1.68e-06 ***
Residuals   60  5239575   87326                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

$`Analysis of Variance for Khanewal`
            Df   Sum Sq Mean Sq F value   Pr(>F)    
Gen         59 15125214  256360   2.451 0.000345 ***
Residuals   60  6274965  104583                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

$`Analysis of Variance for Sahiwal`
            Df   Sum Sq Mean Sq F value   Pr(>F)    
Gen         59 26109918  442541   4.095 8.82e-08 ***
Residuals   60  6484679  108078                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

$`Analysis of Variance for Bhakkar`
            Df   Sum Sq Mean Sq F value  Pr(>F)   
Gen         59 10870598  184247   1.862 0.00885 **
Residuals   60  5936717   98945                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

$`Analysis of Variance for Bahawalnagar`
            Df  Sum Sq Mean Sq F value Pr(>F)  
Gen         59 9718292  164717   1.716 0.0196 *
Residuals   60 5760765   96013                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Combined Analysis of Variance

YieldANOVA <-
     add_anova(
            .data = ge_data
          , .y    = Yield
          , .rep  = Rep
          , .gen  = Gen
          , .env  = Env
      )
YieldANOVA

$anova
Analysis of Variance Table

Response: .data[[Y]]
           Df    Sum Sq  Mean Sq F value    Pr(>F)    
Env        10 263940496 26394050 68.0551 1.995e-08 ***
Rep(Env)   11   4266167   387833                      
Gen        59  71662431  1214617 12.2482 < 2.2e-16 ***
Gen:Env   590 201336476   341248  3.4411 < 2.2e-16 ***
Residuals 649  64359568    99167                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Additive Main Effects and Multiplicative Interaction (AMMI) Analysis

Yield.ammi <-
     ammi(
            .data = ge_data
          , .y    = Yield
          , .rep  = Rep
          , .gen  = Gen
          , .env  = Env
      )
Yield.ammi

$anova
Analysis of Variance Table

Response: .data[[Y]]
           Df    Sum Sq  Mean Sq F value    Pr(>F)    
Env        10 263940496 26394050 68.0551 1.995e-08 ***
Rep(Env)   11   4266167   387833                      
Gen        59  71662431  1214617 12.2482 < 2.2e-16 ***
Gen:Env   590 201336476   341248  3.4411 < 2.2e-16 ***
Residuals 649  64359568    99167                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

$pc.anova
# A tibble: 11 x 9
   PC     SingVal  Percent accumPerc    Df       SS  Mean.Sq  F.value   Pr..F.
   <fct>    <dbl>    <dbl>     <dbl> <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
 1 PC1   4.50e+ 3 2.01e+ 1      20.1    68 4.04e+ 7 5.95e+ 5 6.00e+ 0 0.      
 2 PC2   4.17e+ 3 1.73e+ 1      37.4    66 3.49e+ 7 5.28e+ 5 5.33e+ 0 0.      
 3 PC3   4.05e+ 3 1.63e+ 1      53.7    64 3.27e+ 7 5.12e+ 5 5.16e+ 0 0.      
 4 PC4   3.49e+ 3 1.21e+ 1      65.8    62 2.44e+ 7 3.93e+ 5 3.96e+ 0 0.      
 5 PC5   3.38e+ 3 1.13e+ 1      77.1    60 2.29e+ 7 3.81e+ 5 3.84e+ 0 0.      
 6 PC6   3.11e+ 3 9.63e+ 0      86.7    58 1.94e+ 7 3.34e+ 5 3.37e+ 0 4.02e-14
 7 PC7   2.30e+ 3 5.25e+ 0      92.0    56 1.06e+ 7 1.89e+ 5 1.90e+ 0 1.45e- 4
 8 PC8   2.03e+ 3 4.10e+ 0      96.1    54 8.25e+ 6 1.53e+ 5 1.54e+ 0 9.52e- 3
 9 PC9   1.52e+ 3 2.30e+ 0      98.4    52 4.63e+ 6 8.91e+ 4 8.98e- 1 6.77e- 1
10 PC10  1.28e+ 3 1.63e+ 0     100      50 3.27e+ 6 6.55e+ 4 6.60e- 1 9.66e- 1
11 PC11  1.92e-11 3.64e-28     100      48 7.34e-22 1.53e-23 1.54e-28 1.00e+ 0

Additive Main Effects and Multiplicative Interaction (AMMI) Biplot Analysis

ammi_biplot(
            .data = ge_data
          , .y    = Yield
          , .rep  = Rep
          , .gen  = Gen
          , .env  = Env
      )

$aami.biplot


$MeanPC1Plot

Genotype plus Genotypes by Environment (GGE) Interaction Biplot Analysis

gge_biplot(
            .data = ge_data
          , .y    = Yield
          , .rep  = Rep
          , .gen  = Gen
          , .env  = Env
      )

$gge.biplot

All posts by date

Yield	Fert	Variety
51	80	V1
52	80	V1
53	90	V1
54	90	V1
56	100	V1
57	100	V1
55	80	V2
56	80	V2
58	90	V2
59	90	V2
62	100	V2
63	100	V2

Leaf Area (cm^2)	Leaf Moisture (%)	Total Shoot Length (cm)
72	80	307
174	75	529
116	81	632
78	83	527
134	79	442
95	81	525
113	80	481
98	81	710
148	74	422
42	78	345

Leaf Area (cm^2)	Leaf Moisture (%)	Total Shoot Length (cm)
72	80	307
174	75	529
116	81	632
78	83	527
134	79	442
95	81	525
113	80	481
98	81	710
148	74	422
42	78	345

Yield	Fert	Variety
51	80	V1
52	80	V1
53	90	V1
54	90	V1
56	100	V1
57	100	V1
55	80	V2
56	80	V2
58	90	V2
59	90	V2
62	100	V2
63	100	V2

Leaf Area (cm^2)	Leaf Moisture (%)	Total Shoot Length (cm)
72	80	307
174	75	529
116	81	632
78	83	527
134	79	442
95	81	525
113	80	481
98	81	710
148	74	422
42	78	345

Leaf Area (cm^2)	Leaf Moisture (%)	Total Shoot Length (cm)
72	80	307
174	75	529
116	81	632
78	83	527
134	79	442
95	81	525
113	80	481
98	81	710
148	74	422
42	78	345

Yield	Fert	Variety
51	80	V1
52	80	V1
53	90	V1
54	90	V1
56	100	V1
57	100	V1
55	80	V2
56	80	V2
58	90	V2
59	90	V2
62	100	V2
63	100	V2

Leaf Area (cm^2)	Leaf Moisture (%)	Total Shoot Length (cm)
72	80	307
174	75	529
116	81	632
78	83	527
134	79	442
95	81	525
113	80	481
98	81	710
148	74	422
42	78	345

Leaf Area (cm^2)	Leaf Moisture (%)	Total Shoot Length (cm)
72	80	307
174	75	529
116	81	632
78	83	527
134	79	442
95	81	525
113	80	481
98	81	710
148	74	422
42	78	345