Training Course on Capacity Building of NARS Scientists in Advance Analytical Techniques
- Introduction
- Regression Analysis
- Simple Linear Regression
- Multiple Linear Regression
- Polynomial Regression Analysis
- Analysis of Variance (ANOVA)
- Analysis of Covariance (ANCOVA)
- Correlation Analysis
- Simple Correlation Analysis
- Partial Correlation Analysis
- Multiple Correlation Analysis
- Completely Randomized Design (CRD)
- Randomized Complete Block Design (RCBD)
- Latin Square Design
- Factorial Experiment under RCBD
- Stability Analysis
Introduction
R is a free, open-source programming language and software environment for statistical computing, bioinformatics, visualization and general computing. R provides a wide variety of statistical and graphical techniques, and is highly extensible. The latest version of R can be obtained from https://cran.r-project.org/bin/windows/base/.
R R for Windows (32/64 bit)
RStudio RStudio
Manual An Introduction to R
Ref Card R Reference Card
New York Times New York Times
Nature Article Nature Article
Regression Analysis
- Quantifying dependency of a normal response on quantitative explanatory variable(s)
Simple Linear Regression
- Quantifying dependency of a normal response on a quantitative explanatory variable
Example
Fertilizer (Kg/acre) | Production (Bushels/acre) |
---|---|
100 | 70 |
200 | 70 |
400 | 80 |
500 | 100 |
# Fertilizer (Kg/acre)
# Production (Bushels/acre)
Fertilizer <- c(100, 200, 400, 500)
Production <- c( 70, 70, 80, 100)
df1 <- data.frame(Fertilizer, Production)
df1
Fertilizer Production
1 100 70
2 200 70
3 400 80
4 500 100
library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p1 <-
ggplot(data = df1, mapping = aes(x = Fertilizer, y = Production)) +
geom_point() +
labs(x = "Fertilizer", y = "Production") +
theme_bw()
print(p1)
fm1 <- lm(formula = Production ~ Fertilizer, data = df1)
summary(fm1)
Call:
lm(formula = Production ~ Fertilizer, data = df1)
Residuals:
1 2 3 4
4 -3 -7 6
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 59.00000 7.95299 7.419 0.0177 *
Fertilizer 0.07000 0.02345 2.985 0.0963 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 7.416 on 2 degrees of freedom
Multiple R-squared: 0.8167, Adjusted R-squared: 0.725
F-statistic: 8.909 on 1 and 2 DF, p-value: 0.0963
anova(fm1)
Analysis of Variance Table
Response: Production
Df Sum Sq Mean Sq F value Pr(>F)
Fertilizer 1 490 490 8.9091 0.0963 .
Residuals 2 110 55
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Example
Fertilizer | Yield |
---|---|
0.3 | 10 |
0.6 | 15 |
0.9 | 30 |
1.2 | 35 |
1.5 | 25 |
1.8 | 30 |
2.1 | 50 |
2.4 | 45 |
Fert <- c(0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1, 2.4)
Yield <- c(10, 15, 30, 35, 25, 30, 50, 45)
df2 <- data.frame(Fert, Yield)
df2
Fert Yield
1 0.3 10
2 0.6 15
3 0.9 30
4 1.2 35
5 1.5 25
6 1.8 30
7 2.1 50
8 2.4 45
library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p2 <-
ggplot(data = df2, mapping = aes(x = Fert, y = Yield)) +
geom_point() +
labs(x = "Fertilizer", y = "Yield") +
theme_bw()
print(p2)
fm2 <- lm(formula = Yield ~ Fert, data = df2)
summary(fm2)
Call:
lm(formula = Yield ~ Fert, data = df2)
Residuals:
Min 1Q Median 3Q Max
-7.441 -4.018 -2.441 7.351 7.798
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 8.036 5.504 1.460 0.1946
Fert 16.270 3.633 4.478 0.0042 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 7.064 on 6 degrees of freedom
Multiple R-squared: 0.7697, Adjusted R-squared: 0.7313
F-statistic: 20.05 on 1 and 6 DF, p-value: 0.004202
anova(fm2)
Analysis of Variance Table
Response: Yield
Df Sum Sq Mean Sq F value Pr(>F)
Fert 1 1000.6 1000.6 20.052 0.004202 **
Residuals 6 299.4 49.9
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Example
Weekly Income ($) | Weekly Expenditures ($) |
---|---|
80 | 70 |
100 | 65 |
120 | 90 |
140 | 95 |
160 | 110 |
180 | 115 |
200 | 120 |
220 | 140 |
240 | 155 |
260 | 150 |
# Weekly Income ($) of a Family
# Weekly Expenditures ($) of a Family
Income <- seq(from = 80, to = 260, by = 20)
Expenditures <- c(70, 65, 90, 95, 110, 115, 120, 140, 155, 150)
df3 <- data.frame(Income, Expenditures)
df3
Income Expenditures
1 80 70
2 100 65
3 120 90
4 140 95
5 160 110
6 180 115
7 200 120
8 220 140
9 240 155
10 260 150
library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p3 <-
ggplot(data = df3, mapping = aes(x = Income, y = Expenditures)) +
geom_point() +
scale_x_continuous(labels = scales::dollar) +
scale_y_continuous(labels = scales::dollar) +
labs(x = "Weekly income, $", y = "Weekly consumption expenditures, $") +
theme_bw()
print(p3)
fm3 <- lm(formula = Expenditures ~ Income, data = df3)
summary(fm3)
Call:
lm(formula = Expenditures ~ Income, data = df3)
Residuals:
Min 1Q Median 3Q Max
-10.364 -4.977 1.409 4.364 8.364
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 24.45455 6.41382 3.813 0.00514 **
Income 0.50909 0.03574 14.243 5.75e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 6.493 on 8 degrees of freedom
Multiple R-squared: 0.9621, Adjusted R-squared: 0.9573
F-statistic: 202.9 on 1 and 8 DF, p-value: 5.753e-07
anova(fm3)
Analysis of Variance Table
Response: Expenditures
Df Sum Sq Mean Sq F value Pr(>F)
Income 1 8552.7 8552.7 202.87 5.753e-07 ***
Residuals 8 337.3 42.2
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple Linear Regression
- Quantifying dependency of a normal response on two or more quantitative explanatory variables
Example
Fertilizer (Kg) | Rainfall (mm) | Yield (Kg) |
---|---|---|
100 | 10 | 40 |
200 | 20 | 50 |
300 | 10 | 50 |
400 | 30 | 70 |
500 | 20 | 65 |
600 | 20 | 65 |
700 | 30 | 80 |
# Fertilizer (Kg)
# Rainfall (mm)
# Yield (Kg)
Fertilizer <- seq(from = 100, to = 700, by = 100)
Rainfall <- c(10, 20, 10, 30, 20, 20, 30)
Yield <- c(40, 50, 50, 70, 65, 65, 80)
df4 <- data.frame(Fertilizer, Rainfall, Yield)
df4
Fertilizer Rainfall Yield
1 100 10 40
2 200 20 50
3 300 10 50
4 400 30 70
5 500 20 65
6 600 20 65
7 700 30 80
library(car)
# if (!require("car")) install.packages("car")
car::scatter3d(formula = Yield ~ Fertilizer + Rainfall, data = df4)
fm4 <- lm(formula = Yield ~ Fertilizer + Rainfall, data = df4)
summary(fm4)
Call:
lm(formula = Yield ~ Fertilizer + Rainfall, data = df4)
Residuals:
1 2 3 4 5 6 7
-0.2381 -2.3810 2.1429 1.6667 1.1905 -2.6190 0.2381
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 28.095238 2.491482 11.277 0.000352 ***
Fertilizer 0.038095 0.005832 6.532 0.002838 **
Rainfall 0.833333 0.154303 5.401 0.005690 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.315 on 4 degrees of freedom
Multiple R-squared: 0.9814, Adjusted R-squared: 0.972
F-statistic: 105.3 on 2 and 4 DF, p-value: 0.0003472
anova(fm4)
Analysis of Variance Table
Response: Yield
Df Sum Sq Mean Sq F value Pr(>F)
Fertilizer 1 972.32 972.32 181.500 0.0001756 ***
Rainfall 1 156.25 156.25 29.167 0.0056899 **
Residuals 4 21.43 5.36
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library(lm.beta)
# if (!require("lm.beta")) install.packages("lm.beta")
lm.beta(fm4)
Call:
lm(formula = Yield ~ Fertilizer + Rainfall, data = df4)
Standardized Coefficients::
(Intercept) Fertilizer Rainfall
0.0000000 0.5944301 0.4914732
Polynomial Regression Analysis
- Quantifying non-linear dependency of a normal response on quantitative explanatory variable(s)
Example
An experiment was conducted to evaluate the effects of different levels of nitrogen. Three levels of nitrogen: 0, 10 and 20 grams per plot were used in the experiment. Each treatment was replicated twice and data is given below:
Nitrogen | Yield |
---|---|
0 | 5 |
0 | 7 |
10 | 15 |
10 | 17 |
20 | 9 |
20 | 11 |
Nitrogen <- c(0, 0, 10, 10, 20, 20)
Yield <- c(5, 7, 15, 17, 9, 11)
df5 <- data.frame(Nitrogen, Yield)
df5
Nitrogen Yield
1 0 5
2 0 7
3 10 15
4 10 17
5 20 9
6 20 11
library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p5 <-
ggplot(data = df5, mapping = aes(x = Nitrogen, y = Yield)) +
geom_point() +
labs(x = "Nitrogen", y = "Yield") +
theme_bw()
print(p5)
fm5 <- lm(formula = Yield ~ Nitrogen + I(Nitrogen^2), data = df5)
# fm5 <- lm(formula = Yield ~ poly(x = Nitrogen, degree = 2, raw = TRUE), data = df5)
summary(fm5)
Call:
lm(formula = Yield ~ Nitrogen + I(Nitrogen^2), data = df5)
Residuals:
1 2 3 4 5 6
-1 1 -1 1 -1 1
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.00000 1.00000 6.000 0.00927 **
Nitrogen 1.80000 0.25495 7.060 0.00584 **
I(Nitrogen^2) -0.08000 0.01225 -6.532 0.00729 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.414 on 3 degrees of freedom
Multiple R-squared: 0.9441, Adjusted R-squared: 0.9068
F-statistic: 25.33 on 2 and 3 DF, p-value: 0.01322
anova(fm5)
Analysis of Variance Table
Response: Yield
Df Sum Sq Mean Sq F value Pr(>F)
Nitrogen 1 16.000 16.000 8.000 0.066276 .
I(Nitrogen^2) 1 85.333 85.333 42.667 0.007292 **
Residuals 3 6.000 2.000
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Analysis of Variance (ANOVA)
- Comparing means of Normal dependent variable for levels of different factor(s)
Example
Yield | Variety |
---|---|
5 | V1 |
6 | V1 |
7 | V1 |
15 | V2 |
16 | V2 |
17 | V2 |
Yield <- c(5, 6, 7, 15, 16, 17)
Variety <- gl(n = 2, k = 3, length = 2*3, labels = c("V1", "V2"))
df6 <- data.frame(Yield, Variety)
df6
Yield Variety
1 5 V1
2 6 V1
3 7 V1
4 15 V2
5 16 V2
6 17 V2
library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p6 <-
ggplot(data = df6, mapping = aes(x = Variety, y = Yield)) +
geom_point() +
geom_boxplot() +
labs(x = "Variety", y = "Yield") +
theme_bw()
print(p6)
fm6 <- lm(formula = Yield ~ Variety, data = df6)
summary(fm6)
Call:
lm(formula = Yield ~ Variety, data = df6)
Residuals:
1 2 3 4 5 6
-1.000e+00 -4.774e-15 1.000e+00 -1.000e+00 -1.110e-16 1.000e+00
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.0000 0.5774 10.39 0.000484 ***
VarietyV2 10.0000 0.8165 12.25 0.000255 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1 on 4 degrees of freedom
Multiple R-squared: 0.974, Adjusted R-squared: 0.9675
F-statistic: 150 on 1 and 4 DF, p-value: 0.0002552
anova(fm6)
Analysis of Variance Table
Response: Yield
Df Sum Sq Mean Sq F value Pr(>F)
Variety 1 150 150 150 0.0002552 ***
Residuals 4 4 1
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Example
Yield | Variety |
---|---|
5 | V1 |
7 | V1 |
15 | V2 |
17 | V2 |
17 | V3 |
19 | V3 |
Yield <- c(5, 7, 15, 17, 17, 19)
Variety <- gl(n = 3, k = 2, length = 3*2, labels = c("V1", "V2", "V3"))
df7 <- data.frame(Yield, Variety)
df7
Yield Variety
1 5 V1
2 7 V1
3 15 V2
4 17 V2
5 17 V3
6 19 V3
library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p7 <-
ggplot(data = df7, mapping = aes(x = Variety, y = Yield)) +
geom_point() +
geom_boxplot() +
labs(x = "Variety", y = "Yield") +
theme_bw()
print(p7)
fm7 <- lm(formula = Yield ~ Variety, data = df7)
summary(fm7)
Call:
lm(formula = Yield ~ Variety, data = df7)
Residuals:
1 2 3 4 5 6
-1 1 -1 1 -1 1
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.000 1.000 6.000 0.00927 **
VarietyV2 10.000 1.414 7.071 0.00582 **
VarietyV3 12.000 1.414 8.485 0.00344 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.414 on 3 degrees of freedom
Multiple R-squared: 0.965, Adjusted R-squared: 0.9416
F-statistic: 41.33 on 2 and 3 DF, p-value: 0.006553
anova(fm7)
Analysis of Variance Table
Response: Yield
Df Sum Sq Mean Sq F value Pr(>F)
Variety 2 165.33 82.667 41.333 0.006553 **
Residuals 3 6.00 2.000
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Analysis of Covariance (ANCOVA)
- Quantifying dependency of a normal response on quantitative explanatory variable(s)
- Comparing means of Normal dependent variable for levels of different factor(s)
Example
Yield | Fert | Variety |
---|---|---|
51 | 80 | V1 |
52 | 80 | V1 |
53 | 90 | V1 |
54 | 90 | V1 |
56 | 100 | V1 |
57 | 100 | V1 |
55 | 80 | V2 |
56 | 80 | V2 |
58 | 90 | V2 |
59 | 90 | V2 |
62 | 100 | V2 |
63 | 100 | V2 |
Yield <- c(51, 52, 53, 54, 56, 57, 55, 56, 58, 59, 62, 63)
Fert <- rep(x = c(80, 90, 100), each = 2)
Variety <- gl(n = 2, k = 6, length = 2*6, labels = c("V1", "V2"), ordered = FALSE)
df8 <- data.frame(Yield, Fert, Variety)
df8
Yield Fert Variety
1 51 80 V1
2 52 80 V1
3 53 90 V1
4 54 90 V1
5 56 100 V1
6 57 100 V1
7 55 80 V2
8 56 80 V2
9 58 90 V2
10 59 90 V2
11 62 100 V2
12 63 100 V2
Same intercepts but different slopes
fm8 <- lm(formula = Yield ~ Fert + Variety, data = df8)
summary(fm8)
Call:
lm(formula = Yield ~ Fert + Variety, data = df8)
Residuals:
Min 1Q Median 3Q Max
-0.8333 -0.8333 0.1667 0.1667 1.1667
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 26.83333 2.54557 10.54 2.30e-06 ***
Fert 0.30000 0.02805 10.69 2.04e-06 ***
VarietyV2 5.00000 0.45812 10.91 1.72e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.7935 on 9 degrees of freedom
Multiple R-squared: 0.9629, Adjusted R-squared: 0.9546
F-statistic: 116.7 on 2 and 9 DF, p-value: 3.657e-07
anova(fm8)
Analysis of Variance Table
Response: Yield
Df Sum Sq Mean Sq F value Pr(>F)
Fert 1 72.000 72.00 114.35 2.042e-06 ***
Variety 1 75.000 75.00 119.12 1.720e-06 ***
Residuals 9 5.667 0.63
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p8 <-
ggplot(data = df8, mapping = aes(x = Fert, y = Yield, color = Variety)) +
geom_point() +
geom_line(mapping = aes(y = predict(fm8))) +
labs(x = "Fert", y = "Yield") +
theme_bw()
print(p8)
Different intercepts and different slopes
fm9 <- lm(formula = Yield ~ Fert * Variety, data = df8)
summary(fm9)
Call:
lm(formula = Yield ~ Fert * Variety, data = df8)
Residuals:
Min 1Q Median 3Q Max
-0.83333 -0.33333 -0.08333 0.66667 0.66667
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 31.33333 3.05903 10.243 7.09e-06 ***
Fert 0.25000 0.03385 7.385 7.73e-05 ***
VarietyV2 -4.00000 4.32612 -0.925 0.3822
Fert:VarietyV2 0.10000 0.04787 2.089 0.0701 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.677 on 8 degrees of freedom
Multiple R-squared: 0.976, Adjusted R-squared: 0.967
F-statistic: 108.4 on 3 and 8 DF, p-value: 8.11e-07
anova(fm9)
Analysis of Variance Table
Response: Yield
Df Sum Sq Mean Sq F value Pr(>F)
Fert 1 72.000 72.000 157.0909 1.538e-06 ***
Variety 1 75.000 75.000 163.6364 1.315e-06 ***
Fert:Variety 1 2.000 2.000 4.3636 0.07013 .
Residuals 8 3.667 0.458
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p9 <-
ggplot(data = df8, mapping = aes(x = Fert, y = Yield, color = Variety)) +
geom_point() +
geom_line(mapping = aes(y = predict(fm9))) +
labs(x = "Fert", y = "Yield") +
theme_bw()
print(p9)
Correlation Analysis
- Linear Relationship between Quantitative Variables
Simple Correlation Analysis
- Linear Relationship between Two Quantitative Variables
- \(\left(X_{1},X_{2}\right)\)
Example
Sparrow Wing length (cm) | Sparrow Tail length (cm) |
---|---|
10.4 | 7.4 |
10.8 | 7.6 |
11.1 | 7.9 |
10.2 | 7.2 |
10.3 | 7.4 |
10.2 | 7.1 |
10.7 | 7.4 |
10.5 | 7.2 |
10.8 | 7.8 |
11.2 | 7.7 |
10.6 | 7.8 |
11.4 | 8.3 |
WingLength <- c(10.4, 10.8, 11.1, 10.2, 10.3, 10.2, 10.7, 10.5, 10.8, 11.2, 10.6, 11.4)
TailLength <- c(7.4, 7.6, 7.9, 7.2, 7.4, 7.1, 7.4, 7.2, 7.8, 7.7, 7.8, 8.3)
df10 <- data.frame(WingLength, TailLength)
df10
WingLength TailLength
1 10.4 7.4
2 10.8 7.6
3 11.1 7.9
4 10.2 7.2
5 10.3 7.4
6 10.2 7.1
7 10.7 7.4
8 10.5 7.2
9 10.8 7.8
10 11.2 7.7
11 10.6 7.8
12 11.4 8.3
library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p10 <-
ggplot(data = df10, mapping = aes(x= WingLength, y = TailLength)) +
geom_point() +
labs(x = "Sparrow Wing Length (cm)", y = "Sparrow Tail Length (cm)") +
theme_bw()
print(p10)
cor(df10$WingLength, df10$TailLength)
[1] 0.8703546
cor.test(df10$WingLength, df10$TailLength)
Pearson's product-moment correlation
data: df10$WingLength and df10$TailLength
t = 5.5893, df = 10, p-value = 0.0002311
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.5923111 0.9631599
sample estimates:
cor
0.8703546
Partial Correlation Analysis
- Linear Relationship between Quantitative Variables while holding/keeping all other constants
- \(\left(X_{1},X_{2}\right)|X_{3}\)
Example
Leaf Area (cm^2) | Leaf Moisture (%) | Total Shoot Length (cm) |
---|---|---|
72 | 80 | 307 |
174 | 75 | 529 |
116 | 81 | 632 |
78 | 83 | 527 |
134 | 79 | 442 |
95 | 81 | 525 |
113 | 80 | 481 |
98 | 81 | 710 |
148 | 74 | 422 |
42 | 78 | 345 |
LeafArea <- c(72, 174, 116, 78, 134, 95, 113, 98, 148, 42)
LeafMoist <- c(75, 81, 83, 79, 81, 80, 81, 74, 78, 58)
TotShLength <- c(307, 529, 632, 527, 442, 525, 481, 710, 422, 345)
df11 <- data.frame(LeafArea, LeafMoist, TotShLength)
df11
LeafArea LeafMoist TotShLength
1 72 75 307
2 174 81 529
3 116 83 632
4 78 79 527
5 134 81 442
6 95 80 525
7 113 81 481
8 98 74 710
9 148 78 422
10 42 58 345
library(ppcor)
# if (!require("ppcor")) install.packages("ppcor")
pcor(df11)
$estimate
LeafArea LeafMoist TotShLength
LeafArea 1.000000000 0.6509096 -0.007012236
LeafMoist 0.650909618 1.0000000 0.324334659
TotShLength -0.007012236 0.3243347 1.000000000
$p.value
LeafArea LeafMoist TotShLength
LeafArea 0.00000000 0.05760462 0.9857154
LeafMoist 0.05760462 0.00000000 0.3944839
TotShLength 0.98571537 0.39448385 0.0000000
$statistic
LeafArea LeafMoist TotShLength
LeafArea 0.00000000 2.268502 -0.01855309
LeafMoist 2.26850175 0.000000 0.90714704
TotShLength -0.01855309 0.907147 0.00000000
$n
[1] 10
$gp
[1] 1
$method
[1] "pearson"
Multiple Correlation Analysis
- Linear Relationship between a Quantitative Variable and set of other Quantitative Variables
- \(\left(X_{1},\left[X_{2},X_{3}\right]\right)\)
Example
Leaf Area (cm^2) | Leaf Moisture (%) | Total Shoot Length (cm) |
---|---|---|
72 | 80 | 307 |
174 | 75 | 529 |
116 | 81 | 632 |
78 | 83 | 527 |
134 | 79 | 442 |
95 | 81 | 525 |
113 | 80 | 481 |
98 | 81 | 710 |
148 | 74 | 422 |
42 | 78 | 345 |
fm12 <- lm(formula = TotShLength ~ LeafArea + LeafMoist, data = df11)
sqrt(summary(fm12)$r.squared)
[1] 0.421277
Completely Randomized Design (CRD)
- Used when experimental material is homogeneous
Example
The following table shows some of the results of an experiment on the effects of applications of sulphur in reducing scab disease of potatoes. The object in applying sulphur is to increase the acidity of the soil, since scab does not thrive in very acid soil. In addition to untreated plots which serve as a control, 3 amounts of dressing were compared—300, 600, and 900 lb. per acre. Both a fall and a spring application of each amount was tested, so that in all there were seven distinct treatments. The sulphur was spread by hand on the surface of the soil, and then diced into a depth of about 4 inches. The quantity to be analyzed is the “scab index”. That is roughly speaking, the percentage of the surface area of the potato that is infected with scab. It is obtained by examining 100 potatoes at random from each plot, grading each potato on a scale from 0 to 100% infected, and taking the average.
ScabIndex <-
c(
12, 10, 24, 29,
30, 18, 32, 26,
9, 9, 16, 4,
30, 7, 21, 9,
16, 10, 18, 18,
18, 24, 12, 19,
10, 4, 4, 5,
17, 7, 16, 17
)
Trt <- factor(
x = rep(x = 1:7, times = c(8, 4, 4, 4, 4, 4, 4))
, labels = c("O", "F3", "S3", "F6", "S6", "F9", "S9")
)
df13 <- data.frame(Trt, ScabIndex)
df13
Trt ScabIndex
1 O 12
2 O 10
3 O 24
4 O 29
5 O 30
6 O 18
7 O 32
8 O 26
9 F3 9
10 F3 9
11 F3 16
12 F3 4
13 S3 30
14 S3 7
15 S3 21
16 S3 9
17 F6 16
18 F6 10
19 F6 18
20 F6 18
21 S6 18
22 S6 24
23 S6 12
24 S6 19
25 F9 10
26 F9 4
27 F9 4
28 F9 5
29 S9 17
30 S9 7
31 S9 16
32 S9 17
library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p13 <-
ggplot(data = df13, mapping = aes(x = Trt, y = ScabIndex)) +
geom_point() +
geom_boxplot() +
labs(x = "Treatment", y = "Scab Index") +
theme_bw()
print(p13)
fm13 <- lm(formula = ScabIndex ~ Trt, data = df13)
summary(fm13)
Call:
lm(formula = ScabIndex ~ Trt, data = df13)
Residuals:
Min 1Q Median 3Q Max
-12.625 -4.844 0.625 3.594 13.250
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 22.625 2.369 9.549 8.08e-10 ***
TrtF3 -13.125 4.104 -3.198 0.003734 **
TrtS3 -5.875 4.104 -1.432 0.164666
TrtF6 -7.125 4.104 -1.736 0.094858 .
TrtS6 -4.375 4.104 -1.066 0.296601
TrtF9 -16.875 4.104 -4.112 0.000372 ***
TrtS9 -8.375 4.104 -2.041 0.051977 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 6.702 on 25 degrees of freedom
Multiple R-squared: 0.4641, Adjusted R-squared: 0.3355
F-statistic: 3.608 on 6 and 25 DF, p-value: 0.01026
anova(fm13)
Analysis of Variance Table
Response: ScabIndex
Df Sum Sq Mean Sq F value Pr(>F)
Trt 6 972.34 162.057 3.6081 0.01026 *
Residuals 25 1122.88 44.915
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Randomized Complete Block Design (RCBD)
- Used when experimental material is heterogenous in one direction
Example
Yield : Yield of barley, SoilType : Soil Type, and Trt : 5 sources and a control
Yield <- c(
32.1, 35.6, 41.9, 35.4
, 30.1, 31.5, 37.1, 30.8
, 25.4, 27.4, 33.8, 31.1
, 24.1, 33.0, 35.6, 31.4
, 26.1, 31.0, 33.8, 31.9
, 23.2, 24.8, 26.7, 26.7
)
SoilType <- gl(n = 4, k = 6, length = 4*6
, labels = c("I", "II", "III", "IV"), ordered = FALSE)
Trt <- gl(n = 6, k = 1, length = 4*6
, labels = c("(NH4)2SO4", "NH4NO3", "CO(NH2)2", "Ca(NO3)2", "NaNO3", "Control"), ordered = FALSE)
df14 <- data.frame(Yield, SoilType, Trt)
df14
Yield SoilType Trt
1 32.1 I (NH4)2SO4
2 35.6 I NH4NO3
3 41.9 I CO(NH2)2
4 35.4 I Ca(NO3)2
5 30.1 I NaNO3
6 31.5 I Control
7 37.1 II (NH4)2SO4
8 30.8 II NH4NO3
9 25.4 II CO(NH2)2
10 27.4 II Ca(NO3)2
11 33.8 II NaNO3
12 31.1 II Control
13 24.1 III (NH4)2SO4
14 33.0 III NH4NO3
15 35.6 III CO(NH2)2
16 31.4 III Ca(NO3)2
17 26.1 III NaNO3
18 31.0 III Control
19 33.8 IV (NH4)2SO4
20 31.9 IV NH4NO3
21 23.2 IV CO(NH2)2
22 24.8 IV Ca(NO3)2
23 26.7 IV NaNO3
24 26.7 IV Control
library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p14 <-
ggplot(data = df14, mapping = aes(x = Trt, y = Yield)) +
geom_point() +
geom_boxplot() +
labs(x = "Treatment", y = "Yield") +
theme_bw()
print(p14)
fm14 <- lm(formula = Yield ~ SoilType + Trt, data = df14)
summary(fm14)
Call:
lm(formula = Yield ~ SoilType + Trt, data = df14)
Residuals:
Min 1Q Median 3Q Max
-7.0208 -2.4229 0.0792 2.1354 6.7958
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 35.354 2.826 12.511 2.44e-09 ***
SoilTypeII -3.500 2.664 -1.314 0.2087
SoilTypeIII -4.233 2.664 -1.589 0.1329
SoilTypeIV -6.583 2.664 -2.471 0.0259 *
TrtNH4NO3 1.050 3.263 0.322 0.7521
TrtCO(NH2)2 -0.250 3.263 -0.077 0.9399
TrtCa(NO3)2 -2.025 3.263 -0.621 0.5442
TrtNaNO3 -2.600 3.263 -0.797 0.4380
TrtControl -1.700 3.263 -0.521 0.6100
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.615 on 15 degrees of freedom
Multiple R-squared: 0.3512, Adjusted R-squared: 0.005215
F-statistic: 1.015 on 8 and 15 DF, p-value: 0.4653
anova(fm14)
Analysis of Variance Table
Response: Yield
Df Sum Sq Mean Sq F value Pr(>F)
SoilType 3 133.62 44.539 2.0915 0.1443
Trt 5 39.31 7.862 0.3692 0.8618
Residuals 15 319.43 21.295
Latin Square Design
- Used when experimental material is heterogenous in two perpendicular directions
Example
The following table shows the field layout and yield of a 5×5 Latin square experiment on the effect of spacing on yield of millet plants. Five levels of spacing were used. The data on yield (grams/plot) was recorded and is given below:
Yield <- c(
257, 230, 279, 287, 202,
245, 283, 245, 280, 260,
182, 252, 280, 246, 250,
203, 204, 227, 193, 259,
231, 271, 266, 334, 338
)
Row <- factor(x = rep(x = 1:5, each = 5))
Col <- factor(x = rep(x = 1:5, times = 5))
Trt <- c(
"B", "E", "A", "C", "D"
, "D", "A", "E", "B", "C"
, "E", "B", "C", "D", "A"
, "A", "C", "D", "E", "B"
, "C", "D", "B", "A", "E"
)
df15 <- data.frame(Yield, Row, Col, Trt)
df15
Yield Row Col Trt
1 257 1 1 B
2 230 1 2 E
3 279 1 3 A
4 287 1 4 C
5 202 1 5 D
6 245 2 1 D
7 283 2 2 A
8 245 2 3 E
9 280 2 4 B
10 260 2 5 C
11 182 3 1 E
12 252 3 2 B
13 280 3 3 C
14 246 3 4 D
15 250 3 5 A
16 203 4 1 A
17 204 4 2 C
18 227 4 3 D
19 193 4 4 E
20 259 4 5 B
21 231 5 1 C
22 271 5 2 D
23 266 5 3 B
24 334 5 4 A
25 338 5 5 E
library(ggplot2)
# if (!require("ggplot2")) install.packages("ggplot2")
p15 <-
ggplot(data = df15, mapping = aes(x = Trt, y = Yield)) +
geom_point() +
geom_boxplot() +
labs(x = "Treatment", y = "Yield") +
theme_bw()
print(p15)
fm15 <- lm(formula = Yield ~ Row + Col + Trt, data = df15)
summary(fm15)
Call:
lm(formula = Yield ~ Row + Col + Trt, data = df15)
Residuals:
Min 1Q Median 3Q Max
-44.68 -12.48 1.12 16.52 54.92
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 240.08 23.43 10.247 2.75e-07 ***
Row2 11.60 20.55 0.565 0.5828
Row3 -9.00 20.55 -0.438 0.6692
Row4 -33.80 20.55 -1.645 0.1259
Row5 37.00 20.55 1.801 0.0969 .
Col2 24.40 20.55 1.187 0.2580
Col3 35.80 20.55 1.742 0.1070
Col4 44.40 20.55 2.161 0.0516 .
Col5 38.20 20.55 1.859 0.0877 .
TrtB -7.00 20.55 -0.341 0.7393
TrtC -17.40 20.55 -0.847 0.4137
TrtD -31.60 20.55 -1.538 0.1500
TrtE -32.20 20.55 -1.567 0.1431
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 32.49 on 12 degrees of freedom
Multiple R-squared: 0.6536, Adjusted R-squared: 0.3073
F-statistic: 1.887 on 12 and 12 DF, p-value: 0.1426
anova(fm15)
Analysis of Variance Table
Response: Yield
Df Sum Sq Mean Sq F value Pr(>F)
Row 4 13601.4 3400.3 3.2212 0.05164 .
Col 4 6146.2 1536.5 1.4556 0.27581
Trt 4 4156.6 1039.1 0.9844 0.45229
Residuals 12 12667.3 1055.6
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Factorial Experiment under RCBD
- Three strains of perennial ryegrass were grown as swards at each of the fertilizer levels. The three strains were S23, Kent & a control. The fertilizer levels were heavy & average. The experiment was laid out under randomized complete block design. The mid summer dry matter yields, in units of 10 lb/acre were as follows. Analyze the data & construct a table of means with appropriate standard errors. Draw any graphs you deem fit to explain the results & write a short report on the conclusions to be drawn from this experiment.
DMY <-
c(
299, 318, 284, 279,
247, 202, 171, 183,
403, 439, 355, 324,
222, 170, 192, 176,
382, 353, 383, 310,
233, 216, 200, 143
)
Blk <- gl(n = 4, k = 1, length = 24, labels = c("I", "II", "III", "IV"))
Vart <- gl(n = 3, k = 8, length = 24, labels = c("S23", "Control", "Kent"))
Manu <- gl(n = 2, k = 4, length = 24, labels = c("Heavy", "Average"))
df1 <- data.frame(DMY, Blk, Vart, Manu)
df1
DMY Blk Vart Manu
1 299 I S23 Heavy
2 318 II S23 Heavy
3 284 III S23 Heavy
4 279 IV S23 Heavy
5 247 I S23 Average
6 202 II S23 Average
7 171 III S23 Average
8 183 IV S23 Average
9 403 I Control Heavy
10 439 II Control Heavy
11 355 III Control Heavy
12 324 IV Control Heavy
13 222 I Control Average
14 170 II Control Average
15 192 III Control Average
16 176 IV Control Average
17 382 I Kent Heavy
18 353 II Kent Heavy
19 383 III Kent Heavy
20 310 IV Kent Heavy
21 233 I Kent Average
22 216 II Kent Average
23 200 III Kent Average
24 143 IV Kent Average
fm1 <- lm(formula = DMY ~ Blk + Vart * Manu, data = df1)
anova(fm1)
Analysis of Variance Table
Response: DMY
Df Sum Sq Mean Sq F value Pr(>F)
Blk 3 12814 4271 7.1611 0.003294 **
Vart 2 6196 3098 5.1935 0.019324 *
Manu 1 131128 131128 219.8375 2.287e-10 ***
Vart:Manu 2 9590 4795 8.0389 0.004239 **
Residuals 15 8947 596
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library(emmeans)
emmip(fm1, Vart ~ Manu)
emmip(fm1, Manu ~ Vart)
emmeans(fm1, ~ Vart)
Vart emmean SE df lower.CL upper.CL
S23 248 8.63 15 229 266
Control 285 8.63 15 267 304
Kent 278 8.63 15 259 296
Results are averaged over the levels of: Blk, Manu
Confidence level used: 0.95
emmeans(fm1, ~ Manu)
Manu emmean SE df lower.CL upper.CL
Heavy 344 7.05 15 329 359
Average 196 7.05 15 181 211
Results are averaged over the levels of: Blk, Vart
Confidence level used: 0.95
emmeans(fm1, ~ Manu|Vart)
Vart = S23:
Manu emmean SE df lower.CL upper.CL
Heavy 295 12.2 15 269 321
Average 201 12.2 15 175 227
Vart = Control:
Manu emmean SE df lower.CL upper.CL
Heavy 380 12.2 15 354 406
Average 190 12.2 15 164 216
Vart = Kent:
Manu emmean SE df lower.CL upper.CL
Heavy 357 12.2 15 331 383
Average 198 12.2 15 172 224
Results are averaged over the levels of: Blk
Confidence level used: 0.95
emmeans(fm1, ~ Vart|Manu)
Manu = Heavy:
Vart emmean SE df lower.CL upper.CL
S23 295 12.2 15 269 321
Control 380 12.2 15 354 406
Kent 357 12.2 15 331 383
Manu = Average:
Vart emmean SE df lower.CL upper.CL
S23 201 12.2 15 175 227
Control 190 12.2 15 164 216
Kent 198 12.2 15 172 224
Results are averaged over the levels of: Blk
Confidence level used: 0.95
emmeans(fm1, pairwise ~ Vart)
$emmeans
Vart emmean SE df lower.CL upper.CL
S23 248 8.63 15 229 266
Control 285 8.63 15 267 304
Kent 278 8.63 15 259 296
Results are averaged over the levels of: Blk, Manu
Confidence level used: 0.95
$contrasts
contrast estimate SE df t.ratio p.value
S23 - Control -37.25 12.2 15 -3.050 0.0208
S23 - Kent -29.62 12.2 15 -2.426 0.0689
Control - Kent 7.62 12.2 15 0.624 0.8092
Results are averaged over the levels of: Blk, Manu
P value adjustment: tukey method for comparing a family of 3 estimates
emmeans(fm1, pairwise ~ Manu)
$emmeans
Manu emmean SE df lower.CL upper.CL
Heavy 344 7.05 15 329 359
Average 196 7.05 15 181 211
Results are averaged over the levels of: Blk, Vart
Confidence level used: 0.95
$contrasts
contrast estimate SE df t.ratio p.value
Heavy - Average 148 9.97 15 14.827 <.0001
Results are averaged over the levels of: Blk, Vart
emmeans(fm1, pairwise ~ Vart|Manu)
$emmeans
Manu = Heavy:
Vart emmean SE df lower.CL upper.CL
S23 295 12.2 15 269 321
Control 380 12.2 15 354 406
Kent 357 12.2 15 331 383
Manu = Average:
Vart emmean SE df lower.CL upper.CL
S23 201 12.2 15 175 227
Control 190 12.2 15 164 216
Kent 198 12.2 15 172 224
Results are averaged over the levels of: Blk
Confidence level used: 0.95
$contrasts
Manu = Heavy:
contrast estimate SE df t.ratio p.value
S23 - Control -85.25 17.3 15 -4.936 0.0005
S23 - Kent -62.00 17.3 15 -3.590 0.0071
Control - Kent 23.25 17.3 15 1.346 0.3927
Manu = Average:
contrast estimate SE df t.ratio p.value
S23 - Control 10.75 17.3 15 0.622 0.8102
S23 - Kent 2.75 17.3 15 0.159 0.9861
Control - Kent -8.00 17.3 15 -0.463 0.8893
Results are averaged over the levels of: Blk
P value adjustment: tukey method for comparing a family of 3 estimates
emmeans(fm1, pairwise ~ Manu|Vart)
$emmeans
Vart = S23:
Manu emmean SE df lower.CL upper.CL
Heavy 295 12.2 15 269 321
Average 201 12.2 15 175 227
Vart = Control:
Manu emmean SE df lower.CL upper.CL
Heavy 380 12.2 15 354 406
Average 190 12.2 15 164 216
Vart = Kent:
Manu emmean SE df lower.CL upper.CL
Heavy 357 12.2 15 331 383
Average 198 12.2 15 172 224
Results are averaged over the levels of: Blk
Confidence level used: 0.95
$contrasts
Vart = S23:
contrast estimate SE df t.ratio p.value
Heavy - Average 94.2 17.3 15 5.458 0.0001
Vart = Control:
contrast estimate SE df t.ratio p.value
Heavy - Average 190.2 17.3 15 11.016 <.0001
Vart = Kent:
contrast estimate SE df t.ratio p.value
Heavy - Average 159.0 17.3 15 9.207 <.0001
Results are averaged over the levels of: Blk
Stability Analysis
library(stability)
library(help = stability)
data(ge_data)
Individual Analysis of Variance for each Location
Yield.indiv_anova <-
indiv_anova(
.data = ge_data
, .y = Yield
, .rep = Rep
, .gen = Gen
, .env = Env
)
Yield.indiv_anova
$`Analysis of Variance for FSD`
Df Sum Sq Mean Sq F value Pr(>F)
Gen 59 45524231 771597 32.62 <2e-16 ***
Residuals 60 1419180 23653
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
$`Analysis of Variance for FSDR`
Df Sum Sq Mean Sq F value Pr(>F)
Gen 59 14019781 237623 2.724 7.91e-05 ***
Residuals 60 5234860 87248
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
$`Analysis of Variance for Okara`
Df Sum Sq Mean Sq F value Pr(>F)
Gen 59 26802597 454281 3.538 1.22e-06 ***
Residuals 60 7703642 128394
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
$`Analysis of Variance for Sargodha`
Df Sum Sq Mean Sq F value Pr(>F)
Gen 59 38381469 650533 4.254 4.3e-08 ***
Residuals 60 9175301 152922
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
$`Analysis of Variance for Gujranwala`
Df Sum Sq Mean Sq F value Pr(>F)
Gen 59 41863914 709558 4.687 6.55e-09 ***
Residuals 60 9082956 151383
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
$`Analysis of Variance for KotNaina`
Df Sum Sq Mean Sq F value Pr(>F)
Gen 59 26692263 452411 4.3 3.51e-08 ***
Residuals 60 6313095 105218
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
$`Analysis of Variance for KSK`
Df Sum Sq Mean Sq F value Pr(>F)
Gen 59 17890629 303231 3.472 1.68e-06 ***
Residuals 60 5239575 87326
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
$`Analysis of Variance for Khanewal`
Df Sum Sq Mean Sq F value Pr(>F)
Gen 59 15125214 256360 2.451 0.000345 ***
Residuals 60 6274965 104583
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
$`Analysis of Variance for Sahiwal`
Df Sum Sq Mean Sq F value Pr(>F)
Gen 59 26109918 442541 4.095 8.82e-08 ***
Residuals 60 6484679 108078
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
$`Analysis of Variance for Bhakkar`
Df Sum Sq Mean Sq F value Pr(>F)
Gen 59 10870598 184247 1.862 0.00885 **
Residuals 60 5936717 98945
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
$`Analysis of Variance for Bahawalnagar`
Df Sum Sq Mean Sq F value Pr(>F)
Gen 59 9718292 164717 1.716 0.0196 *
Residuals 60 5760765 96013
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Combined Analysis of Variance
YieldANOVA <-
add_anova(
.data = ge_data
, .y = Yield
, .rep = Rep
, .gen = Gen
, .env = Env
)
YieldANOVA
$anova
Analysis of Variance Table
Response: .data[[Y]]
Df Sum Sq Mean Sq F value Pr(>F)
Env 10 263940496 26394050 68.0551 1.995e-08 ***
Rep(Env) 11 4266167 387833
Gen 59 71662431 1214617 12.2482 < 2.2e-16 ***
Gen:Env 590 201336476 341248 3.4411 < 2.2e-16 ***
Residuals 649 64359568 99167
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Additive Main Effects and Multiplicative Interaction (AMMI) Analysis
Yield.ammi <-
ammi(
.data = ge_data
, .y = Yield
, .rep = Rep
, .gen = Gen
, .env = Env
)
Yield.ammi
$anova
Analysis of Variance Table
Response: .data[[Y]]
Df Sum Sq Mean Sq F value Pr(>F)
Env 10 263940496 26394050 68.0551 1.995e-08 ***
Rep(Env) 11 4266167 387833
Gen 59 71662431 1214617 12.2482 < 2.2e-16 ***
Gen:Env 590 201336476 341248 3.4411 < 2.2e-16 ***
Residuals 649 64359568 99167
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
$pc.anova
# A tibble: 11 x 9
PC SingVal Percent accumPerc Df SS Mean.Sq F.value Pr..F.
<fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 PC1 4.50e+ 3 2.01e+ 1 20.1 68 4.04e+ 7 5.95e+ 5 6.00e+ 0 0.
2 PC2 4.17e+ 3 1.73e+ 1 37.4 66 3.49e+ 7 5.28e+ 5 5.33e+ 0 0.
3 PC3 4.05e+ 3 1.63e+ 1 53.7 64 3.27e+ 7 5.12e+ 5 5.16e+ 0 0.
4 PC4 3.49e+ 3 1.21e+ 1 65.8 62 2.44e+ 7 3.93e+ 5 3.96e+ 0 0.
5 PC5 3.38e+ 3 1.13e+ 1 77.1 60 2.29e+ 7 3.81e+ 5 3.84e+ 0 0.
6 PC6 3.11e+ 3 9.63e+ 0 86.7 58 1.94e+ 7 3.34e+ 5 3.37e+ 0 4.02e-14
7 PC7 2.30e+ 3 5.25e+ 0 92.0 56 1.06e+ 7 1.89e+ 5 1.90e+ 0 1.45e- 4
8 PC8 2.03e+ 3 4.10e+ 0 96.1 54 8.25e+ 6 1.53e+ 5 1.54e+ 0 9.52e- 3
9 PC9 1.52e+ 3 2.30e+ 0 98.4 52 4.63e+ 6 8.91e+ 4 8.98e- 1 6.77e- 1
10 PC10 1.28e+ 3 1.63e+ 0 100 50 3.27e+ 6 6.55e+ 4 6.60e- 1 9.66e- 1
11 PC11 1.92e-11 3.64e-28 100 48 7.34e-22 1.53e-23 1.54e-28 1.00e+ 0
Additive Main Effects and Multiplicative Interaction (AMMI) Biplot Analysis
ammi_biplot(
.data = ge_data
, .y = Yield
, .rep = Rep
, .gen = Gen
, .env = Env
)
$aami.biplot
$MeanPC1Plot
Genotype plus Genotypes by Environment (GGE) Interaction Biplot Analysis
gge_biplot(
.data = ge_data
, .y = Yield
, .rep = Rep
, .gen = Gen
, .env = Env
)
$gge.biplot