# $\color{green}{\textit{Statistics:}}$ $\color{green}{\textit{The Art & Science of}}$ $\color{green}{\textit{Learning from Data}}$

# Introduction

## Introduction

• $\color{green}{\textit{In God we trust, all others must bring data.}}$
($\color{black}{\textit{W. Edwards Deming}}$)

• $\color{green}{\textit{Statistical thinking will one day be as necessary a}}$ $\color{green}{\textit{qualification for efficient citizenship as the ability}}$ $\color{green}{\textit{to read & write.}}$
($\color{black}{\textit{H.G. Wells}}$)

## Introduction

• $\color{green}{\textit{To call in the statistician after the experiment is}}$ $\color{green}{\textit{done may be no more than asking him to perform}}$ $\color{green}{\textit{a postmortem examination: he may be able to say}}$ $\color{green}{\textit{what the experiment died of.}}$
($\color{black}{\textit{R. A. Fisher}}$)

• $\color{green}{\textit{If all you have is a hammer, everything looks like}}$ $\color{green}{\textit{a nail.}}$
($\color{black}{\textit{Abraham Maslow}}$)

## Introduction

• $\color{red}{\text{Mathematics}}$ is the $\color{red}{\text{language}}$ of $\color{red}{\text{Science}}$

• $2 + 2 = 4$
• $0^{\circ}C = 32^{\circ}F$ ($\color{blue}{\text{Paradox}}$)
• Every action has an equal and opposite reaction. ($\color{blue}{\text{Newton's Third Law}}$)
• $\color{green}{\text{Statistics}}$ is the science of $\color{green}{\text{uncertainty}}$ and $\color{green}{\text{variability}}$

• $\color{green}{\text{Statistics}}$ is the $\color{green}{\text{interpretation}}$ of $\color{green}{\text{Science}}$

• $\color{green}{\text{D}}$ata $\color{green}{\text{D}}$riven $\color{green}{\text{D}}$ecisions ($\color{green}{\text{3Ds}}$)

## Reasoning

• $\color{green}{\text{Deduction:}}$
• Reasoning from $\color{green}{\text{general}}$ to $\color{green}{\text{particular}}$.

• Man is mortal. → Every human being is mortal.

• $\color{red}{\text{Induction:}}$
• Reasoning from $\color{red}{\text{particular}}$ to $\color{red}{\text{general}}$.

## Statistical Reasoning & Analysis

• $\color{green}{\text{Statistics}}$ is the $\color{green}{\text{science}}$ of $\color{green}{\text{uncertainty}}$ & $\color{green}{\text{variability}}$

• Turning $\color{green}{\text{Data}}$ into $\color{green}{\text{Information}}$

• $\color{green}{\text{Data}}$ → $\color{green}{\text{Information}}$ → $\color{green}{\text{Knowledge}}$ → $\color{green}{\text{Wisdom}}$

• $\color{green}{\text{Statistics}}$ is the $\color{green}{\text{Art}}$ and $\color{green}{\text{Science}}$ of $\color{green}{\text{learning}}$ from $\color{green}{\text{Data}}$.

# Variable

## Variable

• $\color{red}{\text{Variable:}}$ A $\color{green}{\text{characteristic}}$ that may $\color{green}{\text{vary}}$ from $\color{green}{\text{subject}}$ to $\color{green}{\text{subject}}$

• $\color{red}{\text{Variables}}$ are $\color{green}{\text{denoted}}$ by the $\color{green}{\text{last}}$ $\color{green}{\text{letters}}$ of the $\color{green}{\text{English}}$ $\color{green}{\text{alphabet}}$ in $\color{green}{\text{upper}}$ $\color{green}{\text{case}}$ (e.g., $X$, $Y$, $Z$)

• $\color{green}{\text{Different observations}}$ of a $\color{red}{\text{variable}}$ are $\color{green}{\text{distinguished}}$ by $\color{green}{\text{subscripts}}$ (e.g., $X_1, X_2, \ldots, X_n$)

# Measurement & Measurement Scales

[Figure: Traffic Signal]

## Measurement & Measurement Scales

• $\color{green}{\text{Measurement}}$
• The process of assigning numbers or labels to objects or states in accordance with specific logically accepted rules.

• $\color{green}{\text{Measurement Scales}}$
• Data can be classified according to levels of measurement.
• The level of measurement of the data often dictates the calculations that can be done to summarize and present the data.
• It will also determine the statistical tests that should be performed.
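
As a minimal sketch of how the level of measurement constrains the summary that can be computed, the snippet below uses Python's standard `statistics` module. The mapping from scale to summary (mode for nominal, median for ordinal, mean for interval or ratio) is a common textbook convention assumed here, and the `summarize` helper and the sample data are hypothetical.

```python
import statistics

def summarize(values, scale):
    """Return a (name, value) summary appropriate to the measurement scale."""
    if scale == "nominal":
        # Categories have no order: only counting is meaningful, so use the mode.
        return ("mode", statistics.mode(values))
    if scale == "ordinal":
        # Categories are ordered, so a middle position is meaningful.
        # median_low returns an actual data value instead of averaging two.
        return ("median", statistics.median_low(values))
    if scale in ("interval", "ratio"):
        # Differences are meaningful, so the arithmetic mean can be used.
        return ("mean", statistics.mean(values))
    raise ValueError(f"unknown scale: {scale}")

# Hypothetical data for illustration only.
blood_groups = ["A", "B", "O", "O", "AB", "O"]   # nominal
satisfaction = [1, 2, 2, 3, 5]                   # ordinal codes (1 = low, 5 = high)
temperatures = [20.0, 22.0, 24.0]                # interval (degrees Celsius)

print(summarize(blood_groups, "nominal"))
print(summarize(satisfaction, "ordinal"))
print(summarize(temperatures, "interval"))
```

The same idea extends to the choice of statistical test, which the snippet does not attempt to cover.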

## Measurement & Measurement Scales

[Figure: Measurement Scales]

# Types of Variables

## Qualitative & Quantitative Variables

• $\color{green}{\text{Qualitative}}$
• Nominal or Ordinal variables
• $\color{green}{\text{Quantitative}}$
• Interval or Ratio variables
• Discrete
• Continuous
• Normal
• Non-Normal

## Dependent & Independent Variables

• $\color{green}{\text{Dependent Variable}}$
• Variable influenced by other variable(s)

• $\color{green}{\text{Independent Variable}}$
• Variable influencing other variable(s)

# Relationship between Variables

## Dependent & Independent Variables

• $\color{green}{\text{Expenditures & Income}}$
• Expenditures are influenced by Income.
• $\color{green}{\text{Dependent Variable:}}$ Expenditures
• $\color{green}{\text{Independent Variable:}}$ Income
• $\text{Expenditures} \mathrel{\color{red}\leftarrow} \text{Income}$

• $\color{green}{\text{CGPA & Study Hours}}$
• CGPA is influenced by Study Hours.
• $\color{green}{\text{Dependent Variable:}}$ CGPA
• $\color{green}{\text{Independent Variable:}}$ Study Hours
• $\text{CGPA} \mathrel{\color{red}\leftarrow} \text{Study Hours}$

## Dependent & Independent Variables

• $\color{green}{\text{Landline Phone Bill & Number of Calls}}$
• Landline Phone Bill is affected by Number of Calls made.
• $\color{green}{\text{Dependent Variable:}}$ Landline Phone Bill
• $\color{green}{\text{Independent Variable:}}$ No. of Calls
• $\text{Bill} \mathrel{\color{red}\leftarrow} \text{No. of Calls}$

• $\color{green}{\text{Crop Production & Amount of Fertilizer}}$
• Crop Production is influenced by Amount of Fertilizer used.
• $\color{green}{\text{Dependent Variable:}}$ Crop Production
• $\color{green}{\text{Independent Variable:}}$ Amount of Fertilizer
• $\text{Crop Production} \mathrel{\color{red}\leftarrow} \text{Amount of Fertilizer}$

# Types of Relationship

## Mathematical Relationship

• $\color{green}{\text{Mathematical Relationship}}$
• $\color{red}{\text{Exact Relationship}}$
• $Y = f\left(X\right)$
• $Y \mathrel{\color{red}\leftarrow} X$

## Mathematical Relationship

• $\color{green}{\text{Relationship between Area and Radius of a Circle}}$
• $A = f\left(r\right)$
• $A = \pi r^{2}$
• $A \mathrel{\color{red}\leftarrow} r$

• $\color{green}{\text{Relationship between Landline Phone Bill and Number of Calls made}}$
• $\text{Bill} = f\left(\text{No. of Calls}\right)$
• $\text{Bill} \mathrel{\color{red}\leftarrow} \text{No. of Calls}$
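
The circle example can be sketched in Python to show what "exact" means in practice: the same input always produces the same output, with no error term. The function name `circle_area` is chosen here for illustration.

```python
import math

def circle_area(r):
    """Exact (mathematical) relationship: A = pi * r**2."""
    return math.pi * r ** 2

# The same radius always gives the same area: nothing is left unexplained.
for r in (1.0, 2.0, 3.0):
    print(r, circle_area(r))
```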

## Statistical Relationship

• $\color{green}{\text{Statistical Relationship}}$
• $\color{red}{\text{Inexact or Probabilistic Relationship}}$
• $Y = f\left(X\right)+\epsilon$
• $Y \mathrel{\color{red}\leftarrow} X$

## Statistical Relationship

• $\color{green}{\text{Relationship between Expenditures and Income}}$
• $\text{Expenditures} = f\left(\text{Income}\right)+\epsilon$
• $\text{Expenditures} \mathrel{\color{red}\leftarrow} \text{Income}$

• $\color{green}{\text{Relationship between CGPA and Study Hours}}$
• $\text{CGPA} = f\left(\text{Study Hours}\right)+\epsilon$
• $\text{CGPA} \mathrel{\color{red}\leftarrow} \text{Study Hours}$
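
A small simulation can illustrate the role of $\epsilon$ in $Y = f(X) + \epsilon$. This is only a sketch: the systematic part $f(X) = 100 + 0.6X$, the Normal error with standard deviation 50, and the function name `expenditures` are all assumptions made for illustration, not values from the slides.

```python
import random

def expenditures(income, rng):
    """Statistical relationship: Y = f(X) + epsilon.

    f(X) = 100 + 0.6 * X and a Normal(0, 50) error are assumed for illustration.
    """
    systematic = 100 + 0.6 * income   # f(X): the part explained by income
    epsilon = rng.gauss(0, 50)        # random error: the unexplained part
    return systematic + epsilon

rng = random.Random(42)  # fixed seed so the run is reproducible

# Two households with the same income generally report different expenditures.
y1 = expenditures(1000, rng)
y2 = expenditures(1000, rng)
print(y1, y2)
```

Unlike the exact relationship, repeated observations at the same income differ, while their average settles near $f(X)$.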

# Statistical Models

## Models

$\color{green}{\textit{All models are wrong,}}$ $\color{green}{\textit{but some are useful.}}$
($\color{black}{\textit{George E. P. Box}}$)

## Linear Model

• $\color{green}{\text{Expenditures & Income}}$
• $\color{green}{\text{Expenditures are influenced by Income}}$
• $\text{Expenditures} \mathrel{\color{red}\leftarrow} \text{Income}$
• $\color{green}{\text{Expenditures & Gender}}$
• $\color{green}{\text{Expenditures are influenced by Gender}}$
• $\text{Expenditures} \mathrel{\color{red}\leftarrow} \text{Gender}$
• $\color{green}{\text{Expenditures, Income & Gender}}$
• $\color{green}{\text{Expenditures are influenced by Income & Gender}}$
• $\text{Expenditures} \mathrel{\color{red}\leftarrow} \text{Income} + \text{Gender}$

## Linear Model

• $\color{green}{\text{CGPA & Study Hours}}$
• $\color{green}{\text{CGPA is influenced by Study Hours}}$
• $\text{CGPA} \mathrel{\color{red}\leftarrow} \text{Study Hours}$
• $\color{green}{\text{CGPA & Gender}}$
• $\color{green}{\text{CGPA is influenced by Gender}}$
• $\text{CGPA} \mathrel{\color{red}\leftarrow} \text{Gender}$
• $\color{green}{\text{CGPA, Study Hours & Gender}}$
• $\color{green}{\text{CGPA is influenced by Study Hours & Gender}}$
• $\text{CGPA} \mathrel{\color{red}\leftarrow} \text{Study Hours} + \text{Gender}$

## Linear Model

• $\color{green}{\text{Weight Gain & Intake}}$
• $\color{green}{\text{Weight Gain is influenced by Intake}}$
• $\text{Weight Gain} \mathrel{\color{red}\leftarrow} \text{Intake}$
• $\color{green}{\text{Weight Gain & Feed Type}}$
• $\color{green}{\text{Weight Gain is influenced by Feed Type}}$
• $\text{Weight Gain} \mathrel{\color{red}\leftarrow} \text{Feed Type}$
• $\color{green}{\text{Weight Gain, Intake & Feed Type}}$
• $\color{green}{\text{Weight Gain is influenced by Intake & Feed Type}}$
• $\text{Weight Gain} \mathrel{\color{red}\leftarrow} \text{Intake} + \text{Feed Type}$

## Linear Model

• $\color{green}{\text{Yield & Amount of Fertilizer}}$
• $\color{green}{\text{Yield of a crop is influenced by Amount of Fertilizer}}$
• $\text{Yield} \mathrel{\color{red}\leftarrow} \text{Amount of Fertilizer}$
• $\color{green}{\text{Yield & Varieties}}$
• $\color{green}{\text{Yield of a crop is influenced by Varieties}}$
• $\text{Yield} \mathrel{\color{red}\leftarrow} \text{Varieties}$
• $\color{green}{\text{Yield, Amount of Fertilizer & Varieties}}$
• $\color{green}{\text{Yield of a crop is influenced by Amount of Fertilizer & Varieties}}$
• $\text{Yield} \mathrel{\color{red}\leftarrow} \text{Amount of Fertilizer} + \text{Varieties}$
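
One way to write the last relationship as a linear model is sketched below. The coefficients $\beta_0, \beta_1, \beta_2$, the additive form, and the coding of $\text{Variety}_i$ as a 0/1 indicator (which assumes only two varieties) are assumptions made for illustration; the slides state only which variables influence Yield.

$$
\text{Yield}_i = \beta_0 + \beta_1\,\text{Fertilizer}_i + \beta_2\,\text{Variety}_i + \epsilon_i, \qquad i = 1, \ldots, n
$$

Here $\epsilon_i$ is the random error for the $i$-th observation, the same term that distinguishes a statistical relationship from a mathematical one.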

# Regression Model

## Regression Model

• $\color{green}{\text{Quantify the dependency of a Normal variable on}}$ $\color{green}{\text{one or more quantitative variable(s)}}$
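
As a minimal sketch of what "quantifying the dependency" can look like, the snippet below fits a simple linear regression by ordinary least squares in plain Python. The choice of a single predictor, the function name `fit_simple_regression`, and the CGPA data are hypothetical and serve only as an illustration.

```python
def fit_simple_regression(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1 * x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through (x_bar, y_bar).
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data: study hours (x) and CGPA (y).
hours = [1, 2, 3, 4, 5]
cgpa = [2.1, 2.4, 2.9, 3.1, 3.5]

b0, b1 = fit_simple_regression(hours, cgpa)
print(f"CGPA = {b0:.2f} + {b1:.2f} * Study Hours")
```

The slope `b1` estimates the change in the dependent variable per unit change in the independent variable.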

## Regression Model

[Figure: Population Regression Function]

# ANOVA Model

## ANOVA Model

• $\color{green}{\text{Compare the means of a Normal dependent variable}}$ $\color{green}{\text{across the levels of one or more factors}}$
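
The comparison of means can be sketched with the one-way ANOVA $F$ statistic, computed here in plain Python. A single factor is assumed, and the function name `one_way_anova_f` and the weight-gain data are hypothetical.

```python
def one_way_anova_f(groups):
    """Return the F statistic for a one-way ANOVA on a list of samples."""
    k = len(groups)                    # number of factor levels
    n = sum(len(g) for g in groups)    # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-level sum of squares: level means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-level sum of squares: observations around their own level mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical weight gains under three feed types.
feed_a = [1.0, 2.0, 3.0]
feed_b = [2.0, 3.0, 4.0]
feed_c = [6.0, 7.0, 8.0]

print(one_way_anova_f([feed_a, feed_b, feed_c]))
```

A large $F$ indicates that the variation between level means is large relative to the variation within levels.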

[Figure: ANOVA Model]

# ANCOVA Model

## ANCOVA Model

• $\color{green}{\text{Quantify the dependency of a Normal variable on}}$ $\color{green}{\text{one or more quantitative variable(s)}}$

• $\color{green}{\text{Compare the means of a Normal dependent variable}}$ $\color{green}{\text{across the levels of one or more factors}}$
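
The two points above can be combined in one equation. A common textbook form of the ANCOVA model with one factor and one covariate is sketched below; the notation ($\mu$, $\tau_i$, $\beta$, $\bar{X}$) is assumed here and does not come from the slides.

$$
Y_{ij} = \mu + \tau_i + \beta\,(X_{ij} - \bar{X}) + \epsilon_{ij}
$$

In the Weight Gain example, $\tau_i$ would be the effect of the $i$-th Feed Type (the factor), $X_{ij}$ the Intake (the quantitative covariate) with overall mean $\bar{X}$, and $\epsilon_{ij}$ the random error.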

[Figure: ANCOVA Model]

# Treatment Structure

## Treatment Structure

[Figure: Treatment Structure]
