
Back to Basics- Linear Relations

  • mahimagupta950
  • Jun 2, 2020
  • 2 min read

Do you often come across terms that you have read a hundred times before (e.g. binomial distribution, variables, parameters, covariance, R^2, p-value, etc.) that still manage to baffle you?

Well, it's a classic case of #Dejapoo (adj. the feeling that you've heard this crap before).


Fret not! In this blog post (and the upcoming posts of the series 'Back to Basics'), we'll go through the basics of statistics.


Statistics usually entails the study of more than one variable at a time, which requires finding the relationship between or among those variables. Variables can be related either linearly or non-linearly (a curvilinear relation). In this blog post we will focus on linear relationships between variables and how to measure them.


A linear relationship is a statistical term used to describe a straight-line relationship between two variables. It can be represented either graphically or mathematically.
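In mathematical form, a straight-line relationship between two variables x and y is commonly written as shown here. The symbols a (intercept) and b (slope) are the usual textbook notation and are an assumption, not something defined elsewhere in this post.

```latex
y = a + b\,x
```

When b is positive the line rises from left to right (a positive relation); when b is negative it falls (a negative relation).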


The technique for studying the relationship between two or more variables can be broadly grouped under two heads:

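a) Correlation- measures the strength and direction of association, generally between two interdependent variables; the measures covered here capture linear relations only.

b) Regression- models how a dependent variable changes with one or more independent variables.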

(Regression will be dealt with in a separate post)


Measurement of Linear Relations- There are three simple ways of finding out the degree of linear association between two variables:

1. Scatter Plots- When the paired data of two variables are plotted on a two-dimensional graph, the result is called a scatter plot. It is the simplest way of evaluating the nature of the relationship between two variables. However, it does not provide any numerical measure of that relationship.

Eg. Scatter plot depicting the relationship between number of at-bats and the number of hits- strong positive relation










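2. Covariance- It is a measure of how two variables change together, that is, whether one tends to increase or decrease as the other increases. It captures the linear association between the two variables. For a sample of n paired observations, it is calculated by the following formula:

$$\operatorname{cov}(X,Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$

Where X and Y are the two variables and X bar and Y bar are the arithmetic means of the two datasets (for a full population, divide by n instead of n - 1). Covariance determines the direction of the relationship and can be negative. Because its magnitude depends on the units of X and Y, it is hard to read as a measure of strength on its own.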

cov(X,Y) > 0 X and Y are positively correlated

cov(X,Y) < 0 X and Y are inversely correlated

cov(X,Y) = 0 X and Y have no linear relationship (they are uncorrelated, though not necessarily independent)
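The three cases above can be checked with a small sketch in plain Python. It assumes the sample form of covariance (sum of the products of deviations from the means, divided by n - 1); the function name and all the data are made up purely for illustration.

```python
def covariance(xs, ys):
    """Sample covariance: sum of (x - x_bar) * (y - y_bar), divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical data, invented for this example.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]    # tends to rise as hours rise
errors = [9, 8, 6, 5, 2]        # tends to fall as hours rise

# Y is fully determined by X (Y = X squared), yet the covariance is zero
# because the relationship is symmetric around the mean, not linear.
x_sym = [-2, -1, 0, 1, 2]
y_sq = [x * x for x in x_sym]

cov_pos = covariance(hours, score)
cov_neg = covariance(hours, errors)
cov_zero = covariance(x_sym, y_sq)

print(cov_pos)   # positive
print(cov_neg)   # negative
print(cov_zero)  # zero, even though y_sq depends entirely on x_sym
```

The last case is why a covariance of zero should be read as "no linear relationship" and not as proof that the variables are independent.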


3. Pearson's Correlation Coefficient (r)- Correlation is a measure of the intensity or degree of the linear relationship between two variables. It is applicable when the data are continuous in nature. It is calculated by the following formula:

$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$

Where X and Y are the two variables and X bar and Y bar are the arithmetic means of the two datasets.

The correlation coefficient is unitless and its value ranges from -1 to +1, with +1 indicating a perfect positive linear relationship, -1 indicating a perfect negative linear relationship, and 0 indicating no linear relationship. It is used to assess both the strength and direction of the linear relationship between pairs of variables. It is sensitive to outliers.
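As a rough illustration of these properties, the sketch below computes r in plain Python as the sum of the products of deviations divided by the square root of the product of the summed squared deviations. The function name and the data are hypothetical; the example only aims to show that r is unaffected by a change of units and that a single outlier can change it a lot.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviations around the two arithmetic means."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, invented for this example.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 12]          # exactly y = 2x
y_falling = list(reversed(y))     # exactly y = 14 - 2x

r_up = pearson_r(x, y)            # perfect positive linear relationship
r_down = pearson_r(x, y_falling)  # perfect negative linear relationship

# Unitless: rescaling y (say, metres to centimetres) leaves r unchanged.
r_rescaled = pearson_r(x, [100 * v for v in y])

# Sensitive to outliers: replace only the last y value with 0.
y_outlier = y[:-1] + [0]
r_outlier = pearson_r(x, y_outlier)

print(r_up, r_down, r_rescaled)
print(round(r_outlier, 3))
```

In this made-up data, changing one point out of six is enough to pull r from a perfect positive relationship down to a weak one, which is why it helps to look at the scatter plot before trusting the coefficient.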












©2020 by rampkart
