Linear regression is one of the most basic statistical methods for identifying the relationship between two or more variables. It examines how well an independent variable predicts changes in a dependent variable. It is common practice in sports analytics to use it to determine how performance metrics are related to outcomes.
The formula Y = β1X + C tells how the dependent variable (Y) changes as the predictor variable (X) changes.
Y is the dependent variable a.k.a. criterion a.k.a. response variable
β1 is the regression coefficient, which is the slope of the regression line
X is the independent variable, a.k.a. predictor, a.k.a. explanatory variable
C is the constant value, a.k.a. intercept, which is the value of Y when X is zero
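To make the slope and the constant concrete, here is a minimal sketch of fitting a line of the form Y = β1X + C by least squares. The data (shots per game against goals per game) and the function name fit_simple_ols are made up for illustration and do not come from a real dataset or library.

```python
# Hypothetical data, invented for illustration only:
# x = shots per game, y = goals per game for five teams.
x = [10.0, 12.0, 14.0, 16.0, 18.0]
y = [1.0, 1.4, 1.5, 2.1, 2.0]


def fit_simple_ols(x, y):
    """Return (slope, intercept) of the least squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations (proportional to the covariance of x and y).
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Sum of squared deviations of x (proportional to the variance of x).
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


beta1, c = fit_simple_ols(x, y)
print(f"slope (beta1) = {beta1:.3f}, constant (C) = {c:.3f}")
```

The slope reads as "expected change in Y for a one-unit increase in X", and the constant is where the fitted line crosses X = 0.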
Multiple variable regression
Multivariable linear regression can be used to evaluate several ranking metrics at once. Each predictor's coefficient is estimated while controlling for the effect of the other metrics, and its statistical significance in the overall model can then be evaluated.
The formula Y = β1X1 + β2X2 + ... + βnXn + C tells how the dependent variable (Y) changes as the predictor variables (X1), (X2), ..., (Xn) change.
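As a rough sketch of the multiple-predictor case, the example below fits Y = β1X1 + β2X2 + C with NumPy's least squares solver. The numbers are synthetic and built to follow Y = 2·X1 − X2 + 3 exactly, so the fit should recover those values. The variable names are illustrative assumptions.

```python
import numpy as np

# Synthetic predictors (two hypothetical metrics), invented for illustration.
X = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 6.0],
])
# Outcome constructed to follow y = 2*x1 - 1*x2 + 3 with no noise.
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 3.0

# Append a column of ones so the solver also estimates the constant C.
design = np.column_stack([X, np.ones(len(X))])

# Least squares solution of design @ coef ≈ y.
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
betas, const = coef[:-1], coef[-1]

print("coefficients:", betas)
print("constant:", const)
```

Each coefficient is the expected change in Y for a one-unit change in that predictor while the other predictor is held fixed, which is what "controlling for the other metrics" means in practice.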
What happens if the independent variables have linear relationships with each other?
This is called multicollinearity. When one predictor is an exact linear combination of the others, the least squares equations have no unique solution and the regression analysis breaks down. When the predictors are only strongly correlated, the least squares estimates are still unbiased, but their variances are large, so the estimated coefficients may be far from the true values. So, determine whether multicollinearity exists before doing regression analysis. Where multicollinearity is detected, ridge regression comes in. It is a technique that adds a degree of bias to the regression estimates in exchange for reduced standard errors.
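One common way to check for multicollinearity is the variance inflation factor (VIF), and ridge regression has a simple closed form. The sketch below shows both on synthetic data in which the second predictor is nearly a copy of the first. The function names vif and ridge_fit, the penalty value 1.0, and the data are assumptions made for illustration. In practice the predictors are usually standardized before ridge is applied, which this sketch skips to stay short.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of X (predictors only, no ones column)."""
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        # Regress predictor j on all the other predictors plus a constant.
        others = np.column_stack([np.delete(X, j, axis=1), np.ones(n)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


def ridge_fit(X, y, lam):
    """Closed-form ridge fit; returns (coefficients, constant). lam=0 gives ordinary least squares."""
    # Center the data so the constant term is not penalized.
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    penalty = lam * np.eye(X.shape[1])
    beta = np.linalg.solve(Xc.T @ Xc + penalty, Xc.T @ yc)
    return beta, y_mean - x_mean @ beta


# Synthetic data: x2 is almost identical to x1, x3 is unrelated to both.
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + 0.05 * rng.normal(size=50)
x3 = rng.normal(size=50)
X = np.column_stack([x1, x2, x3])
y = x1 + x2 + 0.5 * x3 + rng.normal(scale=0.1, size=50)

vifs = vif(X)
beta_ols, c_ols = ridge_fit(X, y, lam=0.0)
beta_ridge, c_ridge = ridge_fit(X, y, lam=1.0)

print("VIF per predictor:", vifs)
print("least squares coefficients:", beta_ols)
print("ridge coefficients:", beta_ridge)
```

A frequently quoted rule of thumb treats a VIF above about 5 or 10 as a sign of problematic multicollinearity, though the cutoff is a convention rather than a hard rule. The ridge penalty shrinks the coefficients toward zero, which is the bias the technique accepts in return for more stable estimates.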