Are an individual’s brain size and body size (height, weight, etc.) predictive of his or her intelligence?
To answer this research question, we first need to identify the predictor and response variables.
• Response variable (y): Performance IQ Score (PIQ) of an individual
• Potential Predictor Variable (x1): Brain Size obtained from MRI scan provided by doctor
• Potential Predictor Variable (x2): Height in inches
• Potential Predictor Variable (x3): Weight in kgs
A common way of investigating the relationships among all of the variables is a "scatter plot matrix," which contains a scatter plot of each pair of variables arranged in a grid. Here is what the scatter plot matrix looks like for our brain and body size case study:
Image Courtesy: onlinecourses.science.psu.edu
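For readers who want to draw such a matrix themselves, here is a minimal sketch using pandas and matplotlib. The data below are synthetic stand-ins generated for illustration, not the case-study measurements, and the column names, coefficients, and output file name are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to show the plot on screen
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Synthetic stand-in data (NOT the real measurements); replace with your own.
rng = np.random.default_rng(0)
n = 40
height = rng.normal(68, 4, n)                      # inches
weight = 2.0 * height - 70 + rng.normal(0, 8, n)   # kg, loosely tied to height
brain = rng.normal(90, 7, n)                       # arbitrary MRI-based units
piq = 50 + 1.0 * brain - 0.5 * height + rng.normal(0, 15, n)

df = pd.DataFrame({"PIQ": piq, "Brain": brain, "Height": height, "Weight": weight})

# One scatter plot per pair of variables, with histograms on the diagonal.
axes = scatter_matrix(df, figsize=(8, 8), diagonal="hist", alpha=0.7)
plt.savefig("scatter_matrix.png")
```

Each off-diagonal panel plots one variable against another, so the first row shows how PIQ varies with each potential predictor.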
Below is the multiple linear regression model with three quantitative predictors (brain size, height and weight):
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \varepsilon_i$$
where the independent error terms $\varepsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.
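To make the model concrete, the sketch below simulates data from a model of the form y = β0 + β1·x1 + β2·x2 + β3·x3 + ε and estimates the coefficients by ordinary least squares with NumPy. The "true" coefficients, sample size, and error standard deviation are made-up values chosen for illustration, not estimates from the case study.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Synthetic predictors (illustrative values only).
brain = rng.normal(90, 7, n)
height = rng.normal(68, 4, n)
weight = rng.normal(68, 10, n)

# Design matrix with a leading column of ones for the intercept beta_0.
X = np.column_stack([np.ones(n), brain, height, weight])

# Made-up "true" parameters; errors are independent N(0, sigma^2) with sigma = 5.
true_beta = np.array([50.0, 1.0, -0.5, 0.0])
y = X @ true_beta + rng.normal(0, 5, n)

# Least-squares estimates of (beta_0, beta_1, beta_2, beta_3).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residuals and the mean squared error, which estimates sigma^2.
residuals = y - X @ beta_hat
mse = residuals @ residuals / (n - X.shape[1])

print("estimated coefficients:", beta_hat)
print("estimated error variance:", mse)
```

Because the data are simulated, the estimates can be compared against the known parameters, which is not possible with real data.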
The multiple regression model formulated above helps answer the following research questions:
• Which predictors (if any) among brain size, height, and weight explain some of the variation in intelligence scores? To answer this, conduct hypothesis tests of whether each slope parameter is 0.
• What is the effect of brain size on PIQ, after taking into account height and weight? To answer this, calculate and interpret a confidence interval for the brain size slope parameter.
• What is the PIQ of an individual with a given brain size, height, and weight? To answer this, calculate and interpret a prediction interval for the response.
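The three questions above map onto three standard formulas, sketched here for reference. The notation is an assumption introduced for this summary: b_k is the least-squares estimate of the slope β_k, n is the sample size, p = 4 is the number of regression parameters (intercept plus three slopes), X is the design matrix, and MSE is the mean squared error of the fitted model.

```latex
% Hypothesis test of H_0: \beta_k = 0 versus H_A: \beta_k \neq 0
\[
t^* = \frac{b_k}{\operatorname{se}(b_k)}, \qquad t^* \sim t_{n-p} \ \text{under } H_0
\]

% 100(1-\alpha)% confidence interval for the slope \beta_k
\[
b_k \pm t_{1-\alpha/2,\, n-p} \, \operatorname{se}(b_k)
\]

% 100(1-\alpha)% prediction interval for a new response at
% x_h = (1, x_{h1}, x_{h2}, x_{h3})^\top
\[
\hat{y}_h \pm t_{1-\alpha/2,\, n-p}
\sqrt{\mathrm{MSE}\left(1 + x_h^\top (X^\top X)^{-1} x_h\right)}
\]
```

The extra "1 +" under the square root is what makes a prediction interval for an individual wider than a confidence interval for the mean response at the same predictor values.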
However, one point to check is whether the individual predictor variables (brain size, height, weight, etc.) are correlated with each other.
• If they are, this creates a problem for linear regression known as multicollinearity: the standard errors of the slope estimates become inflated and the estimates unstable. It is not a mistake in model specification; it arises from the nature of the data at hand.
• A simple test for detecting multicollinearity is to run artificial (auxiliary) regressions of each independent variable, treated as the “dependent” variable, on the remaining independent variables. A high R² in any of these regressions indicates that the variable is largely explained by the other predictors.
• A high R², a highly significant F-test, but few or no statistically significant t-tests in the main regression are a symptom of multicollinearity in the model.
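The auxiliary-regression check described above can be sketched in a few lines of NumPy. The data are synthetic, with weight deliberately constructed to depend strongly on height so that the check has something to detect; the function names are hypothetical helpers. The variance inflation factor, VIF = 1 / (1 − R²), is a standard summary of the auxiliary R² that the text above does not mention and is included here as an addition.

```python
import numpy as np

def r_squared(y, X):
    """R^2 from regressing y on the columns of X (intercept added here)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - resid @ resid / sst

def auxiliary_r2(X):
    """Regress each predictor on all the others; return one R^2 per predictor."""
    k = X.shape[1]
    return np.array([r_squared(X[:, j], np.delete(X, j, axis=1)) for j in range(k)])

# Synthetic predictors (illustrative only): weight is built from height.
rng = np.random.default_rng(2)
n = 200
brain = rng.normal(90, 7, n)
height = rng.normal(68, 4, n)
weight = 2.0 * height - 70 + rng.normal(0, 3, n)

X = np.column_stack([brain, height, weight])
r2 = auxiliary_r2(X)
vif = 1.0 / (1.0 - r2)  # variance inflation factor for each predictor

for name, r, v in zip(["brain", "height", "weight"], r2, vif):
    print(f"{name}: auxiliary R^2 = {r:.3f}, VIF = {v:.2f}")
```

In this constructed example, height and weight should show high auxiliary R² values while brain size should not. A commonly quoted rule of thumb treats VIF values above roughly 5 to 10 as a warning sign, though the threshold is a convention rather than a formal test.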