Monday – September 25, 2023.

In today’s analysis, I refined my code and resolved all of the code issues I had run into previously. I had already developed a linear regression model; today I evaluated it comprehensively across three distinct scenarios (Models A, B, and C below), computing the following key statistical metrics for each:

  1. Calculation of p-values to assess the significance of individual predictors.
  2. Estimation of confidence intervals to gauge the precision of regression coefficient estimates.
  3. Computation of the coefficient of determination (R-squared) to measure the explained variance in the dependent variable.
  4. Execution of cross-validation procedures to assess model performance and generalizability.
  5. Investigation of collinearity among predictor variables to identify potential multicollinearity issues.
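As a rough illustration of items 3 and 4 above, here is a minimal standard-library sketch of fitting a two-predictor linear model, scoring it with R-squared, and averaging R-squared over k folds. It is not the notebook's code: the function names (`fit_ols`, `cross_val_r2`) and the synthetic data are assumptions made for the example, and a real analysis would more likely rely on a statistics library.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(X, y):
    """Return [intercept, b1, b2, ...] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(r) for r in X]  # prepend the constant column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

def predict(beta, X):
    """Apply the fitted intercept and slopes to each row of X."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def cross_val_r2(X, y, k=5, seed=0):
    """Mean out-of-fold R-squared over k shuffled folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        beta = fit_ols([X[i] for i in train], [y[i] for i in train])
        y_hat = predict(beta, [X[i] for i in fold])
        scores.append(r_squared([y[i] for i in fold], y_hat))
    return sum(scores) / k

# Synthetic data only (NOT the data behind Models A, B, and C).
rng = random.Random(42)
X = [(rng.uniform(5, 12), rng.uniform(25, 40)) for _ in range(200)]
y = [1.0 + 0.9 * d + 0.4 * o + rng.gauss(0, 1.5) for d, o in X]

beta = fit_ols(X, y)               # in-sample fit on all rows
mean_r2 = cross_val_r2(X, y, k=5)  # cross-validated mean R-squared
print(beta, mean_r2)
```

The cross-validated score is computed only on rows the model did not see during fitting, which is why it can be noticeably lower than the in-sample R-squared of the same model.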

# Final Analysis of Linear Regression Model

Let’s compare the three models (A, B, and C) based on various statistics and provide a detailed analysis:

Model A:

VIF values:
const: 325.88
% Diabetes: 1.18
% Obesity: 1.18
Mean R-squared: 0.125
Intercept: -0.158
Coefficients for % Diabetes and % Obesity: 0.957 and 0.445, respectively
Confidence Intervals for coefficients (95%):
% Diabetes: [0.769, 1.145]
% Obesity: [0.312, 0.578]
F-statistic: 115.2
Prob (F-statistic): 3.51e-39

Model B:

VIF values:
const: 318.05
% Inactivity: 1.29
% Obesity: 1.29
Mean R-squared: 0.155
Intercept: 1.654
Coefficients for % Inactivity and % Obesity: 0.232 and 0.111, respectively
Confidence Intervals for coefficients (95%):
% Inactivity: [0.187, 0.278]
% Obesity: [0.043, 0.180]
F-statistic: 90.71
Prob (F-statistic): 1.76e-32

Model C:

VIF values:
const: 120.67
% Inactivity: 1.47
% Diabetes: 1.47
Mean R-squared: 0.093
Intercept: 12.794
Coefficients for % Inactivity and % Diabetes: 0.247 and 0.254, respectively
Confidence Intervals for coefficients (95%):
% Inactivity: [0.173, 0.321]
% Diabetes: [0.097, 0.410]
F-statistic: 57.04
Prob (F-statistic): 3.54e-22

Analysis and Comparison:

VIF Values:

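The VIF for the constant (const) is very high in all three models (325.88 in Model A, 318.05 in Model B, and 120.67 in Model C), but this value does not measure collinearity among the predictors. It typically reflects predictors whose means are far from zero relative to their spread.
The predictor VIFs are the ones that matter for multicollinearity, and they are low in every model: 1.18 in Model A, 1.29 in Model B, and 1.47 in Model C.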
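To see why a constant term can have a large VIF without any collinearity between predictors, consider a simplified sketch. It assumes the VIFs came from a routine that treats the constant as an ordinary column (as the `const` label suggests, for example statsmodels' `variance_inflation_factor` applied to a design matrix that includes the constant). With a single predictor x, regressing a column of ones on x without an intercept gives VIF = 1 + mean(x)^2 / var(x). The models here have two predictors, so this one-predictor case is only an illustration, and the numbers below are hypothetical.

```python
from statistics import fmean, pvariance

def constant_vif_one_predictor(x):
    """VIF of the intercept column in a model with one predictor x.

    Equals 1 / (1 - uncentered R^2) from regressing a column of ones
    on x with no intercept, which simplifies to 1 + mean^2 / variance.
    """
    return 1.0 + fmean(x) ** 2 / pvariance(x)

# Hypothetical percentages clustered far from zero (not the real data).
raw = [30.1, 31.4, 29.8, 32.0, 30.7, 31.1, 29.5, 30.9]
centered = [v - fmean(raw) for v in raw]

vif_raw = constant_vif_one_predictor(raw)            # large: mean >> spread
vif_centered = constant_vif_one_predictor(centered)  # about 1 after centering
print(vif_raw, vif_centered)
```

Centering the predictor leaves its slope unchanged but brings the constant's VIF down to about 1, which is why the constant's VIF is usually ignored when screening for multicollinearity.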
R-squared:

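Model B has the highest mean R-squared (0.155), so it explains the most variation in its dependent variable. That variable cannot be % Inactivity, because % Inactivity is one of Model B’s predictors; each model appears to use the two listed variables to predict the third.
Model A falls in between with a mean R-squared of 0.125.
Model C has the lowest mean R-squared (0.093).
Because the models appear to predict different dependent variables, these values are not a like-for-like ranking. All three are low: no model explains more than about 16% of the variance.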
Intercept and Coefficients:

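The intercepts differ substantially between models: Model A’s is close to zero (-0.158), Model B’s is 1.654, and Model C’s is much larger (12.794).
The coefficients also vary between models, and their interpretation depends on the predictors and the dependent variable in each model. In Model A, for example, a one-unit increase in % Diabetes is associated with a 0.957-unit increase in the dependent variable, holding % Obesity constant.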
Confidence Intervals:

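A 95% confidence interval that excludes zero indicates that the coefficient is statistically significant at the 5% level. Every reported interval in Models A, B, and C excludes zero, so all six predictor coefficients are statistically significant. The intervals whose lower bounds come closest to zero are % Obesity in Model B ([0.043, 0.180]) and % Diabetes in Model C ([0.097, 0.410]).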
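The link between a confidence interval and significance can be made concrete by backing out an approximate standard error and test statistic from each reported interval. The sketch below assumes a large-sample normal approximation (the sample size is not stated here, so the exact t critical value is unknown); the helper name `z_from_ci` is made up for this example.

```python
from math import erfc, sqrt
from statistics import NormalDist

def z_from_ci(coef, lower, upper, level=0.95):
    """Approximate standard error, z statistic, and two-sided p-value
    implied by a symmetric confidence interval (normal approximation)."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    se = (upper - lower) / (2 * z_crit)
    z = coef / se
    p = erfc(abs(z) / sqrt(2))  # two-sided tail probability
    return se, z, p

# Coefficients and 95% intervals as reported above.
reported = {
    ("A", "% Diabetes"): (0.957, 0.769, 1.145),
    ("A", "% Obesity"): (0.445, 0.312, 0.578),
    ("B", "% Inactivity"): (0.232, 0.187, 0.278),
    ("B", "% Obesity"): (0.111, 0.043, 0.180),
    ("C", "% Inactivity"): (0.247, 0.173, 0.321),
    ("C", "% Diabetes"): (0.254, 0.097, 0.410),
}

results = {key: z_from_ci(*vals) for key, vals in reported.items()}
for (model, name), (se, z, p) in results.items():
    print(f"Model {model} {name}: se={se:.3f}, z={z:.2f}, p={p:.2e}")
```

These are approximations from rounded figures; the p-values in the regression summary itself should be preferred where available.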
F-statistic:

Model A has the highest F-statistic (115.2), indicating strong overall model significance.
Model B has a lower F-statistic (90.71), but it is still highly significant.
Model C has the lowest F-statistic (57.04), which is also statistically significant but relatively lower than the other models.
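The F-statistic ranking need not match the mean R-squared ranking, because the overall F-statistic is computed from the in-sample R-squared together with the sample size and number of predictors, whereas a cross-validated mean R-squared is an out-of-sample score. A small sketch of the standard formula follows; the inputs in the example call are hypothetical, since the sample size and in-sample R-squared are not listed above.

```python
def f_statistic(r2, n_obs, n_predictors):
    """Overall F-statistic of an OLS model with an intercept:
    F = (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    return (r2 / df_model) / ((1.0 - r2) / df_resid)

# Hypothetical inputs for illustration only.
example_f = f_statistic(r2=0.5, n_obs=13, n_predictors=2)
print(example_f)
```

For a fixed R-squared, F grows roughly in proportion to the number of observations, so even a model with a low R-squared can be highly significant on a large sample.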
Multicollinearity:

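None of the models shows problematic multicollinearity among its predictors: the predictor VIFs range from 1.18 to 1.47, well below the commonly used thresholds of 5 or 10. The high VIF values for the constant term are not evidence of collinearity between predictors.
Model C (% Inactivity and % Diabetes) has the highest predictor VIF at 1.47, followed by Model B (1.29) and Model A (1.18), so the predictors overlap only mildly in every case.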
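With exactly two predictors and an intercept, each predictor's VIF equals 1 / (1 - r^2), where r is the correlation between the two predictors. This is why both predictors in each model report the same VIF. As a quick check, the reported VIFs can be inverted to recover the implied absolute correlation; the function names below are made up for this sketch.

```python
from math import sqrt

def vif_two_predictors(r):
    """VIF shared by both predictors when their correlation is r."""
    return 1.0 / (1.0 - r ** 2)

def abs_corr_from_vif(vif):
    """Absolute correlation between two predictors implied by their VIF."""
    return sqrt(1.0 - 1.0 / vif)

# Predictor VIFs as reported for each model.
reported_vif = {"A": 1.18, "B": 1.29, "C": 1.47}
implied_r = {model: abs_corr_from_vif(v) for model, v in reported_vif.items()}
for model, r in implied_r.items():
    print(f"Model {model}: |r| is about {r:.2f}")
```

The implied correlations are moderate at most, which is consistent with predictor VIFs that sit far below the usual warning thresholds.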
Relationships:

Model B performs best in terms of mean R-squared (0.155), while Model A has the highest F-statistic (115.2).
All three models are highly significant overall, and none has a predictor VIF high enough to suggest a multicollinearity problem.
Model C has the lowest mean R-squared (0.093) and the lowest F-statistic (57.04).

Project 1 - Progress report - Jupyter Notebook
