Today I updated my code and built linear regression models for all three predictor/response combinations:
- Inactivity and Obesity predicting Diabetes.
- Inactivity and Diabetes predicting Obesity.
- Obesity and Diabetes predicting Inactivity.
However, I am currently stuck on calculating confidence intervals and p-values for these linear regression models. I have been troubleshooting the problem but have not yet found a solution. My goal is to fix this part of the code and then proceed with the analysis.
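One way to get p-values and confidence intervals is to compute them directly from the least-squares fit using the coefficient standard errors and the t-distribution. The block below is only a minimal sketch under stated assumptions, not the notebook's actual code: the data is synthetic, and the function name `ols_inference` and the variable names `inactivity`, `obesity`, and `diabetes` are hypothetical stand-ins for the real columns. It assumes NumPy and SciPy are available.

```python
import numpy as np
from scipy import stats


def ols_inference(X, y, alpha=0.05):
    """Fit y = b0 + X @ b by ordinary least squares and return
    coefficients, standard errors, t statistics, two-sided p-values,
    and (1 - alpha) confidence intervals."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])      # design matrix with intercept
    k = A.shape[1]                            # number of fitted coefficients
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = n - k                               # residual degrees of freedom
    sigma2 = resid @ resid / dof              # unbiased residual variance
    cov = sigma2 * np.linalg.inv(A.T @ A)     # covariance of the coefficients
    se = np.sqrt(np.diag(cov))
    t_stat = beta / se
    p = 2 * stats.t.sf(np.abs(t_stat), dof)   # two-sided p-values
    t_crit = stats.t.ppf(1 - alpha / 2, dof)
    ci = np.column_stack([beta - t_crit * se, beta + t_crit * se])
    return {"coef": beta, "se": se, "t": t_stat, "p": p, "ci": ci}


# Synthetic data for illustration only (NOT the project's dataset).
rng = np.random.default_rng(0)
n = 200
inactivity = rng.normal(15, 2, n)
obesity = rng.normal(18, 1.5, n)
diabetes = 1.0 + 0.25 * inactivity + 0.15 * obesity + rng.normal(0, 0.5, n)

result = ols_inference(np.column_stack([inactivity, obesity]), diabetes)
names = ["intercept", "inactivity", "obesity"]
for i, name in enumerate(names):
    lo, hi = result["ci"][i]
    print(f"{name:>10}: coef={result['coef'][i]:.3f}  "
          f"p={result['p'][i]:.3g}  95% CI=[{lo:.3f}, {hi:.3f}]")
```

The other two models follow by swapping which column plays the role of `y`. If statsmodels is installed, `sm.OLS(y, sm.add_constant(X)).fit()` exposes the same quantities through `.pvalues` and `.conf_int()`, which may be simpler than doing the algebra by hand.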
For the analysis of my linear regression models, I plan to follow these steps:
- Calculating the p-values: Resolve the issue with calculating p-values to determine the significance of each coefficient in the models.
- Calculating confidence intervals: Once the p-values are successfully calculated, estimate confidence intervals for the coefficients to understand the range of potential values.
- Evaluating goodness of fit: Use metrics like R-squared to measure how well the models explain the variation in the dependent variable.
- Performing cross-validation: Implement cross-validation techniques to assess the models’ generalization performance and identify potential overfitting.
- Finding collinearity: Detect and handle multicollinearity among independent variables to ensure the models’ stability and interpretability.
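The last three steps above (R-squared, cross-validation, and a collinearity check) can be sketched with NumPy alone. This is an illustrative example under assumptions, not the project's code: the data is synthetic, the helper names (`fit_ols`, `predict`, `r_squared`, `kfold_r2`, `vif`) are hypothetical, and collinearity is measured with the variance inflation factor (VIF), which is one common choice among several.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients [intercept, slopes...] for y ~ X."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def predict(beta, X):
    return beta[0] + X @ beta[1:]


def r_squared(y, y_hat):
    """Share of the variance in y explained by the predictions."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def kfold_r2(X, y, k=5, seed=0):
    """R-squared on each held-out fold of a shuffled k-fold split."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        beta = fit_ols(X[train], y[train])
        scores.append(r_squared(y[test], predict(beta, X[test])))
    return np.array(scores)


def vif(X):
    """Variance inflation factor of each column: 1 / (1 - R^2), where
    R^2 comes from regressing that column on the remaining columns."""
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        beta = fit_ols(others, X[:, j])
        r2 = r_squared(X[:, j], predict(beta, others))
        out.append(1.0 / (1.0 - r2))
    return np.array(out)


# Synthetic data for illustration only (NOT the project's dataset).
# Obesity is deliberately built to correlate with inactivity.
rng = np.random.default_rng(1)
n = 200
inactivity = rng.normal(15, 2, n)
obesity = 10 + 0.5 * inactivity + rng.normal(0, 1, n)
diabetes = 1.0 + 0.25 * inactivity + 0.15 * obesity + rng.normal(0, 0.5, n)

X = np.column_stack([inactivity, obesity])
in_sample_r2 = r_squared(diabetes, predict(fit_ols(X, diabetes), X))
cv_scores = kfold_r2(X, diabetes, k=5)
vifs = vif(X)

print(f"In-sample R^2: {in_sample_r2:.3f}")
print(f"5-fold R^2:    mean={cv_scores.mean():.3f}, std={cv_scores.std():.3f}")
print(f"VIF (inactivity, obesity): {np.round(vifs, 2)}")
```

A cross-validated R-squared that falls well below the in-sample value would hint at overfitting. A VIF near 1 means a predictor is nearly uncorrelated with the others; thresholds of 5 or 10 are often quoted as rules of thumb for problematic collinearity. scikit-learn's `cross_val_score` and statsmodels' `variance_inflation_factor` offer library versions of the same ideas.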
I’m actively working on resolving the issue with p-values and confidence intervals, and on progressing with the analysis of these linear regression models.
Project 1 - Progress report - Jupyter Notebook