Friday – September 29, 2023.

Based on my preliminary analysis, I have concluded that time series modeling is not feasible on a dataset with only one year of data. Time series models need enough history to learn the underlying trends and recurring patterns. With a single year, the model sees at most one annual cycle, so it cannot separate seasonality from trend or noise, and its predictions would be unreliable.
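To make the data-length problem concrete, here is a minimal sketch of the check I have in mind. It assumes a common rule of thumb that at least two complete seasonal cycles are needed before seasonality can be estimated. The function name, the `min_cycles` default, and the monthly and weekly frequencies are illustrative assumptions, not properties of the project dataset.

```python
def can_estimate_seasonality(n_obs: int, period: int, min_cycles: int = 2) -> bool:
    """Return True if the series spans at least `min_cycles` full seasonal cycles.

    n_obs      -- number of observations in the series
    period     -- observations per seasonal cycle (12 for monthly data
                  with a yearly pattern, 52 for weekly data)
    min_cycles -- rule-of-thumb minimum number of cycles (assumed to be 2)
    """
    if n_obs < 0 or period <= 0 or min_cycles <= 0:
        raise ValueError("n_obs must be non-negative; period and min_cycles must be positive")
    return n_obs >= min_cycles * period


# One year of monthly data covers a single yearly cycle.
one_year_monthly = can_estimate_seasonality(n_obs=12, period=12)

# Two years of monthly data reaches the two-cycle threshold.
two_years_monthly = can_estimate_seasonality(n_obs=24, period=12)

# One year of weekly data has more points but still only one yearly cycle.
one_year_weekly = can_estimate_seasonality(n_obs=52, period=52)

print(one_year_monthly, two_years_monthly, one_year_weekly)
```

Under this rule, collecting more frequent observations within the same year does not help with a yearly pattern; only a longer history does.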

I also attempted to perform geospatial analysis on the dataset, as it contains county and state information. However, my code failed to execute because the dataset does not include a geometry column. County and state names are only labels. Geospatial analysis needs a geometry column that stores the spatial location of each data point, for example as a point or a polygon.
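One possible way around the missing geometry column is to attach coordinates from an external lookup keyed on state and county. The sketch below uses only the Python standard library and builds GeoJSON-style point features. The `CENTROIDS` table, the column names `state`, `county`, and `value`, and all coordinates are made-up placeholders for illustration, not values from the project dataset.

```python
import json

# Hypothetical lookup table: (state, county) -> (longitude, latitude).
# Real centroids would have to come from an external source.
CENTROIDS = {
    ("State X", "County A"): (-100.0, 40.0),
    ("State X", "County B"): (-101.5, 41.25),
}


def to_feature(row, centroids):
    """Turn one tabular row into a GeoJSON Feature, or None if no match."""
    key = (row["state"], row["county"])
    if key not in centroids:
        return None  # no geometry available for this county
    lon, lat = centroids[key]
    return {
        "type": "Feature",
        # GeoJSON stores coordinates as [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": dict(row),
    }


# Placeholder rows standing in for the dataset.
rows = [
    {"state": "State X", "county": "County A", "value": 1.0},
    {"state": "State X", "county": "County C", "value": 2.0},  # not in lookup
]

features = [f for f in (to_feature(r, CENTROIDS) for r in rows) if f is not None]
collection = {"type": "FeatureCollection", "features": features}

print(json.dumps(collection, indent=2))
```

Rows without a match in the lookup are dropped here. Whether to drop or flag them is a design choice that depends on how many counties go unmatched.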

Finally, I tried to use ensemble methods, such as random forests, to gain insights into feature importance and relationships between predictor variables and the outcome. However, with a dataset this small, ensemble models can overfit, so the feature importances they report may not be reliable.

Overall, I have made significant progress in exploring different modeling techniques for the dataset. However, I need to address the challenges described above before I can finalize the modeling techniques and start writing the first report draft.

Project 1 - Progress report - Jupyter Notebook
