Identifying regional COVID-19 presence early with time series analysis

I did not use SafeGraph data in this paper, but as a member of this community I’d like to share my MSc capstone research, which was recently accepted for publication in IOP SciNotes. The paper was co-authored with my advisor, Suboh Alkhushayni, and is available here: https://doi.org/10.1088/2633-1357/aba739.
In the paper, titled Identifying Regional COVID-19 Presence Early with Time Series Analysis, we used CDC Influenza-like Illness Surveillance Network (ILINet) data to find evidence of COVID-19 presence in the US in late December 2019 and early January 2020. We used three methods of analysis. First, we forecast prediction intervals using data through mid-November 2019 and compared the predictions with observed values for the subsequent 16 weeks. Second, we performed residual hypothesis testing: we removed the trend and seasonality, then compared the residuals from before and after November 17, 2019. Third, we used changepoint analysis to identify major changes in trend and seasonality. The purpose of the study was not to single out specific states, but South Dakota showed the strongest evidence, followed by California, Delaware, Maine, and New Mexico.
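For anyone curious how the first method works in principle, here is a minimal sketch. It is not the forecasting model from the paper: it assumes a simple seasonal-mean baseline with a normal-approximation interval, and the data are synthetic. The function names (`seasonal_baseline`, `flag_high_weeks`), the 52-week period, and the 1.96 multiplier are all illustrative choices.

```python
import math
import statistics


def seasonal_baseline(history, period=52):
    """Average value at each week-of-season position across past seasons."""
    return [statistics.mean(history[i::period]) for i in range(period)]


def flag_high_weeks(history, observed, period=52, z=1.96):
    """Return indices of observed weeks above the upper prediction bound.

    Assumption: the interval is baseline +/- z * (stdev of historical
    residuals), a crude stand-in for a proper forecast interval.
    """
    baseline = seasonal_baseline(history, period)
    residuals = [y - baseline[i % period] for i, y in enumerate(history)]
    sigma = statistics.stdev(residuals)
    start = len(history)
    flagged = []
    for k, y in enumerate(observed):
        upper = baseline[(start + k) % period] + z * sigma
        if y > upper:
            flagged.append(k)
    return flagged


def season_value(t, period=52):
    """Synthetic seasonal signal (hypothetical, not ILINet data)."""
    return 2.0 + math.sin(2 * math.pi * t / period)


# Three synthetic "training" seasons with small deterministic noise.
history = [season_value(t) + 0.05 * ((7 * t) % 5 - 2) for t in range(156)]

# Sixteen follow-up weeks with an artificial bump in weeks 6 to 9.
observed = [
    season_value(156 + k) + (0.5 if 6 <= k <= 9 else 0.0) for k in range(16)
]

flagged = flag_high_weeks(history, observed)
print(flagged)  # prints [6, 7, 8, 9]
```

The idea is the same as in the paper at a high level: fit on data up to a cutoff, then check which of the following 16 weeks fall outside what the history would predict. A real analysis would use a proper time series model and account for trend as well as seasonality.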
Summarized conclusion: Combined with the knowledge that COVID-19 was spreading across other parts of the world, anomalous patterns in CDC ILINet data should have been a warning sign that COVID-19 was already spreading in the US as early as December 2019.
Feedback is welcome!