Hi - Most of the U.S. datasets being used for COVID-19 research (other than SafeGraph) provide data only down to the county level. Are any of you accessing or creating data sets at the sub-county level (census tract, zip code), whether it be COVID-19 case data or other data on demographics or other contextual features?
Hi @Lorene_Nelson_Stanford - our focus is on supporting Philadelphia community-based organizations, so we have been keeping a history of the daily snapshots made available by the City of Philadelphia. We are developing community-level risk metrics based on zip code level testing results and cbg/poi-level mobility. The daily data can be found here: GitHub - ambientpointcorp/covid19-philadelphia: De-identified, aggregate datasets showing COVID-19 cases, hospitalizations, deaths and vaccinations by date, zip, or age/sex/race as made available by the City of Philadelphia through its Open Data Program.
We have daily automated jobs pulling the data via OpenDataPhilly’s APIs. It’s easy to do for one or a few cities, maybe more challenging to scale given the variety of interfaces, aggregation levels and formats across different counties.
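For anyone wanting to keep a similar history of daily snapshots, here is a minimal sketch of what such a job might look like. It is not the actual Ambient Point pipeline: the function names, the directory layout, and the URL in the usage note are hypothetical, and it assumes the portal exposes a plain CSV download endpoint.

```python
import hashlib
import urllib.request
from datetime import date
from pathlib import Path
from typing import Optional


def fetch(url: str) -> bytes:
    """Download the raw bytes of one dataset export (URL supplied by the caller)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def save_snapshot(payload: bytes, out_dir: Path, name: str,
                  day: Optional[date] = None) -> Optional[Path]:
    """Store payload as out_dir/<name>/<YYYY-MM-DD>.csv.

    Returns the written path, or None when the payload is byte-identical
    to the most recent stored snapshot (the portal has not updated yet).
    """
    day = day or date.today()
    dataset_dir = out_dir / name
    dataset_dir.mkdir(parents=True, exist_ok=True)

    # ISO dates sort lexicographically, so the last file is the newest one.
    existing = sorted(dataset_dir.glob("*.csv"))
    if existing:
        latest_hash = hashlib.sha256(existing[-1].read_bytes()).hexdigest()
        if latest_hash == hashlib.sha256(payload).hexdigest():
            return None

    target = dataset_dir / f"{day.isoformat()}.csv"
    target.write_bytes(payload)
    return target
```

A scheduler (cron, GitHub Actions, etc.) would then call something like `save_snapshot(fetch("https://example.org/cases_by_zip.csv"), Path("data"), "cases_by_zip")` once a day, where the URL is a placeholder. Keeping one folder per dataset and one file per date is only one possible layout; the scaling problem is that every county needs its own `fetch` and its own normalization step.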
Hi @Lorene_Nelson_Stanford, I’ve been working with @Derek_Ouyang_Stanford and others on exploring the relationship between case growth in the Bay Area and SafeGraph metrics for movement. We’ve recently been focusing in particular on Alameda County, because they have daily case data at a zip code level. We’ve been using this data to create visualizations and analyses like the dashboard you can see [here](https://stanfordfuturebay.shinyapps.io/cases_visits_dashboard/). We’ve also been looking at connections between zip code averages of demographic variables and case counts with simple linear regression models; some examples are [here](https://stanfordfuturebay.github.io/simone_cases_social_distancing_07.html#demographic-correlations).
The LA Times’s database of cases by place also includes zip code-level case data for San Francisco County over time, though we have noted some issues with data on some of the dates.
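As a rough illustration of the kind of simple linear regression mentioned above (one zip-code-level demographic average against a case rate), here is a minimal ordinary least squares sketch. It is not the analysis behind the linked page: the function name, the variable names, and the numbers in the usage example are placeholders made up for illustration, not real data.

```python
from statistics import mean
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least squares fit of y = intercept + slope * x.

    Each (x[i], y[i]) pair is one zip code. Returns (slope, intercept, r_squared).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")

    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance, slope is undefined")

    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx

    # R^2 = 1 - residual sum of squares / total sum of squares.
    ss_tot = sum((yi - my) ** 2 for yi in y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, intercept, r_squared


# Placeholder values only (NOT real data): one entry per hypothetical zip code.
household_size = [2.1, 2.6, 3.0, 3.4, 3.9]
cases_per_1000 = [4.0, 6.5, 7.0, 9.5, 11.0]
slope, intercept, r2 = fit_line(household_size, cases_per_1000)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r2:.2f}")
```

With only a few dozen zip codes per county, a single-variable fit like this is best read as a descriptive correlation rather than evidence of a causal effect.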
Lori, you should connect with @Jason_Lally_City_and_County_of_San_Francisco, Chief Data Officer. He shared this with me Thursday:
"Attached is the policy document from DPH. I haven’t added to the portal yet, but will so under each dataset where there’s a privacy rule applied we’ll point at the document. As reference, under HIPAA there’s 2 methods of acceptable de-identification, safe harbor and expert determination: https://www.hhs.gov/hipaa/for-professionals/privacy/special-topics/de-identification/index.html#standard
In this case we’ve followed the Expert Determination process to clear data for release. The expert at DPH in this case is an epidemiologist with peer review internally by others. It is a bit more of a process and compliance headache than applying the safe harbor rules, but in this case worth the effort as far as I’m concerned. Especially because equity and disparate impacts are an important lens on understanding the virus."
@Derek_Ouyang_Stanford Hi Derek! Thank you. Yes, I do remember connecting with you previously in 2017 re: PHS pilot grants. Thanks for your helpful replies. I will respond next to both you and Simone giving some background on a new public health data platform Google Cloud is helping us build.
@Derek_Ouyang_Stanford @Simone_Speizer_Stanford It is so nice to hear from both of you. You are doing exciting work that could really benefit work I am doing with support from Google Cloud, and I am hoping that our work can also benefit others within Stanford, including your group. My team in the new Dept of Epidemiology and Population Health has received a major donation of senior Google Cloud engineering effort to build a data ecosystem and dashboard viewer on the NERO HIPAA-compliant system, containing very detailed contextual data on demographics, health characteristics, environmental factors, etc. (down to census tract/group or zip code level, depending on data source) for all 58 counties in California. It is premised on the idea that the front line of managing the pandemic is at the sub-county level, yet most data sources developed to date in the US stop at the county level. It will serve as a resource for public health departments (who are terribly under-resourced for accessing/visualizing their own data), but also for Stanford researchers to develop prediction models for SARS-CoV-2 infection and other Covid-related adverse outcomes, both direct effects of SARS-CoV-2 infection and secondary effects even among those not infected. We will be connecting with each county to determine how we can include their COVID-19 data, whether it be in a view that only they can access, or, if they are open to collaboration, we can arrange to ingest it and use access controls to give approved researchers access. We have one county already where we have a BAA and are able to receive individual identified data, including home address, which will be great for lat/lon localization. Would you be interested in jumping on a Zoom with me and my team, which includes David Rehkopf, the Co-Director of the Stanford Center for Population Health Sciences, my postdoc epidemiologist, Hoda Abdel Magid, who is a spatial epidemiology expert, and others?
I would like to also invite David Grusky who I know a little bit, as well as @Serina_Stanford if she is interested.
Thanks so much @Ariel_Rodriguez_Ambient_Point_Corp! We are trying to set up a data platform for all 58 counties in California, and you are right about the challenges to scale given the variety of interfaces, aggregation levels and formats across counties. I’d love to hear more about how you are supporting community-based organizations. Do you have a document or a link?
@Lorene_Nelson_Stanford that sounds great, yes! Also flagging this for @Cansu_Stanford
@Simone_Speizer_Stanford Thanks! @Cansu_Stanford Let me know if you are interested in joining a Zoom to discuss the COVID-19 public health platform we are building.
Hey @Lorene_Nelson_Stanford, thanks @Simone_Speizer_Stanford for pinging me on this. I’d love to join! I’ve been meaning to reach out to Hoda Abdel Magid so this would be great!
Wonderful, Cansu - I will definitely invite you to our Zoom. Look forward to connecting.