Is it possible to use the NAICS codes to measure foot traffic in K-12 schools over the past year or so?

I suspect my question has been asked and answered, so apologies in advance if this has been covered elsewhere. Is it possible to use the NAICS codes to measure foot traffic in K-12 schools over the past year or so? In theory, does the data include every single school in the United States, or would there be significant missing data? If anyone has tried this, have you also tried using coordinates to match schools to other databases, such as NCES’s, to bring in other data (e.g., demographics, school funding)? I am trying to figure out the most efficient way to determine whether databases that identify school districts as doing in-person vs. remote learning are accurate. Thank you.
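A minimal sketch of the NAICS filter being asked about, assuming a places table loaded into pandas. The code 611110 is the NAICS code for Elementary and Secondary Schools; the column names (`placekey`, `location_name`, `naics_code`) and the sample rows are assumptions for illustration and are not taken from this thread.

```python
import pandas as pd

# Hypothetical sample of a places table; real column names may differ.
places = pd.DataFrame({
    "placekey": ["aaa", "bbb", "ccc"],
    "location_name": ["Lincoln High School", "Corner Cafe", "Roosevelt Elementary"],
    "naics_code": [611110, 722515, 611110],
})

K12_NAICS = 611110  # NAICS: Elementary and Secondary Schools


def filter_k12(df: pd.DataFrame) -> pd.DataFrame:
    """Return only the rows whose NAICS code marks a K-12 school."""
    return df[df["naics_code"] == K12_NAICS].reset_index(drop=True)


schools = filter_k12(places)
print(schools["location_name"].tolist())
```

A filter like this only catches schools that carry the expected code, so any school with a wrong or missing NAICS code would need a separate lookup by name or address.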

@Michael_Hartney_Boston_College I haven’t done it, but if you need some help doing it, I’m curious about the relationships. I think doing this next year, when NAEP scores are out, would be quite interesting.

I’m not sure if it would be possible to link the NAEP microdata to school-building-level data that measures how “open” schools were, but that would be really neat. I suspect states’ own assessment data will be less reliable this year since many are simply not going to do high-stakes testing…

@Michael_Hartney_Boston_College I guess the SafeGraph figures would have to be aggregated at the school district level?
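One way the district-level aggregation could look, as a sketch only. It assumes a per-school visit table and a school-to-district crosswalk already exist; the crosswalk, the `district_id` column, and all values here are hypothetical.

```python
import pandas as pd

# Hypothetical monthly visit counts per school.
visits = pd.DataFrame({
    "placekey": ["a", "b", "c"],
    "raw_visit_counts": [120, 80, 40],
})

# Hypothetical crosswalk from school to district (e.g., built from NCES data).
crosswalk = pd.DataFrame({
    "placekey": ["a", "b", "c"],
    "district_id": ["D1", "D1", "D2"],
})


def visits_by_district(visits: pd.DataFrame, crosswalk: pd.DataFrame) -> pd.Series:
    """Sum school-level visit counts within each district."""
    merged = visits.merge(crosswalk, on="placekey", how="inner")
    return merged.groupby("district_id")["raw_visit_counts"].sum()


district_totals = visits_by_district(visits, crosswalk)
print(district_totals)
```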

I have checked a few public schools in Washington State, and I noticed that some of them don’t have the right NAICS code (so you might need to look for the schools by name) or the right name (so you might need to look for schools by street address), and small schools might not have enough traffic to measure accurately. But big schools, and maybe those with older students who are more likely to have phones with apps, do seem to have enough traffic - the ones on hybrid schedules have about half the weekday traffic now compared to February 2020. But if you are just interested in quick high-level statistics, the data might be good enough even with some misclassification.
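The weekday comparison against February 2020 can be sketched like this. The daily counts are made up, and the code assumes the daily visits for one school have already been unpacked into a date-indexed pandas Series; none of this is the actual method used for the Washington schools above.

```python
import pandas as pd


def mean_weekday_visits(daily: pd.Series) -> float:
    """Average visits over Monday-Friday in a date-indexed series."""
    weekdays = daily[daily.index.dayofweek < 5]
    return float(weekdays.mean())


# Hypothetical daily counts for one school: two weeks in each period.
baseline = pd.Series(40, index=pd.date_range("2020-02-03", "2020-02-16"))
current = pd.Series(20, index=pd.date_range("2021-02-01", "2021-02-14"))

# Zero out weekends to mimic a school with no weekend traffic.
baseline[baseline.index.dayofweek >= 5] = 0
current[current.index.dayofweek >= 5] = 0

# Ratio of current weekday traffic to the pre-pandemic baseline.
ratio = mean_weekday_visits(current) / mean_weekday_visits(baseline)
print(round(ratio, 2))
```

Restricting to weekdays keeps weekend events (games, rentals) from distorting the comparison.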

@Michael_Hartney_Boston_College @Dennis_Chao_Institute_for_Disease_Modeling This sounds like the perfect use case for placekey. Michael, if you’d like to combine Patterns data with NCES data, in-person vs. remote data, and other data, I would strongly suggest you use placekey. If you are unfamiliar or have any questions, check out the #placekey channel or let me know!
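A rough sketch of what a placekey-based join might look like in pandas. It assumes both tables already carry a `placekey` column (for the NCES rows, that would mean generating placekeys from their addresses or coordinates beforehand); the school IDs, enrollments, and visit counts are hypothetical.

```python
import pandas as pd

# Hypothetical Patterns rows.
patterns = pd.DataFrame({
    "placekey": ["a", "b", "c"],
    "raw_visit_counts": [120, 80, 40],
})

# Hypothetical NCES rows that have already been assigned placekeys.
nces = pd.DataFrame({
    "placekey": ["a", "b", "d"],
    "school_id": ["S1", "S2", "S3"],
    "enrollment": [900, 450, 300],
})

# An outer join with indicator=True shows which schools appear in only one source.
joined = patterns.merge(nces, on="placekey", how="outer", indicator=True)
match_counts = joined["_merge"].value_counts()
print(match_counts)
```

The `_merge` column gives a quick read on how much of each dataset fails to match, which speaks to the missing-data question raised at the top of the thread.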

Also relevant: check out this thread about the age restrictions on any data SafeGraph receives. In short, elementary students would not show up in SafeGraph Patterns because the anonymized data is already limited to ages 13+ by the time SafeGraph receives it.

Thank you Ryan and Dennis. Those are helpful thoughts. I am just getting my feet wet with the SafeGraph data so I’m trying to get up to speed quickly. I figure that teachers/staff should be enough to pick up cell data to compare year over year to measure which schools are opening at full capacity in person (versus those that aren’t).

I will have to look into placekey - appreciate the tips

@Michael_Hartney_Boston_College You’re welcome. Here is a link to a placekey demonstration joining datasets on public schools with Esri that may be useful. There are several other videos at that link, including a general introduction to placekey.

@Michael_Hartney_Boston_College Not sure if teachers/staff alone are enough. I think SafeGraph has roughly 3-5% population coverage, so a school with a staff of 100 might show up as only 3-5 staff phones. Also, if a school is on an A/B schedule, it could have the same staff as a regular school but half the students. Finally, you can’t distinguish between teachers and students in the data.
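The back-of-the-envelope arithmetic above can be written out as follows. The 3-5% coverage figure is the rough estimate quoted in this thread, and treating each staff member as independently present in the panel with the same probability is a simplifying assumption of this sketch.

```python
def expected_devices(staff: int, coverage: float) -> float:
    """Expected number of staff phones in the panel."""
    return staff * coverage


def prob_no_devices(staff: int, coverage: float) -> float:
    """Chance that no staff phone is in the panel (independence assumed)."""
    return (1.0 - coverage) ** staff


for coverage in (0.03, 0.05):
    devices = expected_devices(100, coverage)
    p_zero = prob_no_devices(100, coverage)
    print(f"coverage={coverage:.0%}: expected={devices:.1f}, P(zero)={p_zero:.3f}")
```

With so few expected devices per school, month-to-month changes at a single school can easily be noise, which is one argument for pooling schools before comparing.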

@Ryan_Kruse_MN_State Thanks for the reminder about the restriction on kids under 13. That would explain why the elementary school traffic was so low.

These are really helpful thoughts everyone is sharing here. This would seem to suggest that high schools would be the one category where the data would “speak” clearly (so to speak). On the other hand, high schools are more likely to be remote based on several datasets I’ve seen. So, evidence of high schools being “in person” (lots of foot traffic) would most likely understate the degree to which elementary kids are in school too. Just a thought.

@Michael_Hartney_Boston_College Yes, I agree with your thinking. I still think some differences could show up in elementary schools, so they may be worth looking at too, but I would expect the differences to be less pronounced than those in high schools. I’m looking forward to your findings!