I am working with all placekeys in California and see the following distribution of closed_on values:
The spikes at 2020m1 and 2021m10 have been explained here:
I am trying to understand the very large spikes at 2022m3 and 2022m4 and the smaller spike at 2022m6.
Hi Angela, I also checked the docs and couldn’t find an obvious answer for this. I reached out to the SafeGraph teams and they’re looking into it. Hope to have an answer back in the next few days.
Hi Evan, thank you for looking into this and reaching out to the SafeGraph team! Looking forward to hearing back.
Hi @Angela_Ma_Harvard_University, thanks for your patience. The SafeGraph team looked into this for a while but couldn’t find a technical reason that would have caused the anomaly. Sorry we couldn’t find a concrete answer. There is a chance it’s representative of what was happening in the real world.
Hi @evan-barry-dewey , thank you & SafeGraph for looking into this! It’s helpful to know that there’s no known technical reason for the spike and that it may be representative of real-world business closure.
Hi @evan-barry-dewey, I investigated cases of closed_on = 2022-03 and 2022-04 because the spike is so large. It looks like closed_on is accurate for businesses that have non-missing visits data (i.e., raw_visit_counts is filled in at some point for the placekey), but closed_on is incorrect for the other businesses. Specifically, in the incorrect cases, the business closed well before 2022. Here are a few examples:
placekey = 222-222@5vg-7gw-5fz
placekey = 224-222@5vg-7gt-t9z
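For anyone who wants to reproduce this kind of check, here is a minimal sketch of the split described above. It assumes the places and visits tables have been loaded as lists of dicts (for example with csv.DictReader), and that closed_on is stored as a "YYYY-MM" string. The function name and the toy rows are hypothetical; only placekey, closed_on, and raw_visit_counts come from the data discussed in this thread.

```python
from collections import Counter

def split_closures_by_visits(places, patterns, months=("2022-03", "2022-04")):
    """Count placekeys closed in `months`, split by whether they ever have visits data."""
    # Placekeys with at least one non-missing raw_visit_counts value.
    has_visits = {
        row["placekey"]
        for row in patterns
        if row.get("raw_visit_counts") not in (None, "")
    }
    counts = Counter()
    for row in places:
        if row.get("closed_on") in months:
            # Key is (closed_on month, True if the placekey ever has visits data).
            counts[(row["closed_on"], row["placekey"] in has_visits)] += 1
    return counts

# Toy rows for illustration only (not real SafeGraph records).
places = [
    {"placekey": "a", "closed_on": "2022-03"},
    {"placekey": "b", "closed_on": "2022-03"},
    {"placekey": "c", "closed_on": "2022-04"},
    {"placekey": "d", "closed_on": "2021-10"},
]
patterns = [
    {"placekey": "a", "raw_visit_counts": 10},
    {"placekey": "a", "raw_visit_counts": None},
    {"placekey": "b", "raw_visit_counts": None},
    {"placekey": "c", "raw_visit_counts": 5},
]

result = split_closures_by_visits(places, patterns)
for (month, visited), n in sorted(result.items()):
    print(month, "has visits" if visited else "no visits", n)
```

If the pattern described above holds, a disproportionate share of the 2022-03 and 2022-04 closures should fall in the "no visits" group.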
Wow, great investigation. Thanks @Angela_Ma_Harvard_University. I’ll report this back to the SafeGraph team so they can take a look as well.