SafeGraph Patterns Data - Missing values

I am using the monthly Patterns dataset for NAICS code 611310. I noticed that the colleges/universities covered in the dataset are not the same across years. That is, college “x” appears in 2019 but may not appear in 2020 or 2021, or it may appear in all of those years but be missing in some months. Could you please give me some background on why this happens? How can a university/college be included in one month and not in another?

@kairongarcia thanks for writing in.

Would you mind sharing the Placekeys of a few examples where you are seeing this happen?

Here are a few examples of placekeys of colleges/universities that do not appear for all 12 months in the following years:




May I add that there are also universities that appear in more than one year, but not in the same months. For example, Placekey 222-222@5nx-ywk-9mk appears in 2019 and 2020, but not in the same months. Similarly, Placekey 222-222@5nx-4yd-tsq appears in 2021 but not in 2020.

If you could give me some information about why this happens, that would be great! Thank you very much!

Hi @kairongarcia, I haven’t had the chance to look at these Placekeys in the data myself, but I have a couple of suggestions.

First, I’m curious whether the NAICS code you’re using might be too narrow. It’s possible that the NAICS code assigned to those POIs shifted from year to year, or even month to month. SafeGraph has made some changes to its algorithm over the years, which could play a role here.
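To test the NAICS idea, one option is to load the Patterns rows without the 611310 filter and see whether any Placekey carries more than one code over time. Below is a minimal sketch, assuming each row is a dict (for example from `csv.DictReader`) with `placekey`, `date_range_start` and `naics_code` fields; the function names and the sample rows are made up for illustration, and your files may name these columns differently.

```python
from collections import defaultdict


def naics_history(rows):
    """Map each placekey to the set of NAICS codes seen across all rows."""
    codes = defaultdict(set)
    for row in rows:
        codes[row["placekey"]].add(row["naics_code"])
    return dict(codes)


def placekeys_with_changing_naics(rows):
    """Keep only the placekeys that carried more than one NAICS code."""
    return {pk: c for pk, c in naics_history(rows).items() if len(c) > 1}


# Made-up rows for illustration only (not real SafeGraph data).
sample = [
    {"placekey": "aaa", "date_range_start": "2019-01-01", "naics_code": "611310"},
    {"placekey": "aaa", "date_range_start": "2019-02-01", "naics_code": "611210"},
    {"placekey": "bbb", "date_range_start": "2019-01-01", "naics_code": "611310"},
]

changed = placekeys_with_changing_naics(sample)
print(changed)
```

If a college shows up here with a second code, filtering on 611310 alone would drop it in the months where it carried the other code.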

Second, visit attribution for colleges can be a bit tricky because a campus may consist of “parent” and “children” POIs. I’m curious whether this is related to visits being attributed to children POIs instead of the parent, which could leave the parent POI with 0 visits in a month and cause it to be excluded from Patterns. There are a couple of things to verify here: (1) Do the Patterns files have records with 0 visits? (2) Are there children POIs that, when pulled in, show visits at the college for all expected months?
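Both checks can be sketched roughly as follows. This assumes rows are dicts with `placekey`, `parent_placekey`, `date_range_start` and `raw_visit_counts` fields read as strings from the CSV; the helper names, the `p1`/`c1` keys and the sample rows are hypothetical. The rows should not be pre-filtered by NAICS code, since children POIs may carry a different code than the college itself.

```python
def has_zero_visit_rows(rows):
    """Check (1): does any record report zero raw visits?"""
    return any(int(row["raw_visit_counts"]) == 0 for row in rows)


def months_with_records(rows, parent):
    """Check (2): months (YYYY-MM) in which the parent POI or any of
    its children has a Patterns record."""
    months = set()
    for row in rows:
        is_parent = row["placekey"] == parent
        is_child = (row.get("parent_placekey") or "") == parent
        if is_parent or is_child:
            months.add(row["date_range_start"][:7])
    return sorted(months)


# Made-up rows for illustration only (not real SafeGraph data).
sample = [
    {"placekey": "p1", "parent_placekey": "", "date_range_start": "2020-01-01",
     "raw_visit_counts": "120"},
    {"placekey": "c1", "parent_placekey": "p1", "date_range_start": "2020-02-01",
     "raw_visit_counts": "35"},
    {"placekey": "zz", "parent_placekey": "", "date_range_start": "2020-03-01",
     "raw_visit_counts": "8"},
]

zero_rows = has_zero_visit_rows(sample)
covered = months_with_records(sample, "p1")
print(zero_rows, covered)
```

The sketch only reports which months have a record; it does not add parent and child visits together, because I have not confirmed how visits are split between the two.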

Hi @ryank, thank you very much for this information. I think I am using the children Placekeys; I am seeing a lot of rows with a blank “parent_placekey”. Regarding the two things you wanted to verify: (1) The minimum “raw_visit_counts” I see is “1”; (2) I am not sure how to do this, because many rows have no “parent_placekey”, so I couldn’t group the children POIs by their parent. What do you think?
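One possible way to handle the blank values in (2) is to fall back to the row’s own Placekey whenever “parent_placekey” is empty, so that every row lands in some group. This is only a sketch under the assumption that a blank “parent_placekey” means the POI has no enclosing parent, which is worth confirming against the SafeGraph schema documentation; the function names and sample rows are made up for illustration.

```python
from collections import defaultdict


def group_key(row):
    """Use the parent placekey when present, otherwise the row's own placekey."""
    parent = (row.get("parent_placekey") or "").strip()
    return parent if parent else row["placekey"]


def months_by_group(rows):
    """Collect the months (YYYY-MM) in which each group has any record."""
    seen = defaultdict(set)
    for row in rows:
        seen[group_key(row)].add(row["date_range_start"][:7])
    return dict(seen)


def missing_months(rows, expected):
    """For each group, list the expected months with no record at all."""
    return {
        key: sorted(set(expected) - months)
        for key, months in months_by_group(rows).items()
    }


# Made-up rows for illustration only (not real SafeGraph data).
sample = [
    {"placekey": "u1", "parent_placekey": "", "date_range_start": "2021-01-01"},
    {"placekey": "c1", "parent_placekey": "u1", "date_range_start": "2021-02-01"},
    {"placekey": "u2", "parent_placekey": None, "date_range_start": "2021-01-01"},
]

gaps = missing_months(sample, ["2021-01", "2021-02"])
print(gaps)
```

Groups that still show gaps after this roll-up are the ones where neither the college nor any of its children has a record for that month.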