I have been working with the Places Patterns Monthly Dataset, and I have some questions regarding the `raw_visit_count` metric

Hi all,

I have been working with the Places Patterns Monthly Dataset, and I have some questions regarding the raw_visit_count metric.

I am aware that this metric is not scaled or normalized in any way. Nevertheless, many of the POI visit counts seem unreasonably low.

For example, during 2018 and 2019:

sg:00d096c86cb64771914acb41e72577f0, a Prada store at 301 Canal St. in New Orleans, has a mean of 1 visit per month.

sg:062d8f0d70604ad2808d680f70050c0b, a Smoothie King at 500 Port of New Orleans Pl #C, has a mean of 1 visit per month.

sg:29f32ae3874a427281bf4b7daa18d390, a New Balance retailer at 500 Port of New Orleans Pl, has a mean of 1 visit per month.

sg:350bbaa135374ea8ae6a67168f55a82f, a YMCA at 2220 Oretha Castle Haley Blvd in New Orleans, has a mean of 1 visit per month.

sg:26b8ca382fcc4184822167f754837e76, a T-Mobile at 2700 S Claiborne Ave #300 in New Orleans, has a mean of 1.5 visits per month.

There are many other examples.

It doesn’t appear that these locations were closed at the time of the sample.

I am also wondering what happens when an active POI gets no visits during a given month. In this case, it appears that the entire row is omitted from the data. Is this correct?

Many of the low visit count POIs do not have a full 24 months of observations. However, these POIs appear to drop in and out of the sample, which implies that they were not closed for the missing months. If these missing months are actually months in which SafeGraph observed no visits at all, the mean visit counts that I report above are overestimates of what SafeGraph observed, as I omit months in which no visits occurred. This makes the POI visit counts listed above seem even more unlikely.
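To make the size of that overestimate concrete, here is a minimal sketch in plain Python. It assumes that a missing row means zero observed visits and that the POI was open for the full 24-month window. The function name `monthly_means` and the visit counts are hypothetical, made up purely for illustration.

```python
from statistics import mean

def monthly_means(observed, n_months=24):
    """Compare two ways of averaging raw_visit_count for one POI.

    observed: dict mapping 'YYYY-MM' -> raw_visit_count, containing only
              the months for which a row exists in the data.
    n_months: length of the full window (assumption: POI open throughout).
    """
    counts = list(observed.values())
    # Mean over the months that appear in the data (what I reported above).
    mean_observed = mean(counts) if counts else 0.0
    # Mean over the full window, treating every missing month as zero visits.
    mean_zero_filled = sum(counts) / n_months
    return mean_observed, mean_zero_filled

# Hypothetical POI that appears in only 6 of the 24 months of 2018-2019.
observed = {
    "2018-01": 1, "2018-04": 2, "2018-09": 1,
    "2019-02": 1, "2019-07": 2, "2019-11": 2,
}
obs_mean, filled_mean = monthly_means(observed)
print(obs_mean, filled_mean)  # 1.5 0.375
```

In this made-up case the mean over observed months is four times the zero-filled mean, because the POI appears in only a quarter of the window.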

This topic was automatically generated from Slack. You can find the original thread here.

Hi, thanks for reaching out! We are looking into it and will get back to you once we have an answer.

  1. Yes, if the POI gets no visits in the month the row is omitted from the data. You’ll often see POIs with very low visit counts drop in and out of months in the data.
  2. Remember that our panel of devices is a subset of the whole U.S. population, so at the level of individual POIs, our panel has varying coverage of the people who actually visit such places. These POIs are outliers, for sure, and we would recommend omitting them if you are doing any brand- or category-level aggregation (e.g., if looking at visits to all Prada stores, omit this particular POI).
  3. See below for different threads where this has been asked a few times:
    a. Slack
    b. Slack
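As a rough illustration of the recommendation in point 2, the sketch below drops sparsely covered POIs before summing visits by brand. It is only a sketch under stated assumptions: the function name, the tuple layout, the POI ids, and the `min_mean_visits` cutoff are all hypothetical, and the right threshold depends on your analysis.

```python
from collections import defaultdict

def brand_totals_excluding_sparse(rows, n_months, min_mean_visits):
    """Sum raw visits per brand, skipping POIs with very low coverage.

    rows: iterable of (poi_id, brand, raw_visit_count), one entry per
          POI-month that is present in the data (hypothetical layout).
    n_months: length of the study window; missing months count as zero.
    min_mean_visits: hypothetical cutoff on mean visits per month.
    """
    visits_by_poi = defaultdict(int)
    brand_of = {}
    for poi_id, brand, visits in rows:
        visits_by_poi[poi_id] += visits
        brand_of[poi_id] = brand

    totals = defaultdict(int)
    dropped = []
    for poi_id, total in visits_by_poi.items():
        if total / n_months < min_mean_visits:
            dropped.append(poi_id)  # outlier: too few observed visits
            continue
        totals[brand_of[poi_id]] += total
    return dict(totals), dropped

# Made-up data for a two-month window.
rows = [
    ("poi_a", "Prada", 120), ("poi_a", "Prada", 95),
    ("poi_b", "Prada", 1),  # appears in only one month
    ("poi_c", "Prada", 80), ("poi_c", "Prada", 70),
]
totals, dropped = brand_totals_excluding_sparse(rows, n_months=2, min_mean_visits=5)
print(totals, dropped)  # {'Prada': 365} ['poi_b']
```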

Thanks for this information. I noticed this post in the second thread you linked, regarding the 2018 data. I was surprised by the claim that the 2018 data “was created via Backfill, which is dependent on patterns seen later / expected, not necessarily ‘real’ data from that time.” Is this correct? My understanding of the backfill is that it takes raw foot traffic data collected in 2018 and applies it to the most up-to-date Core Places/Geometry definitions, so the 2018 data (while created by backfill) is based on real 2018 data, not on data imputed from later periods. Am I misunderstanding this?

That doesn’t seem correct to me. It could be either a miscommunication or a change in how the data have been produced since that time. Your understanding is correct: we do not impute any data, either for 2018 or for other years.

Hi! Just confirming that we answered your question. I’m going to go ahead and close this thread out. If you have any more questions or follow-up questions, we’re always here to help! Just be sure to make a new post to safegraphdata, as we aren’t monitoring old threads at this time. Thanks!