@Ryan_Fox_Squire_SafeGraph Hi Ryan (and others). (1) Thank you for all this. (2) Reviewing and thinking about your Colab notebook in which you estimate demographic profiles for a brand (aggregated POIs), there’s something that’s baffling me. The process for getting a demographic profile for an individual POI makes sense to me. But when you aggregate POIs, say for a brand, we can’t sum the unique visitors to each POI to arrive at the total unique visitors to that brand, right? If Starbucks POI 1 gets 40 visitors from CBG x and Starbucks POI 2 gets 20 visitors from CBG x, I can’t see any way to figure out the extent to which these visitors overlap. If I’m understanding this correctly, doesn’t this prevent us from knowing what percent of a brand’s visitors come from a particular CBG? Thanks, hopefully this isn’t too convoluted, and let me know if I can clarify.
This is correct. There aren’t really any individual-level crosstabs in the data, so it’s not possible to avoid double-counting when aggregating POIs. The baseline approach is effectively to assume no overlap and to recognize that the result may overcount visitors. So an apparent increase in visitors from a particular CBG could be either an increase in total visitors or an increase in how many POIs each visitor from that CBG goes to. This should be taken into account in interpretation.
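To make the no-overlap baseline concrete, here is a minimal sketch in plain Python. The POI and CBG labels and the counts are made up for illustration (the 40 and 20 come from the question above), and `brand_visitor_bounds` is a hypothetical helper, not something from the Colab notebook. It sums per-POI visitor counts by CBG, which is the no-overlap upper bound, and also tracks the largest single-POI count, which is the lower bound you would get if the visitors overlapped completely.

```python
from collections import defaultdict

# Hypothetical per-POI unique-visitor counts by home CBG for one brand.
poi_visitors_by_cbg = {
    "poi1": {"cbg_x": 40, "cbg_y": 10},
    "poi2": {"cbg_x": 20, "cbg_y": 30},
}


def brand_visitor_bounds(poi_counts):
    """Return {cbg: (lower, upper)} bounds on unique brand visitors.

    upper: sum across POIs (assumes no visitor goes to two POIs).
    lower: max across POIs (assumes complete overlap between POIs).
    """
    upper = defaultdict(int)
    lower = defaultdict(int)
    for counts in poi_counts.values():
        for cbg, n in counts.items():
            upper[cbg] += n
            lower[cbg] = max(lower[cbg], n)
    return {cbg: (lower[cbg], upper[cbg]) for cbg in upper}


bounds = brand_visitor_bounds(poi_visitors_by_cbg)

# Baseline share of brand visitors per CBG, using the no-overlap sums.
naive_total = sum(upper_n for _, upper_n in bounds.values())
naive_share = {cbg: upper_n / naive_total for cbg, (_, upper_n) in bounds.items()}

print(bounds)
print(naive_share)
```

The true unique count for each CBG lies somewhere between the two bounds, and the data alone cannot say where. The baseline share is only exact if nobody visits more than one location.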
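Double-counting is also a reason why relative changes over time are generally preferred to absolute counts: if the amount of overlap is roughly stable from period to period, it largely cancels out of the comparison.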
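A small sketch of why relative changes are more robust, using made-up numbers. The key assumption, which the data cannot confirm, is that the average number of brand locations each visitor goes to stays constant between the two periods. Under that assumption the overcount scales both periods by the same factor, so the percent change in the summed counts matches the percent change in true unique visitors.

```python
def relative_change(previous, current):
    """Fractional change from previous to current (0.25 means +25%)."""
    if previous == 0:
        raise ValueError("previous count must be nonzero")
    return (current - previous) / previous


# Hypothetical summed (double-counted) visitors from one CBG in two months.
summed_month_1 = 60
summed_month_2 = 75

# ASSUMPTION: each visitor goes to 1.5 locations on average in both months.
locations_per_visitor = 1.5
unique_month_1 = summed_month_1 / locations_per_visitor
unique_month_2 = summed_month_2 / locations_per_visitor

change_in_summed = relative_change(summed_month_1, summed_month_2)
change_in_unique = relative_change(unique_month_1, unique_month_2)

print(change_in_summed, change_in_unique)
```

If the overlap itself shifts between periods, the two changes diverge, which is the interpretation caveat raised above.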
I agree with @Nick_H-K_Seattle_University
I’ll only add that for most places and most brands, people rarely visit more than one location per week or per month. Starbucks is a notable exception, and some other coffee and fast-food chains may also see a lot of cross-store visits, but brands like Walmart, Safeway, or Red Robin Gourmet Burgers rarely have multiple locations in a particular market.
So I think for many brands it is a fairly comfortable assumption.
@Andy_Martens_NYC_Dept_of_Health is there any useful aggregation info we could provide that would help solve this? For example, if we provided a total unique visitors per brand per month aggregation, would that make a significant difference?
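As a rough illustration of what such an aggregation could enable: the sketch below assumes a brand-level unique visitor count were available, which is purely hypothetical here and not an existing field. Dividing the sum of per-POI unique visitors by that count gives the average number of distinct locations each visitor went to, which is a direct measure of how much the no-overlap sum overcounts. All numbers are made up.

```python
def locations_per_unique_visitor(per_poi_unique_visitors, brand_unique_visitors):
    """Average number of distinct brand locations visited per unique visitor.

    per_poi_unique_visitors: unique-visitor counts, one per POI of the brand.
    brand_unique_visitors: HYPOTHETICAL brand-wide deduplicated count.
    A result of 1.0 means no overlap; larger values mean more double-counting.
    """
    if brand_unique_visitors <= 0:
        raise ValueError("brand_unique_visitors must be positive")
    return sum(per_poi_unique_visitors) / brand_unique_visitors


# Made-up example: three POIs and a hypothetical brand-level total.
overcount_factor = locations_per_unique_visitor([300, 200, 100], 400)
print(overcount_factor)
```

A brand-wide factor like this would not by itself say how the overlap is distributed across CBGs, so applying it uniformly to every CBG would be a further assumption.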
@Ryan_Fox_Squire_SafeGraph @Nick_H-K_Seattle_University Thanks. And thanks, Ryan, for the possibility of an additional metric to help disentangle things. At this point I’m just starting to get into the data and am not yet sure what kind of POI aggregation would be useful (but probably along the lines of NAICS code categorization within NYC, as opposed to brands). It’s great for me to keep that in mind going forward as analyses and ideas crystallize.