I have a question regarding the data. Shouldn't raw_visitor_counts and visitor_home_cbgs be the same value, at least once you add up all the values for each row in visitor_home_cbgs? Isn't visitor_home_cbgs counting the unique visitors, just broken down by the CBG in which they live? Please get back to me as soon as possible.
This topic was automatically generated from Slack. You can find the original thread here.
raw_visitor_counts: Number of unique visitors from our panel to this POI during the date range.
visitor_home_cbgs: A mapping of census block groups (CBGs) to the number of visitors to the POI whose home is in that census block group.
With this information, we see that raw_visitor_counts counts every unique visitor to a POI, whereas each entry in visitor_home_cbgs counts only the visitors to the POI whose home is in that CBG (essentially, the visitors who live there).
At times, apparent “discrepancies” like the ones you noticed may appear. These are explained by our privacy noise algorithm: we add noise to the individual visitor_home_cbgs counts to help protect privacy. This jittering can, in some cases, cause the sum of the visitor_home_cbgs values to exceed raw_visitor_counts.
raw_visitor_counts is not jittered and is not “adjusted” based on the noise added to visitor_home_cbgs.
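If you want to see how far apart the two fields are for a given row, here is a minimal Python sketch. It assumes visitor_home_cbgs arrives as a JSON object string mapping CBG IDs to counts. The helper name sum_home_cbg_visitors and the example row (CBG IDs and counts) are made up for illustration and are not real data.

```python
import json


def sum_home_cbg_visitors(visitor_home_cbgs: str) -> int:
    """Sum the per-CBG visitor counts in a JSON object string.

    Hypothetical helper: assumes the field looks like '{"<cbg_id>": <count>, ...}'.
    """
    if not visitor_home_cbgs:
        return 0
    return sum(json.loads(visitor_home_cbgs).values())


# Made-up example row; the CBG IDs and counts are illustrative only.
row = {
    "raw_visitor_counts": 20,
    "visitor_home_cbgs": '{"010010201001": 9, "010010201002": 4, "010010202001": 4}',
}

cbg_total = sum_home_cbg_visitors(row["visitor_home_cbgs"])

# Positive: the per-CBG counts sum to more than raw_visitor_counts.
# Negative: they sum to less. Neither case indicates a data error by itself.
difference = cbg_total - row["raw_visitor_counts"]
print(cbg_total, row["raw_visitor_counts"], difference)
```

Because the per-CBG counts carry noise and raw_visitor_counts does not, a nonzero difference on a single row is expected. Treat the summed value as approximate rather than as a second copy of raw_visitor_counts.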
Does that answer your question?
For more details, you can check out this conversation thread:
Additionally, we only map a CBG to a visitor if we were able to confidently assign a home location to the device. See here for more details on how we determine device home locations: Patterns | SafeGraph Docs
Hi - To prevent any further questions from being overlooked, I’ll go ahead and close this thread out. If you have any more questions or follow-up questions, we’re always here to help! Just be sure to make a new post to safegraphdata, as we aren’t monitoring old threads at this time. Thanks!