Hello there! I am trying to look at the origin locations of visitors to all POIs in my region of interest, and I have the following questions:
-
When doing the ‘Micro’ normalization of visitor_home_cbgs to estimate the true number of visitors to each POI from each origin CBG, I see that a Colab tutorial drops every visitor_home_cbgs entry with a count below 5. The idea is to eliminate any CBG reported as 4 visitors, because that value could correspond to anywhere from 2 to 4 actual sampled visitors due to the noise added for privacy. So my question is: if we filter out all visitor_home_cbgs entries <5 and then apply the equation
(SG CBG visitor raw count / SG CBG sample size) x CBG population
to estimate the number of visitors to that POI from that specific CBG, how can we rely on the result when we are filtering out a lot of the visitors to each POI? How meaningful would an analysis be if every CBG with fewer than 5 visitors is filtered out?
-
Similar to Q1, I am comparing the number of visitor_home_cbgs entries with counts = 4 in a monthly patterns dataset vs. a weekly patterns dataset. Since a weekly patterns file covers roughly 1/4 of the sampling period, wouldn’t we expect a much larger share of visitor_home_cbgs counts to equal 4 and thus be filtered out? I’m wondering whether a study like this would produce vastly different answers for monthly vs. weekly patterns because of the relatively large number of counts equal to 4.
-
Also, has anyone tried to do this for thousands of POIs in a city to generate a map of origin locations? The process of exploding visitor_home_cbgs, then summing each CBG’s visitor count and adding it to a master list of all CBGs in the country, seems like it would be a nested-loop nightmare, and I haven’t been able to find any resources from someone who has tried this.
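To make the three questions concrete, here is a minimal sketch of the process as I understand it, not the tutorial's actual code. It assumes each patterns row is a dict with a `placekey` identifier and a `visitor_home_cbgs` JSON string, and that the per-CBG sample size and census population are available as plain lookups; the names `devices_residing`, `cbg_population`, `PRIVACY_FLOOR`, and all of the data values are hypothetical and made up for illustration.

```python
import json
from collections import defaultdict

# Assumption: a reported count of 4 may stand for 2-4 sampled devices,
# so dropping counts <= 4 is the same as the "<5" filter described above.
PRIVACY_FLOOR = 4


def explode_home_cbgs(rows):
    """Yield (placekey, cbg, count) for every entry in visitor_home_cbgs."""
    for row in rows:
        raw = row.get("visitor_home_cbgs") or "{}"
        for cbg, count in json.loads(raw).items():
            yield row["placekey"], cbg, int(count)


def estimate_visitors(rows, devices_residing, cbg_population, drop_floor=True):
    """Return {cbg: estimated visitors}, summed over all POIs in rows."""
    raw_totals = defaultdict(int)
    for _poi, cbg, count in explode_home_cbgs(rows):
        if drop_floor and count <= PRIVACY_FLOOR:
            continue  # skip privacy-adjusted entries
        raw_totals[cbg] += count

    estimates = {}
    for cbg, raw in raw_totals.items():
        devices = devices_residing.get(cbg)
        population = cbg_population.get(cbg)
        if not devices or population is None:
            continue  # cannot normalize without both lookups
        # (raw count / sample size) x CBG population
        estimates[cbg] = raw / devices * population
    return estimates


def share_at_floor(rows):
    """Fraction of (POI, CBG) entries reported exactly at the privacy floor."""
    counts = [count for _poi, _cbg, count in explode_home_cbgs(rows)]
    if not counts:
        return 0.0
    return sum(c == PRIVACY_FLOOR for c in counts) / len(counts)


# Toy data, invented for illustration only.
rows = [
    {"placekey": "poi-a",
     "visitor_home_cbgs": '{"010010201001": 4, "010010201002": 10}'},
    {"placekey": "poi-b",
     "visitor_home_cbgs": '{"010010201002": 6, "010010202001": 4}'},
]
devices_residing = {"010010201001": 50, "010010201002": 80, "010010202001": 40}
cbg_population = {"010010201001": 1000, "010010201002": 1600, "010010202001": 900}

filtered = estimate_visitors(rows, devices_residing, cbg_population)
unfiltered = estimate_visitors(rows, devices_residing, cbg_population,
                               drop_floor=False)
floor_share = share_at_floor(rows)

print(sorted(filtered), sorted(unfiltered), floor_share)
```

In this toy case the filter removes two of the three origin CBGs, which is the effect Q1 asks about, and running `share_at_floor` on a weekly file and a monthly file would be one way to check Q2. Because the totals live in a dictionary keyed by CBG, the whole thing is a single pass over the exploded entries and only touches CBGs that actually appear, so no loop over a master list of every CBG in the country is needed.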
Sorry if this was confusing. Let me know if I should restate my questions more succinctly!
Link to the Colab notebook explaining the process for estimating population counts of visitors from each CBG: Google Colab