In the pattern data sets, why is there a difference between the raw_visitor_counts column and the visitor_home_cbgs column? For example, in the first row in the sample Minnesota, USA Patterns dataset, there are 10 raw_visitor_counts but only 8 total visitor_home_cbgs. What causes this difference? Thanks!
To preserve privacy, we apply differential privacy techniques to the following columns: visitor_home_cbgs, visitor_home_aggregation, visitor_daytime_cbgs, visitor_country_of_origin, device_type, and carrier_name.
We add Laplacian noise to the values in these columns. After the noise is added, only attributes (e.g., a census block group) with at least two devices are included in the data, and any count between 2 and 4 is reported as 4. As a result, the counts listed in visitor_home_cbgs do not necessarily sum to raw_visitor_counts.
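To make the effect concrete, here is a minimal Python sketch of the steps described above (add noise, drop attributes with fewer than two devices, report counts from 2 to 4 as 4). It is an illustration only, not the actual production pipeline: the function names, the noise scale, the rounding step, and the census block group IDs are all assumptions made up for this example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution."""
    if scale == 0:
        return 0.0
    # The difference of two i.i.d. exponential draws is Laplace-distributed.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def censor(noisy_counts, min_devices=2, floor=4):
    """Drop attributes below min_devices; report smaller kept counts as floor."""
    out = {}
    for key, n in noisy_counts.items():
        if n < min_devices:
            continue  # attribute is omitted from the published column
        out[key] = max(n, floor)  # counts of 2, 3, or 4 are all reported as 4
    return out


def privatize(true_counts, scale=1.0, seed=None):
    """Add Laplace noise to each count, round, then apply the censoring rules.

    The scale value and the rounding to whole devices are assumptions;
    the real parameters are not stated in this thread.
    """
    rng = random.Random(seed)
    noisy = {k: round(v + laplace_noise(scale, rng)) for k, v in true_counts.items()}
    return censor(noisy)


# Hypothetical place with 10 visitors spread over five made-up home CBGs.
true_home_cbgs = {"cbg_A": 4, "cbg_B": 3, "cbg_C": 1, "cbg_D": 1, "cbg_E": 1}

# With the noise switched off (scale=0), only the censoring rules apply.
published = privatize(true_home_cbgs, scale=0)
print(published)                # {'cbg_A': 4, 'cbg_B': 4}
print(sum(published.values()))  # 8, even though there were 10 visitors
```

With a non-zero scale the published counts also vary randomly around the true values, so the gap between the two columns can be larger or smaller than in this noise-free example.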