In what scenario would we see larger numbers for the ‘visitor_home_cbgs’ value (for a particular cbg) coming from weekly-patterns than from monthly-patterns?

Julie · June 23, 2021, 12:00am

Hello again - I’ve got another question regarding comparison of weekly vs monthly patterns:
My understanding is that metrics like home cbg, daytime cbg, and median and bucketed dwell times are aggregated differently for the two data products (weekly vs. monthly). In what scenario would we see larger numbers for the ‘visitor_home_cbgs’ value (for a particular cbg) coming from weekly-patterns than from monthly-patterns? I am looking at overlapping time periods, for example the first week of July 2018 vs. all of July 2018. In general, it seems the number should be bigger in the monthly patterns data, because there will be more visitors to a POI in 30 days than in 7 days and therefore more visitors who live in a particular CBG. When would the reverse be true? Can all these cases be explained by the fact that a visitor can be assigned a different home cbg for the month of July than for the first week of July?

Pranav_Thaenraj_UCSD · June 23, 2021, 6:14pm

Hello @Julie, the aggregated visitor_home_cbgs values should not be greater in the weekly data than in the monthly data because, as you said, monthly takes into account 30 days while weekly covers only 7 days. Visitors will not be assigned different home CBGs in weekly vs. monthly data, because in both cases a user’s home cbg is calculated by an algorithm that uses 6 weeks of user data.

Let me know if you have any more questions.

Julie · June 23, 2021, 6:45pm

Not sure of the best way to share this output other than a screenshot for now, but what I’m showing is weekly-patterns joined to monthly-patterns for July 2018 (joined on POI and date). The visits are the same in each (as expected) but the ‘visitor_home_cbgs’ are not the same (also as expected). What is not expected is that for some CBGs, when you expand the json, the value coming from weekly is higher than the value coming from monthly. I’ll note there are many more cases where the opposite is true (also as expected), but I’m not sure how to explain the ones above.
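
For anyone who wants to reproduce this kind of check without the screenshot, here is a minimal sketch of the comparison described above. It assumes the ‘visitor_home_cbgs’ value for a POI is a JSON string mapping CBG to a count, and it treats a CBG that is missing from the monthly JSON as 0 (that is my assumption, not something the data guarantees). The function name and the sample values are made up for illustration.

```python
import json


def weekly_exceeds_monthly(weekly_json, monthly_json):
    """Return {cbg: (weekly_count, monthly_count)} where weekly > monthly.

    Both arguments are JSON strings of the form {"<cbg>": <count>, ...}.
    A CBG absent from the monthly JSON is treated as 0 (an assumption).
    """
    weekly = json.loads(weekly_json or "{}")
    monthly = json.loads(monthly_json or "{}")
    return {
        cbg: (count, monthly.get(cbg, 0))
        for cbg, count in weekly.items()
        if count > monthly.get(cbg, 0)
    }


# Made-up values for a single POI, not real data.
weekly_row = '{"060371234561": 4, "060379876543": 5}'
monthly_row = '{"060371234561": 6, "060379876543": 4, "060375555555": 9}'

flagged = weekly_exceeds_monthly(weekly_row, monthly_row)
print(flagged)
```

Running this over every joined POI row and collecting the flagged CBGs gives the list of unexpected cases.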

Pranav_Thaenraj_UCSD · June 23, 2021, 7:07pm

@Julie Is it possible for me to see your code? Can you DM me the .py or the notebook file?

Julie · June 23, 2021, 7:12pm

sure! I’ll send that to you by end of day, after a couple of meetings here. Thanks!

Pranav_Thaenraj_UCSD · June 23, 2021, 7:12pm

sounds good

Julie · June 25, 2021, 4:41pm

Hey Pranav, thinking back to this, could the explanation be related to the masking that happens at the CBG level? With monthly counts sometimes going above the 4-count threshold, ‘real’ counts could sometimes become smaller than masked (weekly) counts?
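
One rough way to test this idea is to look at how many of the unexpected cases have a weekly value sitting exactly at the threshold. The sketch below assumes the masked value is 4 (taken from the "4-count threshold" mentioned above, not verified against the documentation). The function name and the sample pairs are hypothetical.

```python
MASKED_VALUE = 4  # assumption: small counts are reported as this value


def share_at_masked_value(pairs, masked_value=MASKED_VALUE):
    """Fraction of (weekly, monthly) pairs whose weekly count equals masked_value.

    `pairs` should contain only the unexpected cases (weekly > monthly).
    Returns 0.0 for an empty input.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(1 for weekly, _monthly in pairs if weekly == masked_value)
    return hits / len(pairs)


# Made-up (weekly, monthly) pairs, not real data.
example_pairs = [(4, 0), (5, 4), (4, 0), (7, 6)]
share = share_at_masked_value(example_pairs)
print(share)
```

If most of the unexpected cases have a weekly value of exactly 4, that would be consistent with the masking explanation; if they are spread over larger values, something else is probably going on.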