In what scenario would we see larger numbers for the ‘visitor_home_cbgs’ value (for a particular cbg) coming from weekly-patterns than from monthly-patterns?

Hello again - I’ve got another question regarding comparison of weekly vs monthly patterns:
My understanding is that metrics like home cbg, daytime cbg, and median and bucketed dwell times are aggregated differently for the two data products (weekly vs. monthly). In what scenario would we see larger numbers for the ‘visitor_home_cbgs’ value (for a particular cbg) coming from weekly-patterns than from monthly-patterns? I am looking at overlapping time periods: the first week of July 2018 vs. all of July 2018, for example. In general, it seems that the number should be bigger in the monthly patterns data, because there will be more visitors to a POI in 30 days than in 7 days and therefore more visitors who live in a particular CBG. When would the reverse be true? Can all these cases be explained by the fact that a visitor can be assigned a different home cbg for the month of July than for the first week of July?

Hello @Julie, the values in visitor_home_cbgs should not be greater for the weekly data than for the monthly data because, as you said, monthly covers 30 days while weekly covers only 7. Visitors will not be assigned different home cbgs in the weekly vs. monthly data, because in both cases the user’s home cbg is calculated by an algorithm that uses 6 weeks of user data.

Let me know if you have any more questions.

Not sure of the best way to share this output other than a screenshot for now, but what I’m showing is weekly-patterns joined to monthly-patterns for July 2018 (joined on POI and date). The visits are the same in each (as expected), but the ‘visitor_home_cbgs’ are not the same (also as expected). What is not expected is that for some CBGs, when you expand the json, the value coming from weekly is higher than the value coming from monthly. I’ll note there are many more cases where the opposite is true (also as expected), but I’m not sure how to explain the ones above.
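
For anyone following along, the comparison described above can be sketched roughly as follows. This is a minimal illustration, not the actual notebook: the function name and the sample rows are hypothetical, and it assumes each ‘visitor_home_cbgs’ cell is a JSON string mapping a CBG id to a visitor count.

```python
import json


def cbgs_where_weekly_exceeds_monthly(weekly_json, monthly_json):
    """Return {cbg: (weekly_count, monthly_count)} for CBGs whose weekly count is larger."""
    weekly = json.loads(weekly_json)
    monthly = json.loads(monthly_json)
    # Assumption: a CBG absent from the monthly JSON is treated as a count of 0.
    return {
        cbg: (count, monthly.get(cbg, 0))
        for cbg, count in weekly.items()
        if count > monthly.get(cbg, 0)
    }


# Hypothetical values for one joined POI row; not real data.
weekly_row = '{"010010201001": 6, "010010202002": 4}'
monthly_row = '{"010010201001": 5, "010010202002": 9}'

unexpected = cbgs_where_weekly_exceeds_monthly(weekly_row, monthly_row)
print(unexpected)
```

Running this over every joined row would list only the CBGs where the weekly value is the larger one, which are the cases in question.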

@Julie is it possible for me to see your code? Can you DM me the .py or the notebook file?

sure! I’ll send that to you by end of day, after a couple of meetings here. Thanks!

sounds good :slightly_smiling_face:

Hey Pranav, thinking back to this, could the explanation be related to the masking that happens at the CBG level? With monthly counts sometimes going above the 4-count threshold, could the ‘real’ (monthly) counts sometimes end up smaller than the masked (weekly) counts?
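
One way to probe this masking idea on the joined data is to flag, for each unexpected case, whether either count sits exactly at the threshold. The sketch below is only a diagnostic under stated assumptions: the function name and sample counts are hypothetical, the threshold of 4 is taken from the question above, and it assumes a masked count is reported as exactly the threshold value.

```python
def flag_threshold_cases(weekly, monthly, threshold=4):
    """For CBGs where weekly > monthly, record whether either count equals the threshold."""
    flagged = []
    for cbg, weekly_count in weekly.items():
        monthly_count = monthly.get(cbg)
        # Only compare CBGs that appear in both the weekly and monthly data.
        if monthly_count is not None and weekly_count > monthly_count:
            flagged.append({
                "cbg": cbg,
                "weekly": weekly_count,
                "monthly": monthly_count,
                "weekly_at_threshold": weekly_count == threshold,
                "monthly_at_threshold": monthly_count == threshold,
            })
    return flagged


# Hypothetical expanded counts for one POI; not real data.
weekly_counts = {"010010201001": 5, "010010202002": 7, "010010203003": 4}
monthly_counts = {"010010201001": 4, "010010202002": 6, "010010203003": 12}

for case in flag_threshold_cases(weekly_counts, monthly_counts):
    print(case)
```

If many of the unexpected cases have neither value at the threshold, then under the assumption above masking on its own would probably not account for them.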