I have a quick question about the number of "total devices seen" available in normalization-stats.csv vs. "number of devices residing" available in home-panel-summary.csv

Hi there Ryan @Ryan_Fox_Squire_SafeGraph and Jack @Jack_Lindsay_Kraken1, I have a quick question about the number of “total devices seen” available in normalization-stats.csv vs. “number of devices residing” available in home-panel-summary.csv (referring to the Weekly Patterns data). “total devices seen” is daily and “number of devices residing” is weekly. Besides this, what is the difference between the two? Are there devices that are not residing anywhere? If not, then the sum of “number of devices residing” across all CBGs should be a good weekly measure of the total devices seen, but perhaps it’s not so simple… Many thanks in advance for your reply!

Hi @Etienne_Lale_University_of_Quebec_at_Montreal, you’re pretty much on the right track. First, (in case you haven’t seen it) the documentation page shows how each variable is defined. There are some fundamental differences:

  1. home_panel_summary.number_devices_residing is the number of devices that have a primary nighttime location in the given CBG. For weekly Patterns, it is given weekly.
  2. On the other hand, normalization_stats.total_devices_seen gives the number of devices that had a visit in the given STATE. It is given daily. (It is also given in total for all states.)

Additionally, some devices cannot be confidently assigned a home location (home locations are assigned using a six-week rolling window), so number_devices_residing will not necessarily give you the same number as total_devices_seen. Important note: number_devices_residing is commonly used for “cbg-level micro-normalization” to try to correct for changes in SafeGraph’s underlying panel over time.
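In case it helps to see the comparison concretely, here is a minimal sketch using toy data in place of the real files. The numbers are made up, and the column layout and the `ALL_STATES` region label are assumptions on my part, so please check them against the documentation before relying on this. It sums number_devices_residing over CBGs for one week and sets that against each day's total_devices_seen.

```python
import csv
import io

# Toy stand-ins for the two files. All values are invented for illustration,
# and the column names / "ALL_STATES" label are assumptions, not confirmed here.
HOME_PANEL_CSV = """census_block_group,number_devices_residing
010010201001,120
010010201002,80
010010202001,50
"""

NORMALIZATION_CSV = """year,month,day,region,total_devices_seen
2020,11,30,ALL_STATES,300
2020,12,1,ALL_STATES,310
2020,11,30,al,290
"""


def sum_devices_residing(csv_text):
    """Sum number_devices_residing over every CBG row for one week."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(int(row["number_devices_residing"]) for row in reader)


def daily_devices_seen(csv_text, region):
    """Map (year, month, day) to total_devices_seen for a single region."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        (int(row["year"]), int(row["month"]), int(row["day"])): int(row["total_devices_seen"])
        for row in reader
        if row["region"] == region
    }


weekly_residing = sum_devices_residing(HOME_PANEL_CSV)
seen_by_day = daily_devices_seen(NORMALIZATION_CSV, "ALL_STATES")

# Devices seen but not counted as residing anywhere show up as a positive gap.
for day, seen in sorted(seen_by_day.items()):
    print(day, "seen:", seen, "residing:", weekly_residing, "gap:", seen - weekly_residing)
```

Note that the daily total_devices_seen values should not simply be added up over the week, since a device seen on several days would be counted more than once.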

Great, many thanks @Ryan_Kruse_MN_State, very useful!!

Hi @Etienne_Lale_University_of_Quebec_at_Montreal, thanks and same to you! There was a recent change in the methodology used to determine number_devices_residing. A general announcement was made here. The methodology change also included an update to how the home_panel_summary is determined. Please see this message, which addresses the change. Finally, please check out the release notes for December 2020. The goal of these updates is to provide more consistent normalization over time.

I believe the first linked message gives the solution to what you are seeing (the other links provide additional information/context). For weekly data from Jan 01 2018 to Nov 25 2020, you will need the backfilled data produced with the new method, in sg-c19-response/weekly-patterns-delivery-2020-12-backfill/. For weekly data from Nov 26 2020 on, download from sg-c19-response/weekly-patterns-delivery-2020-12/weekly/, which also uses the new method. Then you will have a consistent methodology for your entire dataset.
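If you are scripting the downloads, the date split above could be sketched roughly as follows. The function name is hypothetical, and which date of a weekly file you compare against the cutoff (I use the week's start date here) is an assumption, so adjust as needed.

```python
from datetime import date

# Prefixes and date ranges as given above.
BACKFILL_PREFIX = "sg-c19-response/weekly-patterns-delivery-2020-12-backfill/"
CURRENT_PREFIX = "sg-c19-response/weekly-patterns-delivery-2020-12/weekly/"
BACKFILL_START = date(2018, 1, 1)
BACKFILL_END = date(2020, 11, 25)


def prefix_for_week(week_start):
    """Return the prefix holding new-methodology data for the given week.

    Hypothetical helper: comparing on the week's start date is an assumption.
    """
    if week_start < BACKFILL_START:
        raise ValueError("no new-methodology data before Jan 01 2018")
    if week_start <= BACKFILL_END:
        return BACKFILL_PREFIX
    return CURRENT_PREFIX


print(prefix_for_week(date(2020, 3, 2)))
print(prefix_for_week(date(2020, 11, 30)))
```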

Is this helpful? Please let me know if you have any questions!

Hi @Ryan_Kruse_MN_State, thanks very much for your reply, and thanks for all these useful links. Quick follow-up question: have the home_panel_summary.csv and normalization_stats.csv files provided in the SG catalog repository been adjusted all the way back, i.e. have they been replaced with csv files that contain the counts of devices seen and home devices based on the new methodology? (Concretely, do I need to download all these files again to get a time-consistent measure of the devices seen and home devices :confused:?) Many thanks in advance for your reply! :blush:

Many thanks @Ryan_Kruse_MN_State, this is extremely useful to know!