Do we have a better solution to such an issue?

Fabio_VANNI · May 27, 2021, 12:00am

I noticed an important selection bias in the device_count data relative to the actual number of individuals living in each census block. I currently correct for it by assuming that the proportions remain the same, in order to recover a balanced stratified sample. But I suspect some bias still remains. Do we have a better solution to such an issue?

Martin_Andersen_UNC_Greensboro · May 27, 2021, 7:11pm

@Fabio_VANNI what are you seeing there?

Fabio_VANNI · May 27, 2021, 7:12pm

it was an error

Ryan_Kruse_MN_State · May 27, 2021, 8:10pm

Hi @Fabio_VANNI, the Data Science Resources have some info on how you might correct some of this bias.

I would also suggest you take a look at this paper/presentation, which finds that SafeGraph’s panel tends to underrepresent minorities and older people. The work was featured in a past webinar, much like the one you gave today.

The approach we currently think works best is micronormalization by home CBG. This doc is listed under the Data Science Resources I linked to above. Essentially, we scale the number of visitors to each POI from each CBG in visitor_home_cbgs by the CBG’s Census population divided by the number of devices in SafeGraph’s panel that we identify as residing in that CBG (home_panel_summary.number_devices_residing). Is this helpful?
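
Here is a minimal sketch of that scaling in Python, in case it helps. The function name micronormalize, the CBG IDs, and all the numbers are made up for illustration. It assumes visitor_home_cbgs has already been parsed into a dict of CBG to visitor count for one POI, and that you have the Census population and number_devices_residing for each CBG as plain dicts. Skipping CBGs with no population or no panel devices is my own choice here, not something the doc prescribes.

```python
def micronormalize(visitor_home_cbgs, census_population, devices_residing):
    """Scale one POI's per-CBG visitor counts by population / panel devices.

    All three arguments are dicts keyed by CBG ID (hypothetical layout).
    Returns a dict of CBG ID -> estimated number of visitors in the
    full population.
    """
    scaled = {}
    for cbg, visitors in visitor_home_cbgs.items():
        population = census_population.get(cbg)
        devices = devices_residing.get(cbg, 0)
        # Assumption: without a population or with zero panel devices we
        # cannot form a weight, so the CBG is skipped rather than guessed.
        if population is None or devices <= 0:
            continue
        weight = population / devices  # people represented per panel device
        scaled[cbg] = visitors * weight
    return scaled


# Made-up example values for a single POI.
visitor_home_cbgs = {"370810101001": 8, "370810102002": 5, "370810103003": 4}
census_population = {"370810101001": 1200, "370810102002": 900}
devices_residing = {"370810101001": 60, "370810102002": 30, "370810103003": 0}

estimated = micronormalize(visitor_home_cbgs, census_population, devices_residing)
total_estimated_visitors = sum(estimated.values())
```

With these numbers, each panel device in the first CBG stands for 20 residents and each device in the second stands for 30, so visitors from the less-sampled CBG get weighted up more. The third CBG is dropped because there is nothing to compute a weight from.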

Ryan_Kruse_MN_State · June 1, 2021, 4:00pm

@Fabio_VANNI Following up here too to make sure you saw this message

Fabio_VANNI · June 1, 2021, 4:29pm

yes