I have a very embarrassing question: why are my computers (both macOS and Windows) so slow at reading SafeGraph data (monthly patterns CSV)?

Jifan_He · January 1, 2021, 12:00am

Hello, everyone. I have a very embarrassing question: why are my computers (both macOS and Windows) so slow at reading SafeGraph data (monthly patterns CSV)? I should mention that on macOS the file is automatically uncompressed (the file name ends in “.csv”), and reading it in R takes hours, sometimes a whole afternoon. Does reading the unzipped file matter here? Or do I need a better computer (a faster processor and more RAM)? My Mac has 500 GB of storage, a 1.3 GHz dual-core Intel Core i5 processor, and 8 GB of RAM, and it runs macOS Big Sur 11.0.1. My Windows machine runs Windows 10 and has an Intel(R) Core™ i5-7300U CPU @ 2.60GHz 2.71 GHz, 8 GB of RAM, and 256 GB of storage.

Jack_Lindsay_Kraken1 · January 2, 2021, 6:17am

Hi @Jifan_He, do you mean reading one monthly file or all months?

Jifan_He · January 2, 2021, 7:58am

1 monthly file

Jifan_He · January 2, 2021, 7:58am

only

Difang_Huang_Monash_University · January 2, 2021, 9:41am

I have had the same experience.

Jack_Lindsay_Kraken1 · January 2, 2021, 5:19pm

Are you both writing your own functions or are you using the safegraph_r library? It may not make a difference, I am just trying to narrow it down.

For one monthly file, it should take a few minutes to read in, at least with Python. @Nick_H-K_Seattle_University can speak more to read times in R.

Could you let me know which month in particular you are reading so I can test it out?

Nick_H-K_Seattle_University · January 2, 2021, 6:02pm

The files are quite large; that’s really what it comes down to. If you are using R, tidyverse really isn’t up to the task. You should switch to data.table and fread instead (or SafeGraphR, which uses data.table internally).
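As a rough illustration of the data.table suggestion, here is a minimal sketch. It is not code from this thread: the tiny stand-in file and the column names (`safegraph_place_id`, `raw_visit_counts`, `visits_by_day`) are assumptions chosen to resemble a monthly patterns file. The idea is to read with `fread` and use `select` to load only the columns you need, which tends to matter on a machine with 8 GB of RAM.

```r
library(data.table)

# Write a tiny stand-in file so the sketch runs anywhere.
# The column names are assumptions, not taken from this thread.
path <- tempfile(fileext = ".csv")
fwrite(
  data.table(
    safegraph_place_id = c("sg:a", "sg:b"),
    raw_visit_counts   = c(120L, 45L),
    visits_by_day      = c("[3,4,5]", "[1,0,2]")
  ),
  path
)

# Read only the columns needed; skipping wide text columns
# reduces both parse time and memory use.
patterns <- fread(
  path,
  select = c("safegraph_place_id", "raw_visit_counts")
)

print(patterns)
```

For a real monthly file, `path` would point at the downloaded CSV instead of the temporary file. `fread` can also read a `.csv.gz` file directly, provided the R.utils package is installed, so unzipping first should not be necessary.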

Jifan_He · January 2, 2021, 10:57pm

Ok. I will try that out. Thanks!

Jifan_He · January 4, 2021, 2:47am

Oh, thank you so much!