Hi, given the massive size of the data, I am hoping to get a sense of what other Stata users have (successfully) tried in Stata - has anyone managed to import/append the weekly patterns data in Stata (if so, how long did it take?). Are there any ways to filter the data (e.g., restrict to just one state) before importing/appending all the csv files?
I have not actually worked with SafeGraph data in Stata myself, but I have worked with large CSVs in Stata in the past. Unfortunately, CSV load-in speed is pretty slow in Stata, and `import delimited` does not have a `keep` option.
What it does have, though, are `rowrange()` and `colrange()`. If it’s choking on the full file, then I’d recommend finding the number of rows in the data, limiting `colrange()` to be just what you need, and then constructing a `foreach` loop to load in the rows a few hundred thousand at a time. Once you’ve loaded in a chunk, use `keep` to filter to your states of interest, and append the results together. For example:
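```
* Row ranges are illustrative; extend the list to cover every row in the file
foreach rr in "1:100000" "100001:200000" "200001:300000" {
    * varnames(1) takes variable names from row 1 even when the chunk starts later
    import delimited using file.csv, varnames(1) rowrange(`rr') clear
    * create state variable from poi_cbg here
    keep if state == 8
    if "`rr'" != "1:100000" {
        append using compiled_data.dta
    }
    save compiled_data.dta, replace
}
```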
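If typing out the row ranges by hand gets tedious, the same idea can probably be driven by a `forvalues` loop. This is only a rough sketch under a few assumptions that are mine, not from the data documentation: the file name `file.csv` and the chunk size are placeholders, `poi_cbg` is assumed to import as a numeric 12-digit census block group code (so its leading digits are the state FIPS code), and row 1 is assumed to hold the variable names. It counts rows by importing only the first column, which should be much lighter than a full import.

```stata
* Placeholder file name and chunk size (assumptions); adjust both to your data
local csv "file.csv"
local chunk 100000

* Count rows by loading only the first column
import delimited using "`csv'", colrange(1:1) varnames(1) clear
local lastrow = _N + 1      // +1 because row 1 holds the variable names

local first 1
forvalues start = 2(`chunk')`lastrow' {
    local stop = min(`start' + `chunk' - 1, `lastrow')
    import delimited using "`csv'", varnames(1) rowrange(`start':`stop') clear

    * Assumes poi_cbg is numeric: dropping the last 10 digits leaves the state FIPS
    gen byte state = floor(poi_cbg / 1e10)
    keep if state == 8

    * Append everything saved so far, except on the first chunk
    if !`first' {
        append using compiled_data.dta
    }
    save compiled_data.dta, replace
    local first 0
}
```

One caveat: each `import delimited` call likely still has to scan the file from the top to reach its starting row, so later chunks may be slower than earlier ones. If `poi_cbg` comes in as a string in some chunks (for example, because of non-numeric codes), the `gen` line and the `append` would need adjusting.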
Thank you! I have been trying something similar to this, but I am thinking of switching to R. I will share an update if I try anything creative in Stata.