I would like to use the arrow package to work with csv.gz files, but I am getting the following error. I downloaded the SafeGraph monthly patterns for 2020. My directory has a folder for 2020 containing 12 subfolders, one per month, for example: data/2020/1. Arrow opens the database with all the files but…

I hit the same issue. It seems that long fields (e.g. geometries) need more block space than the default settings allow. You can adjust the block_size parameter inline: dat <- open_dataset(data_path, format='csv', partitioning = c('month'), block_size=1e9). A value of 1e9 did the trick for me, though for good pr…
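For readability, here is the same call laid out as a short script. This is only a sketch: the data_path value is an assumption based on the data/2020/<month> layout described in the question, and block_size_bytes is a name I made up for illustration. The block_size value is the one reported above, not a tuned recommendation.

```r
library(arrow)

# Assumed location: the folder that holds one subfolder per month (1 to 12),
# following the data/2020/1 example in the question. Adjust to your own path.
data_path <- "data/2020"

# Hypothetical variable name; 1e9 is the value reported to work above.
block_size_bytes <- 1e9

# Only attempt to open the dataset if the folder is actually there.
if (dir.exists(data_path)) {
  dat <- open_dataset(
    data_path,
    format = "csv",
    partitioning = c("month"),     # subfolder names become a "month" column
    block_size = block_size_bytes  # larger CSV read block for very long fields
  )
  print(dat)  # shows the dataset summary and schema without loading the rows
}
```

If the error persists, it may be worth trying a different block_size, since the size needed depends on how long the longest rows in the files are.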

Error with Arrow package to read in multiple SafeGraph csv.gz files


evan-barry-dewey (Evan Barry) March 22, 2023, 3:31pm 2

@Christian_Gunning_University_of_Georgia put together this great tutorial on processing SafeGraph in R with Arrow. Hope this helps!