Does anyone have any recommendations for dealing with the json object that stores information on visitor home cbgs?

Hi all, does anyone have any recommendations for dealing with the json object that stores information on visitor home cbgs? I’m trying to construct a measure of what percent of visitors in a given county came from another county. I came up with a really hacked solution, but I imagine there’s a faster way to parse through these n=1MM weekly datasets, lol.

Hi @Stan_Oklobdzija_CA_YIMBY are you looking for a quicker way to explode the jsons or do the actual calculation?

More the latter, but I assume to do the calculation, one must explode the jsons?

Yes, I was going to recommend you check out the safegraph_py library, which has both single-core and multicore explode functions that are pretty quick (https://github.com/SafeGraphInc/safegraph_py).
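For anyone who wants to see what "exploding" means before reaching for the library, here is a minimal plain-pandas sketch. It is not the safegraph_py implementation. It assumes the patterns file has a `visitor_home_cbgs` column holding a JSON string that maps home CBG to visitor count; the function name `explode_home_cbgs` and the output column names `home_cbg` and `visitors` are made up for illustration.

```python
import json
import pandas as pd

def explode_home_cbgs(df, json_col="visitor_home_cbgs"):
    """Return one row per (original row, visitor home CBG) pair.

    Hypothetical helper: assumes `json_col` holds JSON strings such as
    '{"060750101001": 4, "060010201002": 6}'.
    """
    rows = []
    for idx, raw in df[json_col].items():
        # Treat missing or empty values as "no home CBGs reported".
        counts = json.loads(raw) if isinstance(raw, str) and raw else {}
        for cbg, n in counts.items():
            rows.append((idx, cbg, int(n)))

    long_df = pd.DataFrame(rows, columns=["_row", "home_cbg", "visitors"])
    # Re-attach the other columns of the original row to each pair.
    other_cols = df.drop(columns=[json_col])
    merged = long_df.merge(other_cols, left_on="_row", right_index=True)
    return merged.drop(columns="_row").reset_index(drop=True)
```

Rows whose JSON is empty drop out of the result, since they contribute no visitor counts.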

If you are looking for a faster solution to the post-explode part, I would love to see your code and see if I can help.

I managed to do so with some pretty hacky R code. But it’s quite slow and I imagine there must be a better way to do it. I’ll take a look at those libraries, thanks!

Oops, I just assumed you were a Python user. Sorry about that. Here is a link to the SafeGraph R package:

https://github.com/SafeGraphInc/SafeGraphR

It’s ok! I can do both (though I’m better at R). No need to apologize!

Looks like the Python library handles exploding the jsons much better.

I don’t have a ton of experience with R, but I feel like Python goes faster sometimes.

Yeah, it looks like there are prepackaged functions to turn the jsons into a pandas DF, but nothing equivalent for R.

Oh wait! Looks like expand_cat_json is what I was looking for!

Just so you know, there are quite a few R users in the community - they would likely be able to help you with any of your R needs

#r-troubleshooting

Oh, I didn’t know about that channel. Thanks!

no problem!

I’ve previously used R code that I modified from @Derek_Ouyang_Stanford to expand the visitor origins json column. This is their website that has lots of code examples: covid19
Inside the “safegraph_normalization_function.R” file is a function called “expandOrigins” that may help you.

Thank you!

Hello @jack_lindsay_kraken1 @Stan_Oklobdzija_CA_YIMBY, I’m interested in your question and wondering if you could share a Python code snippet to measure the percent of visitors to POIs in a given county who came from another county. I have exploded the json string object (using the vertically_explode_json() function) but discovered that once I did so, the dataframe was much larger than before (e.g., ~5,000 rows became ~650,000 rows).
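On the row count: a vertical explode produces one row per POI and home CBG pair, so growing from a few thousand rows to several hundred thousand is expected rather than a sign that something went wrong. The rows get collapsed again when you aggregate. A rough sketch of the county calculation follows, relying on the fact that the first 5 digits of a 12-digit CBG FIPS code are the state and county FIPS. It assumes the exploded dataframe has a `poi_cbg` column plus columns named `home_cbg` and `visitors`; those last two names and the function name are placeholders, so rename them to whatever your explode step actually produced.

```python
import pandas as pd

def pct_from_other_county(long_df):
    """Percent of visitors to POIs in each county whose home CBG is in a different county.

    Assumes one row per (POI, home CBG) pair with columns
    poi_cbg, home_cbg, visitors (the last two names are hypothetical).
    """
    df = long_df.copy()
    # CBG codes are 12 digits; zfill restores a leading zero lost if the
    # column was read as an integer. Codes stored as floats need cleaning first.
    df["poi_county"] = df["poi_cbg"].astype(str).str.zfill(12).str[:5]
    df["home_county"] = df["home_cbg"].astype(str).str.zfill(12).str[:5]

    # Keep the visitor count only where home and POI counties differ.
    is_outside = df["home_county"] != df["poi_county"]
    df["outside_visitors"] = df["visitors"].where(is_outside, 0)

    totals = df.groupby("poi_county")[["visitors", "outside_visitors"]].sum()
    totals["pct_outside"] = 100 * totals["outside_visitors"] / totals["visitors"]
    return totals
```

The result has one row per POI county, so the ~650,000 exploded rows shrink back to at most one row per county. Keep in mind that the percentage only covers visitors whose home CBG appears in the json column.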