Hello, when I imported SafeGraph data onto my Mac and opened the CSV file, the app said it cannot import more than 1 million rows (I believe). Does that mean my data will be incomplete when I read it in R or Python? (Or is it not a big deal?)
Hi @Jifan_He, would you mind sharing a screenshot of what you are seeing, as well as any information about the program you are using to write Python or R?
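In the meantime: a warning like that usually comes from the spreadsheet app rather than the file itself (Numbers and Excel each cap a sheet at roughly a million rows), so R or Python should still see every row. Here is a minimal sketch for checking the row count yourself, assuming the file is a plain, uncompressed CSV with a header row. The file name `patterns.csv` is hypothetical, so swap in your own path.

```python
import csv

def count_csv_rows(path):
    """Count data rows (excluding the header) without loading the whole file into memory."""
    # newline="" lets the csv module handle quoted fields that contain line breaks
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row, if the file has one
        return sum(1 for _ in reader)

# Example usage (hypothetical file name):
# print(count_csv_rows("patterns.csv"))
```

If the count is well above a million, the spreadsheet app was only showing you part of the file, and the CSV on disk is still complete.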
Oh, thank you so much. I just figured it out.
Yes @Jifan_He, the backfill data is historical data that is generated every 6-12 months!
This is from the FAQ on the SafeGraph site:
> Please note that the underlying Places data used to create Patterns changes over time due to the history of how we built and updated the product. Below is a chronological breakdown of the Places release used to backfill Patterns for a given time period:
> Historical Patterns activity from October 2016 through and including December 2016 was generated using the April 2019 release of Places. We no longer externally provide this data.
> Patterns provided/delivered between November 2019 and April 2020:
> --Activity from January 2017 through and including October 2019 was generated using the November 2019 release of Places.
> --Activity from November 2019 through and including April 2020 was based on the Places release of the same month as the activity (so December 2019 activity will use the December 2019 Places release).
> Patterns provided/delivered between May 2020 and November 2020:
> --Activity from January 2018 through and including May 2020 was generated using the May 2020 release of Places.
> --Activity from May 2020 thru and including November 2020 is based on the Places release of the same month as the activity (so June 2020 activity will use the June 2020 Places release).
> Patterns provided/delivered December 2020 onward:
> --Activity from January 2018 through and including December 2020 was generated using the Dec 2020 release of Places. This is the first historical delivery that considers point-in-time POI openings/closures. For example, if a POI opened in January 2019, we will not attribute visits to the POI from January 2018 - December 2018 and will only attribute visits from January 2019 onward. On the other hand, if a POI closed in January 2019, we will only attribute visits from January 2018 - December 2018 and will not attribute visits from January 2019 - present.
> We are relying on the metadata provided by our closed_on, opened_on, tracking_closed_since, and tracking_opened_since columns to make these determinations. If we do not have open/close information for a POI, we will treat the POI as “open” for the duration of the backfill. See here for more about how we determine POI openings/closings.
> --Activity from January 2021 onwards will be based on the Places release of the same month as the activity (so January 2021 activity will use the January 2021 Places release).
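To make the point-in-time rule in the quote a bit more concrete, here is a rough sketch of how I read it. This is only an illustration, not SafeGraph's actual implementation. It assumes `opened_on` and `closed_on` are "YYYY-MM" strings (or `None` when unknown), it ignores the `tracking_*` columns, and the function name `visits_attributed` is made up for this example.

```python
from typing import Optional

def visits_attributed(month: str,
                      opened_on: Optional[str] = None,
                      closed_on: Optional[str] = None) -> bool:
    """Return True if visits in `month` would be attributed to the POI.

    All dates are assumed to be "YYYY-MM" strings, which sort
    chronologically when compared as plain strings.
    """
    # Before the opening month: no visits attributed
    if opened_on is not None and month < opened_on:
        return False
    # From the closing month onward: no visits attributed
    if closed_on is not None and month >= closed_on:
        return False
    # Missing open/close information means the POI is treated as open
    return True

# Mirrors the FAQ example of a POI that opened in January 2019
print(visits_attributed("2018-12", opened_on="2019-01"))  # False
print(visits_attributed("2019-01", opened_on="2019-01"))  # True
```
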
Does that help? If you don’t mind me asking, what are you looking to do with the data?
Yeah, your response is very helpful!
I am just collecting different data sources (the Patterns data has a variable that shows the number of visitors to a business, so I think it might be helpful in my research …)