Duplicate files in SafeGraph Spend Patterns

Hi, we recently purchased the Spend Patterns data, but I found three files named spend_patterns.csv.gz in every month’s folder. What is the difference between them?

The ‘ss’ version was for a sample dataset that was previously added automatically to everyone’s account. I believe it was just the Ohio dataset. ‘SP’ should be the product you ordered.

This will no longer be an issue in the new version of the product we’re launching soon.


Thanks for your reply. Another question: how is the md5sum of a file calculated? I used the following code to check the md5sum of my downloaded files and found that they do not match the md5sum values listed online:

import hashlib
import os
if os.path.exists(path):
            fmd5 = hashlib.md5(open(path, "rb").read()).hexdigest()
            if fmd5 == row["md5sum"]: