GWS data download issues

Hi Dewey Data Team,

I hope you are doing well! I encountered a technical issue while downloading data from GWS (U.S. Mobile App Engagement (Magnify by GWS)) and would appreciate your help.

Following the instructions on Bulk Data Downloading in Python (API v3), I used the following command to download the data:

ddp.download_files0(apikey_, product_path_, "E:/Dewey")

The code runs successfully, and the data from January 2019 through June 10, 2020, is intact. However, all files after June 10, 2020, are empty (1 KB in size). Using files_df = ddp.get_file_list(apikey_, product_path_, print_info=True), I can see files listed through 2024, none of which should be empty, with a total size of approximately 4 TB. Because of the empty files, I am only able to download about 1 TB.
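
For reference, the affected files can be listed by scanning the download folder for anything at or below 1 KB. This is a minimal sketch using only the Python standard library; the helper name find_small_files and the 1024-byte threshold are assumptions for illustration, not part of the Dewey client.

```python
from pathlib import Path


def find_small_files(folder, max_bytes=1024):
    """Return sorted paths of files under `folder` no larger than `max_bytes`.

    The 1024-byte default is an assumed cutoff for "empty" downloads;
    adjust it if valid files in the product can be that small.
    """
    root = Path(folder)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.stat().st_size <= max_bytes
    )


if __name__ == "__main__":
    # "E:/Dewey" is the destination folder used in the download call above.
    for path in find_small_files("E:/Dewey"):
        print(path)
```

Comparing this list against the file names in files_df should show which files need to be downloaded again.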

I have attempted this multiple times with the same result. Could you please assist me in resolving this issue so I can successfully download all the data?

Thank you so much for your help!

Best regards,
Jack

Hi @fxi,

Sorry for the inconvenience. One potential fix is to try download_files1(). It collects a small page (group) of file links, downloads those files, then moves on to the next page, and so on. This keeps the collected links valid while they are being downloaded, so download_files1 is recommended for a large number of files that may take over 24 hours to download.
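
To illustrate the idea, here is a rough sketch of page-by-page downloading. It is not the actual deweydatapy implementation: fetch_page and download_one are hypothetical callables standing in for "get one page of fresh links" and "download a single file".

```python
def download_in_pages(fetch_page, download_one):
    """Download files one page of links at a time.

    fetch_page(page_number) is assumed to return (links, total_pages);
    download_one(link) is assumed to save a single file.
    Both are placeholders, not functions from the Dewey client.
    """
    page = 1
    downloaded = 0
    while True:
        # Request links only for the current page, right before using them,
        # so they have less time to go stale during a long run.
        links, total_pages = fetch_page(page)
        for link in links:
            download_one(link)
            downloaded += 1
        if page >= total_pages:
            break
        page += 1
    return downloaded
```

The contrast is with collecting every link up front and then downloading for many hours, where links gathered at the start may no longer be valid by the time they are reached.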

This is outlined on GitHub here.

We are still testing this method, but I wanted to bring it to your attention so you can try it.

Let me know if you have any other questions.

Thanks!