Forum Discussion

tmarshall
Advisor
9 years ago

Bulk Data - Looping Delay Needed?

Hi NaokiT,

We are running the automated script now, executing it every 3 or 4 minutes to catch up from 1/1/2015 to the present. For the first six months the files are very small and there are no errors, but starting in July 2015 our community began generating more traffic, users, and boards (more content), and the bulk data extract files grew larger. We have noticed that when the loop executes the command, in some cases the file is not created successfully, yet when we run the same command manually it works fine and the file is created with no problem.

Is there a minimum expected delay when running the curl script in a loop to back-fill data? Three or four minutes does not seem to be enough, as we are getting intermittent failures in which no file is created for a given 24-hour period. Can you please help us understand the best timing to use so we do not have to babysit the script while back-loading all the data? Appreciate your help!
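
For reference, here is a minimal sketch of the kind of loop we are running. The endpoint URL, credentials, and file names below are placeholders, not our actual script:

#!/bin/bash
# Minimal sketch of our fixed-delay back-fill loop (placeholder URL and
# credentials; GNU date is assumed for the date arithmetic).
day="2015-01-01"
today=$(date +%F)
while [ "$day" != "$today" ]; do
    # request one 24-hour extract file
    curl -s -u "$USER:$PASS" \
        "https://community.example.com/api/bulkdata?fromDate=${day}&toDate=${day}" \
        -o "bulkdata_${day}.csv"
    day=$(date -d "$day + 1 day" +%F)   # advance to the next 24-hour window
    sleep 240                           # fixed 4-minute delay between requests
done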

 

Thanks,

Tim

2 Replies

  • NaokiT
    Lithium Alumni (Retired)
    9 years ago

    Hi tmarshall,

     

    Unfortunately, it's hard to predict how long a 24-hour file transmission will take to complete, as multiple factors are involved: file size, network latency, cluster response time, and so on. The only reliable way to prevent these failures is to run a staggered query that waits until the previous transmission finishes before firing the next 24-hour window. If you prefer the time-based approach, unfortunately I can't give you a delay that will definitely work, so you'll have to increase it and experiment until the errors stop.
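
    As a rough illustration only, here is a minimal sketch of that staggered approach in bash. The endpoint URL, credentials, and parameters are placeholders, not the actual bulk data API details:

    #!/bin/bash
    # Sketch of the staggered approach: fire the next 24-hour window only
    # after the previous transfer finishes (placeholder URL and credentials;
    # GNU date assumed).
    day="2015-01-01"
    today=$(date +%F)
    while [ "$day" != "$today" ]; do
        # curl blocks until the transfer completes; -f turns HTTP errors
        # into a non-zero exit status
        if curl -sf -u "$USER:$PASS" \
                "https://community.example.com/api/bulkdata?fromDate=${day}&toDate=${day}" \
                -o "bulkdata_${day}.csv"; then
            day=$(date -d "$day + 1 day" +%F)   # advance only on success
        else
            echo "window ${day} failed; retrying in 60s" >&2
            sleep 60                            # back off, then retry the same window
        fi
    done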

     

    Regards,

    Naoki

  • tmarshall
    Advisor
    9 years ago

    NaokiT Thanks for the feedback! Our Informatica BI expert was able to babysit the script, and we got all the data into the database, so we are now caught up to the present. Thanks for your help, and we look forward to continued improvements to the bulk data API!

     

    Thanks,

    Tim