June 27th, 2020

Work period:

June 10 through June 30, 2020.

Overview:

CSV files from 17 different sources/formats are needed for regular inventory and price updates. Each CSV file can have up to 750K products, resulting in up to 1.5M database writes (stock + price). There are ~120 CSV files to be processed.

We process these files after hours, in an 8-hour off-peak window.

Some processes were exceeding the 5GB PHP memory_limit.
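One common way to keep memory flat regardless of file size is to stream the CSV row by row and hand rows on in small batches, rather than loading the whole file into an array. The sketch below is written in Python for brevity (the original jobs ran in PHP), and the column names `sku`, `stock`, and `price`, the helper names, and the sample data are all assumptions for illustration. It is not necessarily the approach used in this project.

```python
import csv
import io
from typing import Iterable, Iterator, TextIO


def iter_products(handle: TextIO) -> Iterator[dict]:
    """Yield one parsed CSV row at a time instead of loading the whole file."""
    reader = csv.DictReader(handle)
    for row in reader:
        yield {
            "sku": row["sku"],
            "stock": int(row["stock"]),
            "price": float(row["price"]),
        }


def batched(rows: Iterable[dict], size: int) -> Iterator[list]:
    """Group rows into batches so at most `size` rows are held in memory."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch


# Hypothetical sample data standing in for a supplier feed.
sample = io.StringIO("sku,stock,price\nA1,5,9.99\nB2,0,4.50\nC3,12,1.25\n")
batch_sizes = [len(b) for b in batched(iter_products(sample), size=2)]
```

With a real feed, `sample` would be replaced by an open file handle, and peak memory would then depend on the batch size rather than on the 750K-row file.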

The processes were run one at a time in a queue. The cumulative time of running all of these processes was 1526 minutes (25.4 hours). We want to run these processes in off hours, which is an 8-hour window, 6 days/week.
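The gap between the serial runtime and the window can be put in numbers. The short calculation below uses only the figures quoted above. The worker count it produces is a lower bound that assumes the work splits perfectly evenly and that parallel jobs do not slow each other down, which is an idealisation.

```python
import math

total_minutes = 1526       # cumulative serial runtime quoted above
window_minutes = 8 * 60    # one 8-hour off-peak window

serial_hours = total_minutes / 60
# Minimum number of parallel workers needed to fit inside a single window,
# assuming perfectly balanced work and no contention (an idealisation).
min_workers = math.ceil(total_minutes / window_minutes)

print(f"{serial_hours:.1f} hours serial, at least {min_workers} workers")
```

In other words, without any per-process speedup, at least four processes would have to run side by side every night just to finish on time.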

Goals:

  • Minimise RAM usage
  • Run processes in parallel
  • Minimise runtime
  • Minimise DB Writes
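As an illustration of the last goal, one way to cut writes is to compare each incoming row with the values already stored and write only the fields that changed. This is a minimal sketch under assumptions: the in-memory `db_state` dict stands in for the real database, and the function and field names are hypothetical. The post does not say that this is the technique that was used.

```python
def changed_fields(current: dict, incoming: dict) -> dict:
    """Return only the fields whose incoming value differs from the stored one."""
    return {k: v for k, v in incoming.items() if current.get(k) != v}


def plan_writes(db_state: dict, feed_rows: list) -> list:
    """Build a list of (sku, field, value) writes, skipping unchanged values."""
    writes = []
    for row in feed_rows:
        sku = row["sku"]
        incoming = {"stock": row["stock"], "price": row["price"]}
        delta = changed_fields(db_state.get(sku, {}), incoming)
        for field, value in delta.items():
            writes.append((sku, field, value))
    return writes


# Hypothetical stored state and feed rows.
db_state = {"A1": {"stock": 5, "price": 9.99}, "B2": {"stock": 3, "price": 4.50}}
feed = [
    {"sku": "A1", "stock": 5, "price": 9.99},   # unchanged: 0 writes
    {"sku": "B2", "stock": 0, "price": 4.50},   # stock changed: 1 write
    {"sku": "C3", "stock": 12, "price": 1.25},  # new product: 2 writes
]
writes = plan_writes(db_state, feed)
```

Here a naive update would issue 6 writes (stock + price for 3 products), while the diff-based plan issues 3.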

Results:

Problem                  Before        After
RAM Usage                > 5GB         250MB*
Runtime (Serial Mode)    25.4 hours    2.5 hours*
# DB Writes              5,707,535     3,565,553

*250MB was a preliminary result, but I later decided to increase RAM usage to optimize speed.
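To put the before/after figures in relative terms, the calculation below derives the speedup and the reduction in database writes from the numbers reported above. Nothing here is new data; the variable names are just for illustration.

```python
writes_before = 5_707_535
writes_after = 3_565_553
runtime_before_h = 25.4
runtime_after_h = 2.5

writes_saved = writes_before - writes_after
writes_reduction_pct = 100 * writes_saved / writes_before
speedup = runtime_before_h / runtime_after_h

print(f"{writes_saved:,} fewer writes ({writes_reduction_pct:.1f}% reduction)")
print(f"Serial runtime is about {speedup:.1f}x faster")
```

That works out to roughly 2.1M fewer writes (about 37.5%) and a serial runtime around ten times shorter, which fits inside a single 8-hour window.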
