October 28th, 2020

isStock. That is the field. For each product distributor, they either have stock or they don't.

So it makes sense that one would store this field uniquely by distributor. And in a perfect world, it just might be.

But this isn't a perfect world and when you create such a field that impacts multiple dependencies on that field working properly it better be awesome.

I didn't put that field there. In all honesty, it was the furthest thing from my intentions of things to look at. That is until I realized that field was being maintained by various out of date sources.

There was a use case I was completely unaware of. As a merchant, I buy from a distributor and get inventory from them. But what happens when they stop buying from that distributor? It was not being flagged in the system and the file on the FTP still existed. So periodically a cron would fetch the file from many months ago and update all their prices. AND THEIR STOCK LEVEL isStock flag.

So fix it right? What's the big deal? I did fix it. I added date checking on the file before using it and updating products only if the file is newer than it was last time. But that wasn't the biggest issue.

We (my client) subscribes to a product attribute database and pay a hefty amount to Etilize for that. For some reason, I cannot get the 64 bit keys to join nearly as fast as the 32 bit. I looked over the MySQL config on Amazon's RDS settings, but everything looks like 64-bit.

As such, we found a bit of an upward bound on the number of products/sec which we store. It is about 30/sec to pull out all the attributes.

I am working on an attribute optimization system, but these attributes need to be filtered and such. Massaged by PHP into the right structure (consistent with the Shopify product structure).

So when you have 500 products and you cache 30/second it can take some seconds to re-cache the category.

But when you do cache the category, page load time is damn fast. As fast as the connection can load upto 500KB. Alternatively, loading a category page at the rate of 30/second is completely unacceptable as it would take a dozen+ seconds.

So we cache these categories for each dealer. The ones they want. And we maintain the locale info and pricing and everything in this cache so we can dump & go to town on the page.

But one other thing we cache beside price and attributes is stock level.

BIG DEAL! Now for every client we onboard, we are growing the size of our already massive bottleneck in category caching. That is, when we update pricing/stock for one single merchant, it impacts ALL MERCHANTS.

If we have 25 clients using the same category we have to recache 25 categories instead of just 1. 1 how? By not storing the isStock field at the distributor level, but instead storing it at the dealer level, we can reduce the problem size by the number of merchants who sell in that category.

Looking back, this solution would have also circumvented the issue with merchants having the wrong stock levels due to files being tapped from a 6 month old FTP file. This issue would have also been contained to a single offence.

Posted In:

ABOUT THE AUTHOR:
Software Developer always striving to be better. Learn from others' mistakes, learn by doing, fail fast, maximize productivity, and really think hard about good defaults. Computer developers have the power to add an entire infinite dimension with a single Int (or maybe BigInt). The least we can do with that power is be creative.