November 9th, 2020

I do hold the philosophy that the best cache is no cache. But sometimes, caching is the exact right tool for the job.

When it comes to user experience, one metric I hold in the highest regard is page load speed.

I want my pages to load in the least amount of time possible and so from time to time, this requires attention.

A database we licence has a 64-bit join on a BIGINT and despite the table being indexed properly, I cannot seem to get this join to perform nearly as fast as the 32-bit joins on INT fields.

What is necessary for the page to provide a great user experience is a list of products along with their respective attributes.

The attributes can be loaded at the rate of ~30 products per second with ~100 attributes/product.

A page may load 500 products which would result in significant unsustainable server load with a page load time of ~17 seconds. No good.

By caching these attributes on a product/product basis, we can lump all these attributes into a single json field per product.

And since we are caching we may as well cache as far as we can to provide the best user experience to the customer.

Each of our clients weighs attribute importance differently and elect to neglect some while making other more prominent. There are also various prices a customer may have access to. The products themselves may have price and/or inventory changes which means the list of products on display can change regularly as well. Finally, there may be multiple languages and so these attributes need to be cached for each supported language as well.

Caching all of this had several optimizing iterations. Understanding that pulling the attributes out for each product on the BIGINT join was what created the majority of the load was key to building the caching system out in such a way that the focus could be spend on minimizing those operations and reusing existing cache as much as possible.

The products themselves do gain new or edited attributes over time. So determining when a product needs to update it's cache is important. Making these attributes available to all clients is not urgent, but should be done in a matter of days.

The most important aspect is to keep prices and inventory up-to-date as this can cause terrible user experiences otherwise.

So basically, we cache the product attributes whenever they are updated and we re-cache the clients category whenever there is pricing/inventory changes on any single product in the category.

The result is about ~500K ajax call which returns a massive JSON array which contains all attributes the products.

There currently exists further optimization whereby I should be able to reduce the size of the data returned by what I expect would be 50% in filtering down the attributes to only those which the dealer holds in high regards. I should be addressing this in the next couple months.

Posted In:

ABOUT THE AUTHOR:
Software Developer always striving to be better. Learn from others' mistakes, learn by doing, fail fast, maximize productivity, and really think hard about good defaults. Computer developers have the power to add an entire infinite dimension with a single Int (or maybe BigInt). The least we can do with that power is be creative.