Mathematical models or load generation for predictive storage performance analysis?

We attended the 451 Group’s Storage Summit in NYC this week, and someone asked me how Load DynamiX modeling compares to mathematical modeling for storage performance testing and, in particular, for predictive analysis. I offered a couple of observations (actually, I probably droned on far too long). But I thought more about it while flying back to the office.

Where Load DynamiX and mathematical modeling are similar is in the building of an I/O profile model. We both get our input from similar sources and look for similar metrics. But that’s where the similarity ends. In the mathematical modeling approach, a software program attempts to model the storage environment as well as the application I/O. The upside is that you don’t need a storage array. But sadly, today’s best mathematical models are not realistic enough to account for the myriad performance bottlenecks that happen in the real world. There are dozens of reasons why this is true, and I’ll name just one: the storage array vendors themselves have hundreds of proprietary internal interfaces, APIs, cache algorithms, buffer-handling schemes, dedupe and compression techniques, and other flow controls. No outside vendor can know how these will interact well enough to predict the performance of complex workloads. Add to that the effect of switches, multi-pathing, failovers, etc., and there’s no hope of simulating real-world performance. Having said all that, these mathematical models are better than pure guesswork and, in some simple or lucky cases, may even be somewhat useful. But note that even the storage vendors themselves, who have access to tons of data, recognize the limitations and don’t pretend that they can build an accurate mathematical prediction model.
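To make that concrete, here is a minimal sketch of what a very simple mathematical model might look like. It uses the textbook single-queue (M/M/1) formula, where predicted latency is the mean service time divided by one minus utilization. This is not any vendor's actual model and not how Load DynamiX works; the `IOProfile` fields, the `predicted_latency_ms` function and the service times are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class IOProfile:
    """Hypothetical I/O profile: the kind of input both approaches start from."""
    iops: float              # offered load, I/O operations per second
    read_fraction: float     # share of operations that are reads (0.0 to 1.0)
    read_service_ms: float   # assumed mean time to service one read
    write_service_ms: float  # assumed mean time to service one write


def predicted_latency_ms(profile: IOProfile) -> float:
    """Mean response time from an M/M/1 queue: S / (1 - utilization)."""
    # Blend read and write service times according to the read/write mix.
    service_ms = (profile.read_fraction * profile.read_service_ms
                  + (1.0 - profile.read_fraction) * profile.write_service_ms)
    # Utilization = arrival rate (per ms) * mean service time (ms).
    utilization = profile.iops * service_ms / 1000.0
    if utilization >= 1.0:
        # The model says the queue grows without bound at this load.
        return float("inf")
    return service_ms / (1.0 - utilization)
```

Notice what this sketch leaves out: caching, dedupe, compression, multi-pathing, switches and failovers appear nowhere in the formula. It treats the whole array as one queue with a fixed service time, which is exactly the kind of simplification that keeps models like this from matching real-world behavior.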

On the other side, Load DynamiX builds a workload generator that enables you to stress your actual arrays. Of course, the only way to be 100% accurate is to run your actual workload against those storage networks in a production environment and add enough load to actually break the arrays. But that’s not practical, so it’s not the best choice.

The best, most proven approach is to use a product like Load DynamiX, which closely emulates the workload and allows you to run the load in a real pre-production environment. To the storage network, Load DynamiX load generation appliances look like your actual application load. And we have customers who have shown that our generated load is nearly identical to their actual production workloads, accurate enough to base multi-million dollar storage decisions on.

Jim Bahn
Director of Product Marketing