When random isn’t random, you don’t get realistic results

A few days ago, one of our customers ran a test comparing how well Load DynamiX Enterprise, Iometer, and Vdbench create random data. They wanted to determine which test solution would produce the most realistic workload models for use in Flash array performance evaluations.

In production workloads, random data typically contains very few repeating values – hence the term “random.” To build a realistic synthetic workload that simulates random write data, your data creation algorithm must produce truly random data. In this customer example, both Iometer and Vdbench failed miserably at creating random data. Their conclusion: results of tests on Flash systems using Vdbench or Iometer will be wildly inaccurate if the application being modeled consists of a good deal of random data, which is the case with just about all database and virtualized applications.


The customer used Iometer, Vdbench and Load DynamiX to create random data, then tried to compress that data using 7-Zip with the LZMA algorithm. Compression works only when the data contains repeating patterns, so the more a supposedly random data set shrinks, the less random it actually is. For Iometer and Vdbench, there was an overabundance of repeating data. Load DynamiX produced completely random data. If you test Flash systems with Iometer or Vdbench, and your workload contains random data, good luck with making valid analyses. You need to be using Load DynamiX!
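
The compression check described above can be roughed out in a few lines. This is only a sketch of the idea, not the customer's actual procedure: it assumes Python's built-in `lzma` module in place of 7-Zip, uses `os.urandom` as a stand-in for a well-randomized payload, and uses a single 4 KiB block repeated over and over as a stand-in for a generator that recycles data. The names `compression_ratio`, `random_payload`, and `repeating_payload` are made up for illustration and do not come from any of the tools mentioned here.

```python
import lzma
import os


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (near 1.0 means incompressible)."""
    return len(lzma.compress(data)) / len(data)


SIZE = 1024 * 1024  # 1 MiB sample; an arbitrary size chosen for illustration

# Stand-in for a well-randomized write payload: bytes from the OS entropy source.
random_payload = os.urandom(SIZE)

# Stand-in for a weak generator (assumption): one 4 KiB block repeated to fill the buffer.
block = os.urandom(4096)
repeating_payload = block * (SIZE // len(block))

if __name__ == "__main__":
    print(f"random payload ratio:    {compression_ratio(random_payload):.3f}")
    print(f"repeating payload ratio: {compression_ratio(repeating_payload):.3f}")
```

A ratio close to 1.0 means LZMA found almost nothing to squeeze out, which is what you would expect from genuinely random data. A ratio far below 1.0 means the payload is full of repeats. To try this on real tool output, you would read a captured data file into `bytes` and pass it to the same function.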