Initial Tuning Binary

We rewrite the customer binary into a form that makes testing and tuning efficient. Every parallel loop is a tuning candidate and is written to the binary in both serial and parallel form, with an environment variable acting as the switch between the two.
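As a rough illustration of the serial/parallel switch, the sketch below keeps both forms of one loop and picks between them at run time. It is a stand-in written in Python rather than rewritten binary code, and the variable name TUNE_LOOP_<id>, the function names, and the thread-pool implementation are all assumptions made for the example, not details of the actual tool.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def loop_body(i):
    # Stand-in for the work done by one loop iteration.
    return i * i


def run_loop_serial(n):
    # Serial form of the loop.
    return [loop_body(i) for i in range(n)]


def run_loop_parallel(n, workers=4):
    # Parallel form of the same loop; pool.map preserves iteration order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(loop_body, range(n)))


def run_loop(loop_id, n):
    # Hypothetical convention: TUNE_LOOP_<id>=1 selects the parallel form,
    # anything else (or an unset variable) selects the serial form.
    if os.environ.get(f"TUNE_LOOP_{loop_id}", "0") == "1":
        return run_loop_parallel(n)
    return run_loop_serial(n)
```

Because the choice is read from the environment at run time, a tuning driver can compare both forms of any loop without regenerating the binary.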

The tuning pass helps select which of the competing parallelization alternatives should be used to generate the final production binary. Loops that have no nesting relationship can all be tuned simultaneously, but loops nested within one another require an iterative tuning methodology. Simply parallelizing the outermost loop is not always best; sometimes the best performance is achieved only with nested parallelism. The tuning procedure must therefore try various combinations of parallelism at different levels until it converges on the best-performing configuration.
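One possible shape for the iterative search over a loop nest is sketched below. It assumes a simple greedy strategy that flips one loop's serial/parallel setting at a time and keeps any change that lowers the measured time; the text does not say which search strategy is actually used, and a greedy search can settle on a local rather than global optimum. The names tune_nest and synthetic_measure and the timing table are hypothetical; a real measure function would set the switches and time a run of the tuning binary.

```python
def tune_nest(loop_ids, measure):
    """Greedy search over one loop nest.

    loop_ids: identifiers of the nested loops, outermost first.
    measure:  callable taking {loop_id: bool} (True = parallel) and
              returning a run time; lower is better.
    Returns the final configuration and its measured time.
    """
    config = {lid: False for lid in loop_ids}  # start with every loop serial
    best = measure(config)
    improved = True
    while improved:  # repeat passes until no single flip helps
        improved = False
        for lid in loop_ids:
            trial = dict(config)
            trial[lid] = not trial[lid]  # flip this loop only
            elapsed = measure(trial)
            if elapsed < best:
                config, best = trial, elapsed
                improved = True
    return config, best


# Made-up timings (seconds) for a two-level nest, keyed by
# (outer parallel?, inner parallel?). Illustrative values only.
SYNTHETIC_TIMES = {
    (False, False): 10.0,
    (True, False): 4.0,
    (False, True): 6.0,
    (True, True): 3.0,
}


def synthetic_measure(config):
    # Placeholder for running and timing the tuning binary.
    return SYNTHETIC_TIMES[(config["outer"], config["inner"])]


best_config, best_time = tune_nest(["outer", "inner"], synthetic_measure)
```

With these made-up timings the search ends with both levels parallel, the case where parallelizing only the outermost loop is not the best choice.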

© Copyright 2009, All Rights Reserved