F.R.Pearce@durham.ac.uk, Durham University
Brief Description of Application:
Number of Lines of Code: 4000
Target Platforms and HPF Compilers Used:
Coding Styles (data decompositions, computational methods):
Extrinsic Interfaces Used (and reasons):
Performance Information, if Available (including any possible comparisons to MPI and/or OpenMP):
In table 3 I show the scalability of the HPF code. Things look good; the same routines are tricky as in the Craft code. Overall the 16->64 PE scaling is 1.59, as opposed to 1.72 for the Craft code. The problem area is the list sort. The refine time drops out of the scaling as it is typically only done every 10 steps. (I.e. the times for "refine" and "refinements", which are significant in the times below for just one iteration, end up being completely dominated by "mesh" and "shgrav" in a typical production run - DSM).

Table 3: HPF code scalability

PE's              4       8      16      32      64
mesh           32.4    17.8     9.0     6.7     4.1
list            7.3     9.2     9.7    13.3    10.7
refine         12.1    13.4    14.4    18.1    18.9
shgrav         71.9    39.6    21.9    14.2     6.5
clist           2.4     2.9     3.1     4.0     3.4
refinements    35.7    22.9    14.9    12.1    11.8
Total         161.2   106.1    74.5    68.6    55.6

Finally, timing the shmem+gas code on the same position yields table 5. This is approximately twice as long as the HPF code with refinements at the same point (see table 3).

Table 5: shmem code, 16 PE's, T3E-900, No refinements

mesh      0.95
shgrav    141.9
Total     153.1 (143.2 after 1st step)
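The quoted 16->64 PE scaling of 1.59 is not the plain ratio of the two "Total" entries in table 3, which is about 1.34. The sketch below shows one way the figure can be reproduced from the published numbers: it assumes that "refine" is charged at one tenth of its single-step cost, in line with the remark that it is typically done only every 10 steps. That weighting is an assumption made here for illustration, not a formula stated by the author, and the function and variable names are hypothetical. The same snippet checks the "approximately twice as long" comparison with table 5.

```python
# Figures copied from table 3 (HPF code) and table 5 (shmem code).
HPF = {
    16: {"refine": 14.4, "total": 74.5},
    64: {"refine": 18.9, "total": 55.6},
}
SHMEM_TOTAL_16 = 153.1  # table 5, 16 PEs, no refinements


def amortised_total(pes, refine_interval=10):
    """Total with 'refine' charged once every `refine_interval` steps.

    ASSUMPTION: this weighting is a guess at how the quoted scaling was
    derived; the source only says refine is done about every 10 steps.
    """
    row = HPF[pes]
    return row["total"] - row["refine"] * (1.0 - 1.0 / refine_interval)


def scaling(t_fewer_pes, t_more_pes):
    """Speed-up factor obtained by moving to the larger PE count."""
    return t_fewer_pes / t_more_pes


raw_scaling = scaling(HPF[16]["total"], HPF[64]["total"])
amortised_scaling = scaling(amortised_total(16), amortised_total(64))
shmem_vs_hpf = SHMEM_TOTAL_16 / HPF[16]["total"]

if __name__ == "__main__":
    print(f"raw 16->64 scaling:       {raw_scaling:.2f}")        # 1.34
    print(f"amortised 16->64 scaling: {amortised_scaling:.2f}")  # 1.59
    print(f"shmem / HPF at 16 PEs:    {shmem_vs_hpf:.2f}")       # 2.06
```

With refine weighted this way the ratio comes out at 1.59, matching the quoted figure, and the shmem total at 16 PEs is about 2.06 times the HPF total, consistent with "approximately twice as long".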
Please comment on any aspects of the application that might be interesting, including any problems using HPF effectively:
URL
©2000-2006 Rice University