Strong scaling results
A 128x128x128 element 3D Cartesian convection calculation running on 1-1024 cores of the Magnus supercomputer in Perth, WA.
The graphs above show total work, actual wall time, and parallel efficiency for a pure FE version of the problem compared to one using particle-in-cell (PIC) based integration. The PIC method is more memory intensive, so there is no single-processor result for it. Relative efficiency figures are computed against the 1-processor FE result; for the PIC case we assume the 8-processor efficiency is the same as for FE and scale all PIC values accordingly.
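The rebasing described above can be sketched as follows. This is an illustrative calculation only: the wall times in `t_fe` and `t_pic` are hypothetical placeholders, not the measured Magnus data.

```python
# Strong-scaling efficiency relative to a baseline run:
#   efficiency(p) = (base_time * base_cores) / (time(p) * p)
def strong_scaling_efficiency(timings, base_cores, base_time):
    return {p: (base_time * base_cores) / (t * p) for p, t in timings.items()}

# Hypothetical wall times (seconds) keyed by core count.
t_fe  = {1: 1000.0, 8: 130.0, 64: 18.0, 512: 3.2}
t_pic = {8: 200.0, 64: 28.0, 512: 6.5}   # no 1-core PIC run (memory limited)

# FE: efficiency relative to its own 1-core result.
eff_fe = strong_scaling_efficiency(t_fe, 1, t_fe[1])

# PIC: assume its 8-core efficiency equals FE's 8-core efficiency,
# which amounts to constructing a synthetic 1-core PIC time and
# scaling all PIC values against it.
synthetic_t1 = eff_fe[8] * 8 * t_pic[8]
eff_pic = strong_scaling_efficiency(t_pic, 1, synthetic_t1)
```

By construction, `eff_pic[8]` equals `eff_fe[8]`, and the remaining PIC points are scaled consistently with that assumption.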
- The introduction of the python wrappers has not significantly influenced the parallel behaviour and the results are closely in line with our experience of Underworld1 on similar machines.
- In pure finite element mode the code retains usable efficiency (>40% parallel efficiency) up to 512 cores.
- The particle-in-cell mode has a higher overhead in general, and its efficiency falls off sharply beyond 256 cores.
In interpreting these results, please bear in mind that they are preliminary. We are happy to see that the code runs in parallel to a thousand cores despite having been gutted and reworked: the robust parallel behaviour of Underworld has not regressed! We are also pleased that the Python wrappers have not killed performance, since most of the expensive computation is passed through to the parallel, compiled back end.
Weak scaling results
Coming soon!