SMPs may have cores, but clusters have bandwidth.

In my last two columns (Small HPC and HPC Hopscotch), I talked about multi-core, memory, and HPC programming. The recent release of the AMD six-core Opteron got me thinking about this topic again. It will soon be possible to buy a 12-core workstation (or even a 48-core version!). I recall the days when a 32-processor cluster (16 nodes, each with two single-core processors) was a nice addition to any lab or even a computing center.

I also talked about memory locality and how multi-core has introduced a new hierarchy: near, near-by, and non-local memory. In the past, an MPI programmer really thought about only two kinds of memory: local memory and non-local (distant) memory on another node. Distant memory could be changed only by sending a message to a distant process using MPI. Near-by or SMP memory with a bunch of cores attached represents a different (although not all…

