When you mention the term cluster, most people think of racks of servers crunching away in a data center or lab. When you mention grid, most people think of clusters connected by the Internet. Setting “What is a grid?” arguments aside, there is a real distinction between the two high-performance computing (HPC) methodologies. A cluster normally lives in one physical location and has one administrative domain. A grid, on the other hand, is usually built out of separate administrative domains.
In the traditional grid/cluster model, there’s an impedance mismatch of sorts. For a cluster, you want communication to happen as fast as possible. To accomplish this goal, you go around the kernel, doing all of your communication in user space, using some sort of “zero-copy” protocol. On the grid side, you want robustness, standards, and the assurance that what you send over the Internet is what’s received. Furthermore, the cluster/grid connection usually takes place through a special gateway node in the cluster. The node could be a login node or a node that is set up specifically to translate from cluster to Internet (and back). To help with bandwidth, there is sometimes more than one of these nodes, but in general, they represent a bottleneck between “out there” and “in here.”
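To make the gateway bottleneck concrete, here is a rough back-of-the-envelope sketch. It assumes a deliberately simple capacity model in which traffic leaving the cluster is capped by the combined link speed of the gateway nodes. The function name, the node count, and the link speeds are all hypothetical and chosen only for illustration.

```python
def external_throughput(node_demands_gbps, gateway_count, gateway_link_gbps):
    """Aggregate cluster-to-Internet throughput under a simple capacity model.

    Assumption: traffic spreads evenly across gateways and nothing else
    (protocol translation cost, congestion) limits the transfer.
    """
    offered = sum(node_demands_gbps)              # what the compute nodes want to send
    capacity = gateway_count * gateway_link_gbps  # what the gateways can carry
    return min(offered, capacity)


# Hypothetical figures: 64 compute nodes, each wanting 1 Gbps to the outside,
# behind gateways with a 10 Gbps external link each.
demands = [1.0] * 64
one_gateway = external_throughput(demands, gateway_count=1, gateway_link_gbps=10.0)
four_gateways = external_throughput(demands, gateway_count=4, gateway_link_gbps=10.0)
print(one_gateway, four_gateways)
```

Under these assumed numbers, adding gateways raises the ceiling, but the nodes still ask for more than the gateways can carry, which is the sense in which the gateways remain a bottleneck.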
For many, the cluster/grid mismatch is a real problem. While bridging it is not quite as vexing as reconciling quantum mechanics and relativity, these are two distinct domains that must be seamlessly connected before large-scale distributed computing can become a reality.
There is a possible…