Today's fastest supercomputers are assembled from large arrays of servers coupled by a high-performance InfiniBand interconnect. These systems have been confined to a single-subnet architecture. To scale to next-generation systems, they need to break out of that confinement.
Obsidian, a developer of InfiniBand products featuring range extension, routing and encryption, is trying to remove this limitation through its BGFC software.
The company announced a collaboration with the NASA Advanced Supercomputing (NAS) Division at NASA's Ames Research Center, Moffett Field, Calif., and the Hyperion Project at the Department of Energy's Lawrence Livermore National Laboratory (LLNL) to evaluate advanced software engineered for networks with multiple subnets in complex topologies.
David Southwell, Obsidian's chief visionary officer, said, “Multiple subnet architectures provide better scaling, faster initialization, fault isolation and, very importantly, the ability to support very large-scale distributed heterogeneous infrastructure.”
The Subnet Manager, the software responsible for pre-calculating traffic paths within a subnet, is currently limited to dealing with simple, regular topologies (such as Clos networks or hypercubes), Southwell added.
By enabling multiple subnets, BGFC allows larger systems to be built by joining subnets with different topologies into a single system.
“For example, a multi-subnet supercomputer, storage arrays, visualization systems and many smaller clusters could be combined into a single, optimally routed complex fabric, spanning a campus or even a Wide Area Network,” according to Southwell.
“Without routing, we are limited in how we build and reliably operate large-scale InfiniBand-based networks,” said Bob Ciotti, systems lead and chief system architect in the NAS Division at Ames. According to Ciotti, a more sophisticated approach is required not only for routing but also for managing single subnets.
LLNL is keen to use large-scale multi-subnet InfiniBand environments in its data centers, said Matt Leininger, manager of the Hyperion Cluster at LLNL, in a statement. “We are especially excited about the possibilities Obsidian is bringing forward to the open-source community with BGFC.”
The Hyperion project brings together LLNL and 10 industry leaders to accelerate the development of next-generation Linux clusters. Hyperion serves as a testbed for HPC technologies critical to Livermore’s national security missions and industry’s ability to make petaFLOP/s computing and storage available to U.S. industry.
Obsidian has also announced a public demonstration of BGFC at the Supercomputing 2011 conference in Seattle this November.
Back in May, Obsidian Strategics announced that a recently completed study by the Data Intensive Compute Environment (DICE) Alliance found the Obsidian Longbow router to be the fastest transport mechanism evaluated.
Rajani Baburajan is a contributing editor for TMCnet.
Edited by Stefanie Mosca