IEEE - Institute of Electrical and Electronics Engineers, Inc. - JOSHUA: Symmetric Active/Active Replication for Highly Available HPC Job and Resource Management

2006 IEEE International Conference on Cluster Computing

Author(s): K. Uhlemann ; C. Engelmann ; S.L. Scott
Publisher: IEEE - Institute of Electrical and Electronics Engineers, Inc.
Publication Date: 1 September 2006
Conference Location: Barcelona, Spain
Conference Date: 25 September 2006
Page(s): 1 - 10
ISBN (CD): 1-4244-0328-6
ISBN (Paper): 1-4244-0327-8
ISSN (Paper): 1552-5244
DOI: 10.1109/CLUSTR.2006.311855

Most of today's HPC systems employ a single head node for control, which represents a single point of failure as it interrupts an entire HPC system upon failure. Furthermore, it is also a single...