Titan is a Cray XK7 system consisting of 18,688 sixteen-core AMD Opteron™ processors providing a peak performance of more than 3.3 petaflops (PF) and 600 terabytes (TB) of memory. A total of 512 service input/output (I/O) nodes provide access to the 10 petabyte (PB) “Spider” Lustre parallel file system at more than 240 gigabytes per second (GB/s). External login nodes (decoupled from the XK7 system) provide a powerful compilation and interactive environment using dual-socket, twelve-core AMD Opteron processors and 256 GB of memory. Each of the 18,688 Titan compute nodes is paired with an NVIDIA Kepler graphics processing unit (GPU) designed to accelerate calculations. With a peak performance of more than 1 TF per Kepler accelerator, the aggregate performance of Titan exceeds 20 PF. Titan is the Department of Energy’s most powerful open science computer system and is available to the international science community through the INCITE program, jointly managed by DOE’s Leadership Computing Facilities at Argonne and Oak Ridge National Laboratories.
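The aggregate figure follows directly from the per-node numbers. A back-of-the-envelope check (a sketch only; the 1.31 TF double-precision peak per Kepler K20X accelerator is an assumption beyond the text’s “more than 1 TF”):

```python
# Sanity check of Titan's aggregate peak performance from the quoted figures.
NODES = 18_688          # Titan compute nodes, each paired with one Kepler GPU
GPU_PEAK_TF = 1.31      # assumed per-accelerator peak (K20X, double precision)
CPU_PEAK_PF = 3.3       # Opteron partition peak from the text

gpu_aggregate_pf = NODES * GPU_PEAK_TF / 1000.0   # TF -> PF
total_pf = gpu_aggregate_pf + CPU_PEAK_PF
print(f"GPU aggregate: {gpu_aggregate_pf:.1f} PF, system total: {total_pf:.1f} PF")
assert total_pf > 20    # consistent with "exceeds 20 PF"
```

Even the GPU partition alone (roughly 24.5 PF under this assumption) clears the 20 PF figure quoted above.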
The Spider disk subsystem will be upgraded in 2013 to provide up to 1 TB/s of disk bandwidth and up to 30 PB of storage.
Gaea consists of a pair of Cray XE6 systems. The smaller partition contains 2,624 16-core AMD Opteron processors (socket G34), providing 41,984 compute cores, 84 TB of double data rate 3 (DDR3) memory, and a peak performance of 386 teraflops (TF). The larger partition contains 4,896 16-core AMD “Interlagos” Opteron processors (socket G34), providing 78,336 compute cores, 156.7 TB of DDR3 memory, and a peak performance of 721 TF.
The aggregate system provides 1.106 PF of computing capability and 248 TB of memory. The Gaea compute partitions are supported by a series of external login nodes and two separate file systems. The FS file system is built from more than 2,000 SAS drives and provides more than 1 PB of formatted capacity as fast scratch space for all compute partitions. The LTFS file system comprises more than 2,000 SATA drives and provides 4 PB of formatted capacity as a staging and archive file system. Gaea is the NOAA climate community’s most powerful computer system and is available to the climate research community through the Department of Commerce/NOAA.
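The core counts and aggregate peak quoted for the two partitions can be checked directly from the per-partition figures (a quick consistency sketch; the small difference from the quoted 1.106 PF aggregate is rounding in the per-partition peaks):

```python
# Consistency check of the Gaea partition figures quoted above.
small_cores = 2_624 * 16    # 16-core processors in the smaller partition
large_cores = 4_896 * 16    # 16-core processors in the larger partition
total_tf = 386 + 721        # per-partition peaks in TF

print(small_cores)          # 41984, matching the text
print(large_cores)          # 78336, matching the text
print(total_tf / 1000)      # ~1.107 PF; the text quotes 1.106 PF (rounding)
```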
The ORNL Institutional Cluster
The ORNL Institutional Cluster (OIC) consists of two phases. The original OIC consists of a bladed architecture from Ciara Technologies called VXRACK. Each VXRACK contains two login nodes, three storage nodes, and 80 compute nodes. Each compute node has dual 3.4 GHz Intel Xeon EM64T processors, 4 GB of memory, and dual Gigabit Ethernet interconnects. Each VXRACK and its associated login and storage nodes are called a block. There are nine blocks of this type. Phase 2 blocks, which are SGI Altix machines, were acquired and brought online in 2008. There are two types of blocks in this family.
Thin nodes (three blocks). Each Altix contains one login node, one storage node, and 28 compute nodes within 14 chassis. Each node has eight cores and 16 GB of memory. The login and storage nodes are XE240 boxes from SGI. The compute nodes are XE310 boxes from SGI.
Fat nodes (two blocks). Each Altix contains one login node, one storage node, and 20 compute nodes in 20 separate chassis. Each node has eight cores and 16 GB of memory. These XE240 nodes from SGI contain larger node-local scratch space and offer much higher I/O bandwidth to that scratch space because the space is a volume spanning four disks.
Frost (SGI Altix ICE 8200) consists of three racks totaling 128 compute nodes, 5 service nodes (1 batch node and 4 login nodes), 2 rack leader nodes, and 1 administration node. Each compute node has two 2.8 GHz quad-core Intel Xeon X5560 (Nehalem) processors, 24 GB of memory, a 1 Gb Ethernet connection, and two 4X DDR InfiniBand connections. Each rack of compute nodes contains eight InfiniBand switches (Mellanox InfiniScale III MT47396, with 24 10 Gb/s 4X InfiniBand ports) that serve as the primary interconnect between compute nodes and as the connection to the Lustre file system. The center-wide Lustre file system is the main storage available to the compute nodes. The Frost cluster is available to ORNL staff and collaborators.
The University of Tennessee
Kraken is a Cray XT5 system consisting of 18,816 six-core AMD Opteron processors providing a peak performance of 1.17 PF and 147 TB of memory. It is connected to more than 3 PB of scratch disk space. Originally deployed in 2010, it remains one of the fastest academic computers in the world and a significant resource on the NSF XSEDE network.
Joint Institute for Computational Sciences
The University of Tennessee (UT) and Oak Ridge National Laboratory (ORNL) established the Joint Institute for Computational Sciences (JICS) in 1991 to encourage and facilitate the use of high-performance computing in the state of Tennessee. When UT joined Battelle Memorial Institute in April 2000 to manage ORNL for the Department of Energy (DOE), the vision for JICS expanded to encompass becoming a world-class center for research, education, and training in computational science and engineering. JICS advances scientific discovery and state-of-the-art engineering by
taking full advantage of the computers at the petascale and beyond housed at ORNL and in the Oak Ridge Leadership Computing Facility (OLCF) and
enhancing knowledge of computational modeling and simulation through educating a new generation of scientists and engineers well versed in the application of computational modeling and simulation to solving the world’s most challenging scientific and engineering problems.
JICS is staffed by joint faculty who hold dual appointments as faculty members in departments at UT and as staff members in ORNL research groups. The institute also employs professional research staff, postdoctoral fellows and students, and administrative staff.
The JICS facility represents a $10M investment by the state of Tennessee and features a state-of-the-art interactive distance learning center with seating for 66 people, conference rooms, informal and open meeting space, executive offices for distinguished scientists and directors, and incubator suites for students and visiting staff.
The JICS facility is a hub of computational and engineering interactions. Joint faculty, postdocs, students, and research staff share the building, which is designed specifically to provide intellectual and practical stimulation. The auditorium serves as the venue for invited lectures and seminars by representatives from academia, industry, and other laboratories, and the open lobby doubles as casual meeting space and the site for informal presentations and poster sessions, including an annual 200+ student poster session.
In June 2004, JICS moved into a new 52,000 ft² building next door to the OLCF. The OLCF, which is located on the ORNL campus, is among the nation’s most modern facilities for scientific computing. The OLCF includes 40,000 square feet divided equally into two rooms designed specifically for high-end computing systems.
Within JICS, there are three other centers that are the result of three large National Science Foundation (NSF) awards:
The National Institute for Computational Sciences (NICS) at the University of Tennessee is the product of a $65M NSF Track 2B award. The mission of NICS is to enable the scientific discoveries of researchers nationwide by providing leading-edge computational resources along with education, outreach, and training for underrepresented groups. Kraken, one of the world’s most powerful supercomputers dedicated to academic use, is the flagship NICS computing resource.
The UT Center for Remote Data Analysis and Visualization (RDAV) is sponsored by NSF through a 4-year, $10 million TeraGrid XD award. The centerpiece hardware resource at RDAV is Nautilus, a new SGI UltraViolet shared-memory machine featuring 1,024 cores and 4 terabytes of memory within a single system image. A wide range of software tools is available for TeraGrid users to perform data analysis, visualization, and scientific workflow automation on Nautilus. The machine is located at ORNL and is administered by NICS staff.
The Keeneland Project is a 5-year, $12 million Track 2 grant awarded by NSF for the deployment of an experimental high-performance system. The Georgia Institute of Technology and its project partners, UT-Knoxville and ORNL, have initially acquired and deployed a small, experimental, high-performance computing system consisting of an HP system with NVIDIA Tesla accelerators attached. The machine is located at ORNL and is administered by NICS staff.