Savio's GPU and HTC compute nodes now open to all cluster users

September 16, 2016

Steve Masover

All users of Savio, the campus’s High Performance Computing (HPC) cluster, now have full access to the Graphics Processing Unit (GPU) and High Throughput Computing (HTC) pools of compute nodes on the cluster.

Initially available only to Savio’s Condo contributors and a small number of early adopters, the GPU and HTC compute nodes are now also available to the more than 100 campus faculty members, along with their research groups and collaborators, who are using Savio via the Faculty Computing Allowance program.

GPU nodes offer large numbers of processor cores

Savio’s GPU compute nodes are particularly suitable for research applications that can take advantage of their large number of processor cores. As one example, Professor Hiroshi Nikaido of the Graduate School Division of Biochemistry, Biophysics and Structural Biology, a Condo contributor to the Savio cluster, is engaged in a multidisciplinary collaboration with Physics Professor Attilio Vargiu of the University of Cagliari in Italy. Together they run all-atom molecular dynamics (MD) simulations on Savio’s GPU nodes to model the behavior of atoms and molecules in bacterial membranes.

These simulations are part of Prof. Nikaido's pioneering work on how "efflux pumps" expel antibiotics and other therapeutic drugs from bacteria, work that is shedding light on a key mechanism by which bacteria develop resistance to medication. A goal of this research is to better understand such mechanisms, and to develop techniques to inhibit them, in order to amplify and extend the effectiveness of antibiotics and other medications.

"MD simulations are very costly," Professor Vargiu explains, "and only the use of large computational facilities such as Savio" can support "studies addressing the dynamics behind the interaction of efflux pumps and antibiotics." The molecular dynamics modeled in this research collaboration includes a protein of "more than 3,000 amino acids, embedded in a model membrane and fully hydrated, amounting to about half a million atoms." Vargiu added that "most of the MD codes nowadays are optimized to run on [GPU-equipped] graphics cards,” significantly reducing the ‘time to science’ for these simulations.

HTC nodes support loosely-coupled computation

The HTC compute nodes on Savio are optimized for running loosely-coupled jobs, which have few interdependencies and little need to communicate with one another at runtime. Typical examples include Monte Carlo simulations and other parameter sweep applications. Savio’s HTC nodes are being used extensively by Dr. David Olmsted, who is working with Professors Mark Asta, Daryl Chrzan, Andrew Minor, and John Morris in the Materials Science and Engineering Department to understand the mechanisms by which titanium is embrittled by small amounts of oxygen impurities. Their work combines extensive computer simulations with experimental characterization studies to derive new mechanistic insights. The goal is to develop processing strategies that lower the cost of titanium, opening up the possibility of expanded use of this material in applications spanning transportation, aerospace and beyond.
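To make "loosely coupled" concrete, here is a minimal Python sketch of a Monte Carlo parameter sweep. It is a toy two-level system invented for illustration, not the lattice Monte Carlo code used by Dr. Olmsted's group. The function names, temperatures, seeds, and step count are all assumptions. Each call to run_task depends only on its own arguments, so every (temperature, seed) pair could run as a separate job without communicating with any other.

```python
import math
import random


def run_task(temperature, seed, n_steps=20000):
    """Metropolis sampling of a toy two-level system with energies 0 and 1.

    Returns the fraction of steps spent in the upper level.
    """
    rng = random.Random(seed)  # private generator: no state shared between tasks
    accept_up = math.exp(-1.0 / temperature)  # acceptance probability for 0 -> 1
    state = 0
    steps_in_upper = 0
    for _ in range(n_steps):
        # Flip always when moving down (1 -> 0), and with probability
        # accept_up when moving up (0 -> 1).
        if state == 1 or rng.random() < accept_up:
            state = 1 - state
        steps_in_upper += state
    return steps_in_upper / n_steps


def exact_upper_fraction(temperature):
    """Boltzmann occupation of the upper level, for comparison."""
    return 1.0 / (1.0 + math.exp(1.0 / temperature))


# Hypothetical sweep: three temperatures, three independent seeds each.
tasks = [(t, seed) for t in (0.5, 1.0, 2.0) for seed in range(3)]

# Run serially here; on a cluster each task could be its own job.
results = {task: run_task(*task) for task in tasks}

if __name__ == "__main__":
    for (t, seed), value in sorted(results.items()):
        print(f"T={t} seed={seed} sampled={value:.3f} exact={exact_upper_fraction(t):.3f}")
```

Because the tasks share nothing, the order in which they run does not affect the results, and the sweep scales by adding more independent jobs.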

"An efficient way to study the behavior of the oxygen interstitials is lattice Monte Carlo," Dr. Olmsted explains. "Because of strong elastic effects induced by the presence of oxygen in the titanium structure, the model must account for long-ranged interactions between these impurities. Such a model is not efficiently supported by standard Monte Carlo packages, and it does not lend itself to straight-forward parallelization. Because of this, an internally developed code is being employed, and the HTC queue on Savio has proved ideal for these computations. This computing resource has enabled us to explore short-range order when there is no precipitation, and precipitate shape at a variety of compositions and temperatures in this work."

Accessing the GPU and HTC pools

Savio users can access the GPU and HTC pools of compute nodes by specifying the savio2_gpu or savio2_htc partitions, respectively, when running their jobs on the cluster. When accessing GPU nodes, a gres option is also required, in order to specifically request the use of one or more GPUs. These and other specifics of selecting compute nodes when running jobs are extensively discussed in Savio’s Running Your Jobs documentation.
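As a rough illustration of the options described above, the following Python sketch assembles the text of a batch script. It assumes the Slurm scheduler, whose sbatch options --partition and --gres match the terms used here. The account name fc_example, the time limit, the job name, and the commands are placeholders, and the helper build_batch_script is hypothetical. Savio may require further options that this sketch does not include, so treat the Running Your Jobs documentation as the authority.

```python
def build_batch_script(partition, command, gpus=0,
                       account="fc_example",    # placeholder account name
                       time_limit="00:30:00",   # placeholder wall-clock limit
                       job_name="demo"):
    """Return the text of a Slurm batch script for the given partition."""
    if partition == "savio2_gpu" and gpus < 1:
        # The article states that GPU jobs must also request GPUs via gres.
        raise ValueError("savio2_gpu jobs must request at least one GPU")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --account={account}",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --time={time_limit}",
    ]
    if gpus > 0:
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    lines += ["", command]
    return "\n".join(lines) + "\n"


# Hypothetical commands; replace with your own program invocations.
gpu_script = build_batch_script("savio2_gpu", "./run_md_simulation", gpus=1)
htc_script = build_batch_script("savio2_htc", "python sweep_task.py")

if __name__ == "__main__":
    print(gpu_script)
    print(htc_script)
```

The generated text could be saved to a file and submitted with sbatch, after replacing the placeholders with values valid for your own account.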

If you have questions about how well these specialized pools of compute nodes fit your research computing needs, or about the specifics of running your jobs on these nodes, please feel free to contact Berkeley Research Computing consulting - we’ll be glad to help!