Flexible compute on BRC infrastructure: experiments underway

April 18, 2016

The Berkeley Research Computing (BRC) team is actively investigating ways to enable computation on the shared Savio cluster using researcher-defined environments and interactive “notebooks” powered by Jupyter.

While HPC systems are powerful, the architecture of their shared resources imposes constraints on the computing environment, and for many researchers those constraints preclude the use of familiar development tools and computation workflows. To better support such tools and workflows while making use of subsidized Savio resources, three technologies were selected for experimentation from a broader set of possibilities. The experiments will assess both the utility of these technologies to campus researchers and the feasibility of deploying them within the Savio architecture. If successful, these experiments will shape new services that leverage campus investment in HPC resources while directly addressing use cases that are more commonly run on cloud resources or on large virtualized workstations (Analytics Environments on Demand).

The selected technologies are Jupyter notebooks, Shifter, and Singularity.

Jupyter notebooks are powerful tools that allow researchers to bundle documentation, live code, visualizations, and research workflows in a form that can be published and shared. Two experiments in this technology space are underway. First, Yong Qin, of BRC’s LBNL-based HPC team, is leading an experiment that aims to create a multi-user environment allowing current BRC users to integrate their use of Jupyter notebooks with the Savio HPC platform, so that notebooks can spawn heavyweight jobs onto Savio’s compute nodes. Second, the NSF-funded Pacific Research Platform has provided Berkeley with a JupyterHub server configured to spawn jobs into the Comet HPC cluster at the San Diego Supercomputer Center (SDSC). Research IT is working with the Berkeley Institute for Data Science (BIDS) to explore how Berkeley researchers can take advantage of this resource.

Shifter is a project under development at NERSC, the National Energy Research Scientific Computing Center. This technology facilitates deployment of environments defined as Docker containers into a shared HPC cluster. Michael Jennings, a senior HPC Engineer and Savio’s principal system administrator, is working with NERSC colleagues to explore the feasibility of running Shifter on the Savio cluster. If this experiment succeeds, campus researchers can look forward to leveraging Savio’s capacity to run “Dockerized” research computation workflows.

Singularity is a technology under development by a team of open-source software engineers led by Greg Kurtzer, the Linux Cluster Architect for BRC’s Savio cluster and LBNL’s HPC group. Singularity combines packaging and container principles to enable users to run applications in an HPC environment. The technology permits a researcher to encapsulate only the applications she needs and the software libraries those applications require to run, making it a lighter-weight alternative to containerized virtual machines.

In its initial stages, each of these experiments is evaluating issues that might arise when deploying user-defined environments or interactive notebooks to Savio. Beginning this spring and summer, campus faculty, postdocs, and graduate students will be invited to provide real-world research use cases that test the utility of these technologies within the bounds of secure computation that Berkeley’s shared Savio cluster can support.

As BRC’s experiments in flexible computation proceed, Research IT is hosting meetings of its biweekly Reading Group to explore the technologies under study.

  • A meeting addressing JupyterHub was held on March 10.
  • On April 21, NERSC’s Shane Canon and Doug Jacobsen will present and lead discussion of Shifter.
  • Greg Kurtzer will headline a discussion of Singularity on May 19.

Please join us as the Berkeley Research Computing Program broadens support for campus research across departments and computational methods.