UC Berkeley and LBNL researchers: invitation to use JupyterHub interactive high-performance node

September 15, 2016

Steve Masover

UC Berkeley and LBNL researchers are invited to use, free of charge, a richly provisioned JupyterHub node loaned to the campus by the Pacific Research Platform project and jointly managed by the Berkeley Research Computing (BRC) Program and the Berkeley Institute for Data Science (BIDS).

The introduction and selection criteria of the call for projects to run on this resource are reproduced below. The full call, which includes conditions of usage, the application process, and further details about the node's hardware specifications, is available as a PDF, linked at the end of this article.

Call for projects to run on JupyterHub interactive high-performance node

We would like to invite the Berkeley/LBL community to submit small proposals to access, free of charge, a new computational resource hosted by the Berkeley Research Computing program that provides high-performance interactive computing with a pre-configured Jupyter environment.

UC Berkeley is currently hosting a node loaned by the Pacific Research Platform project, built with 28 Xeon cores, an NVIDIA K80 dual-GPU card, 256GB of RAM, roughly 5TB of SSD storage and 80Gbps of network bandwidth (full specs below). It is configured with a JupyterHub instance for interactive usage, including the entire scientific Python stack as well as CUDA support, Caffe, TensorFlow, direct support for spawning kernels on Comet (@SDSC), and other tools.

The purpose of this node is to support the development of better research computing environments that sit at the boundary between interactive usage and large-scale HPC resources. Jupyter is typically used by individual users on either personal machines or small to medium-sized cloud or remote nodes. With this experiment, we are offering a system whose performance profile goes beyond that and that serves as a gateway to HPC-scale resources such as those provided by the Berkeley Research Computing program, XSEDE (including Comet at SDSC), and the NERSC facilities at LBNL.

We note that submitting a proposal to use the system implies that you accept that the system is currently a shared environment, with other "tenants" using it, potentially simultaneously. This deployment is meant precisely to explore how to develop resources and tools that improve the experience of live, interactive work on shared, high-performance systems. We expect to host between 5 and 10 concurrent projects (which may or may not actually use the system simultaneously).

Selection criteria

We seek to host research efforts that will help us explore, among other questions, what usage patterns arise, how to build scheduling strategies that maintain interactive flexibility while providing a good user experience, and how best to transition workloads between such a system and batch-oriented HPC environments. We will prioritize proposals that:

have broad applicability, and/or
go beyond what can be done comfortably on a personal system and demonstrate the value of the node capabilities (e.g., emphasizing compute, or data movement), and/or
demonstrate the issues in a workflow that includes compute mobility (from workstation to this node to larger scale compute, etc.).

The full invitation to researchers is available as a PDF: Call for projects.... Please review the full document before applying for access via a simple online intake form. Researchers are welcome to e-mail brc@berkeley.edu with questions about the resource or application process (please include "JupyterHub Call for Proposals" in your subject line to help the BRC consultants quickly direct your message).