Steve Masover's picture

A free & fully-loaded JupyterHub server supports campus research computation

Logo images: Jupyter; Pacific Research Platform; Berkeley Institute for Data Science; Berkeley Research Computing

An initial cohort of Berkeley researchers has already made productive use of an experimental, powerfully provisioned server loaned to UC Berkeley by the Pacific Research Platform (PRP) project and configured with JupyterHub. This free resource is administered by a partnership between Berkeley Research Computing and the Berkeley Institute for Data Science (BIDS), and lets users run interactive computation from a command line or a Jupyter notebook interface. A second cohort of experimenters is currently being onboarded, and additional researchers are welcome to apply.

Brett Naul, a postdoc in Astronomy, describes his use of the resource:

I've been using the JupyterHub/PRP machine for experiments using deep learning models in Tensorflow+Keras to analyze time series data. Roughly 95% of the tasks I've used the machine for have involved the GPU (training Tensorflow models), with the other 5% being simple data manipulation/aggregation. [...] I'd estimate that I've run ~400 hours worth of GPU jobs on the machine; I could have done the same work using XStream at Stanford (through XSEDE), but it's not quite as friendly an environment for testing/prototyping since the jobs have to be scheduled rather than run interactively. Without either, I'd have ended up purchasing a GPU myself, which would run a thousand dollars, and assumes availability of an already-configured server in which to run it.

Lily Hu, a Ph.D. candidate in Mechanical Engineering, explains how she used the server:

I had some data files that were too large to load into memory on my personal computers. Computations on my personal computers were also impractical due to the days and weeks required to run them. [...] I used jhub-prp to run scripts in python, on the command line, and in jupyter notebooks to process and prepare data, train machine learning models, analyze results, and create visualisations. I submitted jobs and let them run on the resource, sometimes for days. I also used storage available on jhub-prp to store and easily access my input and output files. The web browser interface and ssh ability were useful to check in on my computations and to move files to my personal computer.

A new cohort of researchers, including faculty, staff, and graduate students in fields that range from City & Regional Planning to Mechanical Engineering, will run computation on the JupyterHub resource in early 2017.

Berkeley and LBNL researchers who would like to use the experimental computing environment provided by PRP are invited to apply by describing their proposed work. Please see the summary of the call for experimenters on the BIDS website, and follow the links to the full call and the intake form. We look forward to hearing from you, and expect to evaluate the next set of applications for access later in spring 2017.