Transferring Data Between Savio and Your UC Berkeley Box account

This document describes how you can transfer files between Savio, UC Berkeley's high performance computing cluster, and your UC Berkeley bConnected account on the Box collaboration/file storage service, via the FTPS file transfer protocol.

The instructions below use the lftp utility on Savio, which supports the FTPS protocol, to perform these file transfers.

Perform one-time setup

You'll only need to complete the following steps once:

Set up an "external password" on your UC Berkeley Box account. This enables your account to be accessed via methods other than single sign-on, such as via the FTPS protocol.

  1. Log into your Box account (https://box.berkeley.edu) via CalNet authentication.

  2. From the dropdown menu with your name, near the upper right, select Account Settings.

  3. Scroll down to the Authentication section.

  4. Click the Change Password link.

  5. Follow the onscreen prompts to create and save your external password. If you are asked for your old password, enter your CalNet passphrase.

Open a connection from Savio to Box

On Savio, use lftp to open an encrypted (FTPS) connection to Box:

  1. Via SSH, log into the Data Transfer Node on Savio, dtn.brc.berkeley.edu.

    E.g. for a command line SSH client (substituting your actual Savio username for my_savio_user_name below):

    ssh my_savio_user_name@dtn.brc.berkeley.edu

  2. Launch lftp and connect to Box by entering:

    lftp ftp.box.com

    (You should then see an lftp prompt, similar to lftp ftp.box.com:~> before you log in, or lftp my_email@berkeley.edu@ftp.box.com:~> afterward. The following commands are all typed at that lftp prompt, after the > symbol.)

  3. To log in, enter the following command (substituting the UC Berkeley email address that you use with your Box account for my_email@berkeley.edu below):

    user my_email@berkeley.edu

  4. When prompted for a password, enter your Box external password. (This is the password that you created in the one-time setup steps above.)
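
As a possible shortcut, steps 2 through 4 can be combined by passing your username to lftp with its -u option, in which case lftp prompts for the password itself. The sketch below, meant for a shell on the Data Transfer Node, only assembles and prints that command. The variable names are illustrative, and my_email@berkeley.edu is a placeholder for your own address.

```shell
#!/bin/sh
# Placeholder values (assumptions): replace with your own Box email address.
BOX_USER="my_email@berkeley.edu"
BOX_HOST="ftp.box.com"

# Assemble the single-command form of "lftp ftp.box.com" followed by "user ...".
# lftp's -u option supplies the username; with no password given on the
# command line, lftp asks for it interactively.
connect_cmd="lftp -u $BOX_USER $BOX_HOST"

# Print the command for review; enter it at the shell prompt to connect.
echo "$connect_cmd"
```

It is safer to let lftp prompt for the password than to add it to the command line, where other users on the same machine might be able to see it in the process list.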

Transfer files between Savio and Box

On Savio, you'll use lftp commands to transfer files in either direction, between Savio and Box. You'll enter each of the following commands after lftp's > prompt.

Note: the full prompt might look something more like this:

lftp my_email@berkeley.edu@ftp.box.com:~>

Or this:

lftp my_email@berkeley.edu@ftp.box.com:/my/directory_path_on_box>

The basics ... some of the commands you'll use most often

Viewing and navigating directories
  • Show me what's in my current Box directory:

    ls

  • Change to a different Box directory:

    cd box_directory_name

    (Hint: if the Box directory name contains space characters or other whitespace, enter any unique first part of that name and press the Tab key. This will fill in the remainder of that name.)

  • Show the name of my current Savio directory:

    lpwd

  • Show me what's in my current Savio directory:

    !ls

    (Note: that's an exclamation mark - that is, an ! character - in front of the ls command. Use ! followed by any shell command to execute that command from inside lftp.)

  • Change to a different directory on Savio:

    lcd savio_directory_name

Transferring individual files
  • Transfer a single file from Box to your current directory on Savio. (This example assumes the file myfile.xyz is in your current directory on Box):

    get myfile.xyz

  • Transfer a single file from Savio to your current directory on Box. (This example assumes the file myotherfile.xyz is in your current directory on Savio):

    put myotherfile.xyz
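
When the same transfers need to be repeated, the commands above can be collected into a file and handed to lftp with its -f option, which runs the commands in the file and then exits. The following is a minimal sketch, assuming the placeholder names used elsewhere in this document; the file name box_transfer.lftp is made up for illustration. It is meant to be run from a shell on the Data Transfer Node, not at the lftp prompt.

```shell
#!/bin/sh
# Write a small lftp command file. All names below are placeholders
# (assumptions), and box_transfer.lftp is a hypothetical file name.
cat > box_transfer.lftp <<'EOF'
# Stop at the first failed command instead of continuing.
set cmd:fail-exit yes
# Connect to Box; lftp prompts for the external password.
open -u my_email@berkeley.edu ftp.box.com
# Choose the Savio (local) and Box (remote) directories.
lcd /global/scratch/my_savio_user_name
cd box_directory_name
# Download one file from Box, then upload one file to Box.
get myfile.xyz
put myotherfile.xyz
exit
EOF

# Show how to run it; nothing is transferred until you do so.
echo "Run with: lftp -f box_transfer.lftp"
```

Keeping the commands in a file makes it easy to review exactly what will be transferred before anything is sent.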

Transferring multiple files
  • Transfer multiple files from Box, matching some filename pattern, to your current directory on Savio. (This example assumes these files are all in your current directory on Box):

    mget some_filename_pattern_with_wildcards*

    mget some_filename*pattern*with_wildcards

    (You can think of the mget command as "multiple file get".)

  • Transfer multiple files from Savio, matching some filename pattern, to your current directory on Box. (This example assumes these files are all in your current directory on Savio):

    mput some_filename_pattern_with_wildcards*

    mput some_filename*pattern*with_wildcards

    (You can think of the mput command as "multiple file put".)
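
To make the wildcard patterns concrete, the shell sketch below checks a few made-up filenames against a pattern. It assumes that * in an lftp pattern matches any run of characters, the way it does in the shell. The function name matches_pattern and the filenames are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if filename $1 matches wildcard pattern $2.
matches_pattern() {
    case "$1" in
        $2) return 0 ;;   # unquoted, so $2 is treated as a pattern
        *)  return 1 ;;
    esac
}

# Made-up filenames, checked against a pattern with one wildcard in the middle.
for name in run1_output.csv run2_output.csv run1_notes.txt; do
    if matches_pattern "$name" 'run*_output.csv'; then
        echo "would transfer: $name"
    else
        echo "would skip: $name"
    fi
done
```

With this pattern, the two _output.csv files are reported as transfers and run1_notes.txt is skipped. Trying a pattern this way, or with ls at the lftp prompt, before running mget or mput can help avoid transferring more files than intended.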

'Mirroring' an entire directory on Savio, by copying it to Box

Lftp's mirror command offers the powerful ability to copy entire directories, including all their contents, from one computer to another. That power comes with some risk of accidentally overwriting files, however, so you'll want to carefully study its options, and try some small tests, before using it to copy large numbers of files.

Here's a way that you can use mirror for one commonly-performed task: making a backup copy of some data you have on Savio, onto Box. The following commands make a new directory ("folder") on Box, then copy all of the contents of a specified directory on Savio, into that new directory on Box:

mkdir a_new_box_directory_name

mirror --reverse --delete --no-perms --verbose /source/file/path/on/savio a_new_box_directory_name

For instance, if you have some data you'd like to transfer to Box that's located in your Scratch directory on Savio, contained within a directory called my_directory, you might substitute '/global/scratch/my_savio_user_name/my_directory' for '/source/file/path/on/savio', above; e.g.:

mirror --reverse --delete --no-perms --verbose /global/scratch/my_savio_user_name/my_directory a_new_box_directory_name

Please note that, by default, the mirror command is recursive: it will copy the entire contents of the (source) directory you're copying from, including any nested subdirectories within that directory, to the (target) directory you're copying to. The --reverse option sets the direction of the copy: with it, mirror uploads from Savio to Box; without it, mirror downloads from Box to Savio. In addition, the --delete option deletes any files in the target directory that aren't also present in the source directory, so please use this option with caution.
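
One way to try a small test first is mirror's --dry-run option, which prints the commands mirror would carry out without transferring or deleting anything. The sketch below writes such a preview into a command file. It assumes the placeholder paths used above, and the file name box_mirror_preview.lftp is made up for illustration.

```shell
#!/bin/sh
# Placeholder paths (assumptions): substitute your own directories.
SAVIO_DIR="/global/scratch/my_savio_user_name/my_directory"
BOX_DIR="a_new_box_directory_name"

# Write an lftp command file that previews the mirror without changing Box.
# box_mirror_preview.lftp is a hypothetical file name.
cat > box_mirror_preview.lftp <<EOF
open -u my_email@berkeley.edu ftp.box.com
mirror --reverse --delete --no-perms --verbose --dry-run $SAVIO_DIR $BOX_DIR
exit
EOF

# Show how to run the preview from a shell on the Data Transfer Node.
echo "Preview with: lftp -f box_mirror_preview.lftp"
```

If the printed commands look right, removing --dry-run from the mirror line and running the file again performs the real copy.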

A tip about sharing files on Box

If you're storing files on Box that need to be accessed by other members of your research group, collaborators, etc. over many months or years, and where that access needs to be reasonably well insulated from the comings and goings of individual members, a recommended practice is to set up a UC Berkeley Special Purpose Account (SPA) and add multiple 'owners' to the SPA. Next, you can follow these instructions to use the SPA to create an initial, shared Box folder. Inside that shared folder, you and others can then collaboratively store and manage files and folders using your individual CalNet accounts.