Transferring Data

This is an overview of how to transfer data to or from the Berkeley Research Computing (BRC) supercluster, consisting of the Savio, Vector, and Cortex high-performance computing clusters, at the University of California, Berkeley.

When transferring data using file transfer software, you should connect only to the supercluster's Data Transfer Node, dtn.brc.berkeley.edu. (Note: if you're using Globus Connect, you'll instead connect to the Globus endpoint ucb#brc.)

After connecting to the Data Transfer Node, you can transfer files directly into (and/or copy files directly from) your Home directory, Group directory (if applicable), and Scratch directory.

Medium- to large-sized data transfers

When transferring a large number of files and/or large files, we recommend you use:

  • Globus Connect (formerly Globus Online): This method lets you make fast, reliable, unattended transfers. For basic instructions, see Using Globus Connect.

You can additionally use GridFTP or BBCP for this purpose ...

Small-sized data transfers

When transferring a modest number of smaller-sized files, you can also use protocols like FTPS and tools like Rsync ...
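
As an illustration of the Rsync option mentioned above, here is a minimal Python sketch that assembles an rsync command line targeting the Data Transfer Node. Only the hostname dtn.brc.berkeley.edu comes from this page; the function name, the username, and the paths are hypothetical placeholders, and the flags chosen (-a, -v, --partial, --dry-run) are just one reasonable combination rather than a BRC recommendation. The sketch only builds and prints the command, so it does not contact the cluster.

```python
import shlex

# Data Transfer Node named in this document.
DTN_HOST = "dtn.brc.berkeley.edu"


def build_rsync_command(username, local_path, remote_path, dry_run=False):
    """Return the argv list for copying local_path to remote_path via the DTN.

    username, local_path, and remote_path are placeholders supplied by the
    caller; nothing here is specific to a real account.
    """
    # -a: archive mode (recurse, keep permissions and timestamps)
    # -v: verbose output
    # --partial: keep partially transferred files so a rerun can resume
    command = ["rsync", "-av", "--partial"]
    if dry_run:
        # Show what would be transferred without copying anything.
        command.append("--dry-run")
    command.append(local_path)
    # A relative remote path is resolved against the remote home directory.
    command.append(f"{username}@{DTN_HOST}:{remote_path}")
    return command


if __name__ == "__main__":
    # Hypothetical username and directories, for illustration only.
    argv = build_rsync_command("myusername", "results/", "results_copy/", dry_run=True)
    # Print a shell-safe version of the command for review before running it.
    print(shlex.join(argv))
```

To actually run the transfer, the returned list could be passed to subprocess.run(argv, check=True) after reviewing the dry-run output and removing the dry_run flag.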

Transfers to/from repositories under version control

When your code and/or data are stored in repositories under version control, client software is available for accessing them via:

  • Git
  • Mercurial
  • Subversion (SVN)

See Accessing and Installing Software for information on finding and loading this software via the BRC supercluster's Environment Modules.

Transfers to/from specific systems

Additional tutorials for transferring files to/from Amazon Web Services (AWS) S3 and other popular data storage systems are in planning or development. If you have any interest in working on or testing one of these, or have suggestions for other data transfer tutorials, please contact us via our Getting Help email address!

More storage options

If you're transferring data, you're probably thinking about storage. If you need additional storage during the active phase of your research, such as longer-term storage to augment Savio's temporary Scratch storage, or off-premises storage for backing up data, the Active Research Data Storage Guidance Grid can help you identify suitable options.

Assistance with research data management

The campus's Research Data Management (RDM) service offers consulting on managing your research data, which includes the design or improvement of data transfer workflows, selection of storage solutions, and a great deal more. This service is available at no cost to campus researchers. To get started with a consult, please contact RDM Consulting.

In addition, you can find both high-level guidance on research data management topics and deeper dives into specific subjects on RDM's website. (Visit the links in the left-hand sidebar to further explore each of the site's main topics.)