Accessing and Installing Software


1. Overview

To access much of the software available on the Savio cluster - ranging from compilers and interpreters to statistical analysis and visualization software, and much more - you'll use Environment Module commands.

These commands allow you to display a list of many of the software packages provided on the cluster, as well as to conveniently load and unload different packages and their associated runtime environments, as you need them.

As a quick overview, Environment Modules are used to manage users’ runtime environments dynamically on the Savio cluster. This is accomplished by loading and unloading modulefiles, which contain the application-specific information for setting a user’s environment, primarily shell environment variables such as PATH and LD_LIBRARY_PATH. Modules are useful for managing different applications, as well as different versions of the same application, in a cluster environment.
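To make the mechanism concrete, here is a minimal shell sketch of what loading and unloading a modulefile roughly amounts to for a single path variable. The function names prepend_path and remove_path and the directory /opt/demo/bin are hypothetical, chosen only for illustration; the real module command does this bookkeeping for you and handles many more cases.

```shell
#!/bin/sh
# Illustration only: mimic how loading a module might prepend a directory
# to a colon-separated variable, and how unloading might remove it again.
# prepend_path, remove_path, and /opt/demo/bin are hypothetical names.

# Print the list with the directory added to the front.
prepend_path() {
    # $1 = current value, $2 = directory to add
    if [ -z "$1" ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$2:$1"
    fi
}

# Print the list with every occurrence of the directory removed.
remove_path() {
    # $1 = current value, $2 = directory to remove
    result=""
    old_ifs=$IFS
    IFS=:
    for entry in $1; do
        if [ "$entry" != "$2" ]; then
            result="${result:+$result:}$entry"
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$result"
}

original="/usr/bin:/bin"
loaded=$(prepend_path "$original" "/opt/demo/bin")
unloaded=$(remove_path "$loaded" "/opt/demo/bin")

echo "after load:   $loaded"
echo "after unload: $unloaded"
```

Because the directory is placed at the front of the list, programs in it take precedence over same-named programs elsewhere, which is how different versions of one application can coexist.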

Finally, in addition to the software provided on the Savio cluster, you're also welcome to install your own software.

2. Environment Modules Usage Examples

Below are some of the Environment Module commands that you'll be using most frequently. All of these commands begin with module and are followed by a subcommand. (In this list of commands, a vertical bar (“|”) means “or”, e.g., module add and module load are equivalent. And you'll need to substitute actual modulefile names for modulefile, modulefile1, and modulefile2 in the examples below.)

  • module avail - List all available modulefiles in the current MODULEPATH. (This is the command to use when you want to see the list of software that you can use on Savio via Environment Module commands.)
  • module list - List loaded modules. (This shows you what software you currently have available in your environment. Please note that, by default, no modules are loaded.)
  • module add|load modulefile ... - Load modulefile(s) into the shell environment. (This allows you to add more software packages to your environment.)
  • module rm|unload modulefile ... - Remove modulefile(s) from the shell environment.
  • module swap|switch [modulefile1] modulefile2 - Switch loaded modulefile1 with modulefile2.
  • module show|display modulefile ... - Display configuration information about the specified modulefile(s).
  • module whatis [modulefile ...] - Display summary information about the specified modulefile(s).
  • module purge - Unload all loaded modulefiles.

For more detailed usage instructions for the module command, please run man module on the cluster.

Below are representative examples of how to use these commands. Depending on which system you have access to and when you are reading these instructions, what you see here may differ from the actual output on the system that you work on. On systems like Savio, where a hierarchical structure is used, some modulefiles will only be available after their root modulefile is loaded. (For instance, modulefiles for various Python packages will only become available after you've loaded Python itself. Similarly, R packages, libraries for C compilers, and the like, will only become available after their respective parent modules have been loaded.)

It can be helpful to try out each of the following examples in sequence, to more fully understand how environment modules work. Commands you'll enter follow the shell prompt ([casey@n0000 ~]$), and are followed by samples of output you might see:

[casey@n0000 ~]$ module avail

---- /global/software/sl-6.x86_64/modfiles/tools ----
cmake/2.8.11.2  gnuplot/4.6.0  octave/3.4.3  paraview/3.12.0  r/3.0.1  texlive/2013

---- /global/software/sl-6.x86_64/modfiles/langs ----
gcc/4.4.7  intel/2013.5.192  intel/2013_sp1.2.144  pgi/13.10  python/2.6.6

---- /global/software/sl-6.x86_64/modfiles/intel/2013.5.192 ----
acml/5.3.1-intel  atlas/3.10.1-intel  fftw/2.1.5-intel  fftw/3.3.3-intel  lapack/3.4.2-intel  mkl/2013.5.192  openmpi/1.6.5-intel
...

[casey@n0000 ~]$ module list
No Modulefiles Currently Loaded.

[casey@n0000 ~]$ module load intel
[casey@n0000 ~]$ module list
Currently Loaded Modulefiles:
  1) intel/2013_sp1.4.211
[casey@n0000 ~]$ module load openmpi mkl
[casey@n0000 ~]$ module list
Currently Loaded Modulefiles:
  1) intel/2013_sp1.4.211   3) mkl/2013_sp1.4.211
  2) openmpi/1.6.5-intel

[casey@n0000 ~]$ module unload openmpi
[casey@n0000 ~]$ module list
Currently Loaded Modulefiles:
  1) intel/2013_sp1.4.211   2) mkl/2013_sp1.4.211

[casey@n0000 ~]$ module switch mkl acml
[casey@n0000 ~]$ module list
Currently Loaded Modulefiles:
  1) intel/2013.5.192   2) acml/5.3.1-intel

[casey@n0000 ~]$ module show mkl
-------------------------------------------------------------------
/global/software/sl-6.x86_64/modfiles/intel/2013.5.192/mkl/2013.5.192:

module-whatis   This module sets up MKL 2013.5.192 in your environment.
setenv                MKL_DIR /global/software/sl-6.x86_64/modules/langs/intel/2013.5.192/mkl
prepend-path     CPATH /global/software/sl-6.x86_64/modules/langs/intel/2013.5.192/mkl/include
prepend-path     CPATH /global/software/sl-6.x86_64/modules/langs/intel/2013.5.192/mkl/include/fftw
prepend-path     FPATH /global/software/sl-6.x86_64/modules/langs/intel/2013.5.192/mkl/include
prepend-path     FPATH /global/software/sl-6.x86_64/modules/langs/intel/2013.5.192/mkl/include/fftw
prepend-path     INCLUDE /global/software/sl-6.x86_64/modules/langs/intel/2013.5.192/mkl/include
prepend-path     INCLUDE /global/software/sl-6.x86_64/modules/langs/intel/2013.5.192/mkl/include/fftw
prepend-path     LD_LIBRARY_PATH /global/software/sl-6.x86_64/modules/langs/intel/2013.5.192/mkl/lib/intel64
prepend-path     LIBRARY_PATH /global/software/sl-6.x86_64/modules/langs/intel/2013.5.192/mkl/lib/intel64
prepend-path     MANPATH /global/software/sl-6.x86_64/modules/langs/intel/2013.5.192/man/en_US
-------------------------------------------------------------------

[casey@n0000 ~]$ module whatis mkl
mkl                  : This module sets up MKL 2013.5.192 in your environment.

[casey@n0000 ~]$ module purge  
[casey@n0000 ~]$ module list
No Modulefiles Currently Loaded.

[casey@n0000 ~]$ module avail

---- /global/software/sl-6.x86_64/modfiles/tools ----
cmake/2.8.11.2  gnuplot/4.6.0  octave/3.4.3  paraview/3.12.0  r/3.0.1  texlive/2013

---- /global/software/sl-6.x86_64/modfiles/langs ----
gcc/4.4.7  intel/2013.5.192  intel/2013_sp1.2.144  pgi/13.10  python/2.6.6
...

[casey@n0000 ~]$ module load python
[casey@n0000 ~]$ module avail

---- /global/software/sl-6.x86_64/modfiles/tools ----
cmake/2.8.11.2  gnuplot/4.6.0  octave/3.4.3  paraview/3.12.0  r/3.0.1  texlive/2013

---- /global/software/sl-6.x86_64/modfiles/langs ----
gcc/4.4.7  intel/2013.5.192  intel/2013_sp1.2.144  pgi/13.10  python/2.6.6

---- /global/software/sl-6.x86_64/modfiles/python/2.6.6 ----
ipython/0.12  matplotlib/1.1.0  numpy/1.6.1  scipy/0.10.0

...
NOTE: Python modulefiles will become available only after the “python” modulefile is loaded. The same is the case for R packages, libraries for C compilers, etc.: they will only become available after their respective parent modules are loaded.
 

3. Software Provided on Savio

Research IT provides and maintains a set of system-level software modules. The purpose is to provide an ecosystem that most users can rely on to accomplish their research and studies. The range of applications and libraries that Research IT supports depends largely on the use case and on how frequently support requests for a given package are received.

For a detailed and up-to-date list of software provided on the cluster, run the module avail command, as described in the usage examples above.

(Note: if you're interested in whether a particular C library, Python module, R package, or the like is provided on the cluster, make sure that you first load the parent software itself – such as the Intel or GCC compilers, Python, or R – before checking the list of provided software, as that list is dynamically adjusted based on your current environment.)

Currently the following categories of applications and libraries are supported, with some key examples of each shown below:

  • Development Tools
  • Data processing and Visualization Tools
  • Typesetting and Publishing Tools
  • Miscellaneous Tools (examples not yet listed below)
     
Category                         Application/Library Name

Development Tools
  Editor/IDE                     Emacs, Vim, cmake, cscope, ctags
  SCM                            Subversion, Git, Mercurial
  Debugger/Profiler/Tracer       GDB, gprof, Valgrind, TAU, Allinea DDT
  Languages/Platforms            GCC, Intel, Perl, Python, Java, Boost, CUDA, UPC, Open MPI, PVM, TBB, MPI4Py, IPython, R, MATLAB (license required), Octave, Julia
  Math Libraries                 ACML, MKL, ATLAS, FFTW, FFTW3, GSL, LAPACK, ScaLAPACK, NumPy, SciPy
  IO Libraries                   HDF5, NetCDF, NCO, NCL

Data processing and Visualization Tools
  Data Processing/Visualization  Gnuplot, Grace, Graphviz, ImageMagick, MATLAB (license required), Octave, ParaView, R, VisIt, VMD, yt, Matplotlib

Typesetting and Publishing Tools
  Typesetting                    TeX Live, Ghostscript, Doxygen

 

4. Chaining Software Modules

Environment Modules also allow a user to optionally integrate their own application environment together with the system-provided application environment, by allowing different categories of modulefiles to be chained together. This provides a common interface for simplicity, while still maintaining diversity and flexibility:

  • The first category of modulefiles is provided and maintained by Research IT, and includes commonly used applications and libraries, such as compilers, math libraries, I/O libraries, data processing and visualization tools, etc. We use a hierarchical structure to keep the list clean without losing flexibility.
     
  • The second category of modulefiles is automatically chained for users who belong to the same group on the cluster, if the modulefiles exist in the designated directory. This allows members of a group to share common applications that they use for collaboration, and saves space. Normally the user group maintains these modulefiles, but Research IT can also provide assistance under a support agreement, on a per-request basis.
     
  • The third category of modulefiles can be chained on demand by an individual user who chooses to use Environment Modules to manage user-specific applications as well. To do that, append the location of the modulefiles to the MODULEPATH environment variable, in one of the following ways:

    1). For bash users, please add the following to ~/.bashrc:

    export MODULEPATH=$MODULEPATH:/location/to/my/modulefiles

    2). For csh/tcsh users, please add the following to ~/.cshrc:

    setenv MODULEPATH "$MODULEPATH":/location/to/my/modulefiles
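
For bash or sh users who re-source their startup file often, a slightly more defensive variant may be useful. The following is only a sketch: the helper name add_modulepath is hypothetical, and /location/to/my/modulefiles is the same placeholder used above. It appends the directory only if it is not already listed, so MODULEPATH does not accumulate duplicates.

```shell
#!/bin/sh
# Sketch: append a modulefile directory to MODULEPATH only when it is not
# already present. add_modulepath is a hypothetical helper name, and the
# directory below is a placeholder for your own location.
add_modulepath() {
    # $1 = directory containing modulefiles
    case ":${MODULEPATH:-}:" in
        *":$1:"*)
            ;;  # already listed, nothing to do
        *)
            MODULEPATH="${MODULEPATH:+$MODULEPATH:}$1"
            ;;
    esac
    export MODULEPATH
}

add_modulepath /location/to/my/modulefiles
add_modulepath /location/to/my/modulefiles   # second call changes nothing
echo "MODULEPATH=$MODULEPATH"
```

Wrapping both sides in colons before matching avoids false hits on directories whose names merely contain the new path as a substring.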

5. Installing Your Own

In addition to the software provided on the cluster, you are welcome to install your own software. Before installing software yourself, first check whether it is already provided on the cluster by running module avail and looking to see if the software is listed. Please note that some modules are listed hierarchically, and will only appear on the list after the parent module has been loaded (e.g., libraries for C compilers will only appear after you’ve loaded the respective parent module).

Requirements for software on Savio

Software you install on the cluster will need to:

  • Be runnable (executable) on Scientific Linux 7 (i.e. essentially Red Hat Enterprise Linux 7). Choose the x86_64 tarball where available.
  • Run in command line mode, or - if remote graphical access is required - provide such access via X Windows (X11).
  • Be capable of installation without root/sudo privileges. This may involve adding command line options to installation commands or changing values in installation-related configuration files, for instance; see your vendor- or community-provided documentation for instructions.
  • Be capable of installation in the storage space you have available. (For instance, the source code, intermediate products of installation scripts, and installed binaries must fit within your 10 GB space provided for your home directory, or within a group directory if the software is to be shared with other members of your group.)
  • If compiled from source, be capable of being built using the compiler suites on Savio (GCC and Intel), or via user-installed compilers.
  • Be capable of running without a persistent database running on Savio. An externally hosted database, to which your software on Savio connects, is OK. So is a database that is run on Savio only during execution of your job(s), which is populated by reading files from disk and whose state is saved (if necessary) by exporting the database state to files on disk.
  • If commercial or otherwise license-restricted, come with a license that permits cluster usage (multi-core, and potentially multi-node), as well as a license enforcement mechanism (if any) that's compatible with the Savio environment.
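
As a rough way to check the storage requirement above, the sketch below compares the size of a directory tree with a limit. The variable names CHECK_DIR and limit_kb are assumptions made for this example; the 10 GB figure is the home directory allowance mentioned above. It measures file sizes with du and does not query the cluster's actual quota system.

```shell
#!/bin/sh
# Sketch: compare the disk usage of a directory with a size limit.
# CHECK_DIR is an assumed variable name; on the cluster you might set it to
# your home directory. It defaults to an empty scratch directory here so the
# snippet runs quickly anywhere.
CHECK_DIR="${CHECK_DIR:-$(mktemp -d)}"
limit_kb=$((10 * 1024 * 1024))   # 10 GB expressed in kilobytes

# du -sk prints "<size-in-KB><TAB><path>"; keep only the size.
used_kb=$(du -sk "$CHECK_DIR" | cut -f1)

if [ "$used_kb" -lt "$limit_kb" ]; then
    status="within"
else
    status="over"
fi
echo "$CHECK_DIR uses ${used_kb} KB ($status the ${limit_kb} KB limit)"
```

Remember that a build needs room for the downloaded sources and intermediate files as well as the installed result.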

If your software has installation dependencies – such as libraries, interfaces, or modules – first check whether they are already provided on the cluster before installing them yourself. Make sure that you've first loaded the relevant compiler, interpreted language, or application before examining the list of provided software, because that list is dynamically adjusted based on your current environment.

 

Installation location

The most important part of installing software on Savio is identifying where you should install it, and how you should modify the installation script to point to the right location.

If you are installing software exclusively for your use, you can install it in your Home directory (/global/home/users/YOUR-USER-NAME). More often, people are installing software for their whole group to use; in that case, you should install it in your group directory (/global/home/groups/YOUR-GROUP-NAME). If your group does not have a shared directory defined, and you need one, please email brc-hpc-help@berkeley.edu. In any case, be cognizant of space limitations in your Home directory (10GB) or group directory (see documentation on storage limits for different types of groups).

If you will be doing a lot of software installation, you may want to add sub-directories for sources (source files downloaded), modules (the installed software), scripts (if you want to document and routinize your installation process using a script -- which is recommended), and modfiles (to create module files that will make software installed in a group directory visible to your group members via the modules command).
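
The layout described above can be created in one step. This is a sketch only: INSTALL_ROOT is an assumed variable name, and on the cluster you would set it to your home or group directory before running it.

```shell
#!/bin/sh
# Sketch: create the suggested sub-directories under an installation root.
# INSTALL_ROOT is an assumed variable name. It defaults to a scratch
# directory here so the snippet can be tried safely anywhere.
INSTALL_ROOT="${INSTALL_ROOT:-$(mktemp -d)}"

for sub in sources modules scripts modfiles; do
    mkdir -p "$INSTALL_ROOT/$sub"
done

ls "$INSTALL_ROOT"
```

Using mkdir -p means the script can be re-run without errors if some of the directories already exist.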

 

Example installation process

The following example illustrates how to install the GDAL geospatial library. It assumes that you have set up sub-directories as discussed above.

  • Find the URL for the source tarball of your software.
  • Change to the directory where you want to install the software (e.g., your Home directory or group directory). If you have created a sources sub-directory to help keep things tidy, move there; otherwise, you can simply download the source within your installation directory. Example:

cd /global/home/groups/my_group/sources

  • Run wget [URL for source tarball].
    • For this example:

wget http://download.osgeo.org/gdal/2.2.1/gdal-2.2.1.tar.gz

  • Untar the file you downloaded by running tar -zxvf your-file.tar.gz
    • For the example:

tar -zxvf gdal-2.2.1.tar.gz

  • Change to the new directory that was created with the contents of the tarball.
    • For the example:

cd gdal-2.2.1

  • Check the documentation for your software to determine where and how you can set the parameters for where the software will be installed. This varies from package to package, and may require modifying the configuration files in the source code itself. Make those changes as needed.
    • For the gdal example (and this is the case for lots of other software), the documentation indicates that we can specify the installation location by adding --prefix=/put-location-here when running the configure script.
  • Run the configure script, adding any required parameters for specifying the location. If you’ve created a modules sub-directory in your target directory, you may want to additionally create a directory for the software package, and a sub-directory for each version. If your software doesn’t have a configure script, you will have to modify the Makefile itself to build it. Building the software can be done from any directory where you have the correct permissions. Once you have a binary, you can copy it to the correct location.
    • For the example:

mkdir -p /global/home/groups/my_group/modules/gdal/2.2.1

./configure --prefix=/global/home/groups/my_group/modules/gdal/2.2.1

  • Debug the configuration process as needed.
    • If the configuration fails due to insufficient permissions, then something in the process is probably trying to use a default path. Double-check that you’ve overridden the default paths for every aspect of the configuration process, to ensure that files are written to directories for which you have write permission.
    • One way to log everything from the configuration process for later debugging is as follows:

script /global/home/groups/my_group/sources/logfile-software-version

When you want to stop logging, run exit. All the output will be logged to the file logfile-software-version (e.g., logfile-gdal-2.2.1) in the sources sub-directory.

  • Build and install the software.

    • Run:

make

    • Run:

make install

    • In most cases, we recommend using the default compiler; in SL7, this is GCC 4.8.5. If you have a particular reason to use the Intel compiler, you’ll need to load it first with module load intel, which loads the default version.
  • Change the permissions. You’ll want other people in your group to be able to modify and run the software.
    • To allow the group to modify the software, change the UNIX group of the installed software. For the example:

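cd /global/home/groups/my_group/modules; chgrp -R my_group gdal/

    • Give the group read and write access, plus execute access on directories and on files that are already executable. For the example: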

chmod -R g+rwX gdal
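
To see what the capital X in g+rwX does, the following sketch applies the same chmod command to a throwaway directory tree. The tree is a stand-in created in a scratch location, not a real GDAL installation, and the helper name mode_of is hypothetical.

```shell
#!/bin/sh
# Sketch: show the effect of "chmod -R g+rwX" on a throwaway directory tree.
# The tree below is a stand-in for an installed package.
demo=$(mktemp -d)
mkdir -p "$demo/gdal/bin"
printf '#!/bin/sh\necho hello\n' > "$demo/gdal/bin/tool"
echo "notes" > "$demo/gdal/README"

# Start from owner-only permissions so the change is easy to see.
chmod 700 "$demo/gdal" "$demo/gdal/bin" "$demo/gdal/bin/tool"
chmod 600 "$demo/gdal/README"

# g+rw grants the group read and write access everywhere; the capital X adds
# group execute only to directories and to files that are already executable.
chmod -R g+rwX "$demo/gdal"

# Print the mode string (first 10 characters of the long listing) for a path.
mode_of() { ls -ld "$1" | cut -c1-10; }

echo "gdal/          $(mode_of "$demo/gdal")"
echo "gdal/bin/tool  $(mode_of "$demo/gdal/bin/tool")"
echo "gdal/README    $(mode_of "$demo/gdal/README")"
```

The plain data file gains group read and write but not execute, which is usually what you want for documentation and data files inside an installation.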

OPTIONAL: Create a modulefile. Adding a modulefile will mean that your software will appear on the list when people with the right set of permissions run module avail. Here is an example of a modulefile for gdal:

#%Module1.0
## gdal 2.2.1
## by Lizzy Borden

proc ModulesHelp { } {
       puts stderr "loads the environment for gdal 2.2.1"
}

module-whatis "loads the environment for gdal 2.2.1"

set GDAL_DIR /global/home/groups/my_group/modules/gdal/2.2.1/
setenv GDAL_DIR $GDAL_DIR
prepend-path PATH $GDAL_DIR/bin
prepend-path LD_LIBRARY_PATH $GDAL_DIR/lib
prepend-path MANPATH $GDAL_DIR/man

Name the module file using the version number, in the case of the example “2.2.1”. Place the module file in the modfiles sub-directory and allow access by your group. For the example (run from the directory that contains modfiles, here /global/home/groups/my_group):

mkdir modfiles/gdal

mv 2.2.1 modfiles/gdal

cd modfiles

chgrp -R my_group gdal

chmod -R g+rwX gdal

 

Finally, tell your group members they will need to add /global/home/groups/my_group/modfiles to their MODULEPATH environment variable, which would usually be done in one’s .bashrc file:

export MODULEPATH=$MODULEPATH:/global/home/groups/my_group/modfiles

 

Example installation scripts

These examples use tee instead of script to create log files. They also compile the software in parallel with the -j8 flag.
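
The logging pattern used in these scripts can be tried on its own first. In this sketch, fake_build_step is a hypothetical stand-in for a real command such as make, and the log file is a temporary file.

```shell
#!/bin/sh
# Sketch: capture both stdout and stderr of a build step in a log file while
# still seeing the output on screen. fake_build_step is a stand-in for a real
# command such as make; the log file name is arbitrary.
log_file=$(mktemp)

fake_build_step() {
    echo "compiling..."            # goes to stdout
    echo "warning: example" >&2    # goes to stderr
}

# 2>&1 merges stderr into stdout before the pipe, so tee records both.
fake_build_step 2>&1 | tee "$log_file"

# Note: in plain sh the pipeline's exit status is that of tee, not of the
# build step, so inspect the log for errors rather than relying on $?.
echo "logged $(wc -l < "$log_file") lines to $log_file"
```

Unlike script, which records a whole interactive session until you run exit, tee records only the commands you pipe into it, which suits non-interactive installation scripts.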

 

gnuplot

 

#!/bin/sh
make distclean
./configure --prefix=/global/home/groups/my_group/modules/gnuplot/4.6.0 --with-readline=gnu --with-gd=/usr 2>&1 | tee gnuplot-4.6.0.configure.log
make -j8 2>&1 | tee gnuplot-4.6.0.make.log
make check 2>&1 | tee gnuplot-4.6.0.check.log
make install 2>&1 | tee gnuplot-4.6.0.install.log
make distclean

cgal

#!/bin/sh
module load gcc/4.4.7 openmpi/1.6.5-gcc qt/4.8.0 cmake/2.8.11.2 boost/1.54.0-gcc
make distclean
cmake -DCMAKE_INSTALL_PREFIX=/global/home/groups/my_group/modules/cgal/4.4-gcc . 2>&1 | tee cgal-4.4-gcc.cmake.log
make -j8 2>&1 | tee cgal-4.4-make.log
make install 2>&1 | tee cgal-4.4-install.log
make distclean

FAQ

What if the software doesn’t come with a configure script?

If the software doesn’t come with a configure script, you will have to modify the Makefile itself to build it. Building the software can be done from any directory where you have the correct permissions. Once you have a binary, you can copy it to the correct location.
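
The final copy step might look like the following sketch. All names here are placeholders: mytool stands in for the binary you built, 1.0 for its version, and INSTALL_ROOT (an assumed variable name) for your home or group directory.

```shell
#!/bin/sh
# Sketch: after building by hand, copy the resulting executable into a
# versioned install directory. All names are placeholders: "mytool" stands in
# for your built binary, and INSTALL_ROOT defaults to a scratch directory.
INSTALL_ROOT="${INSTALL_ROOT:-$(mktemp -d)}"
build_dir=$(mktemp -d)
dest="$INSTALL_ROOT/modules/mytool/1.0/bin"

# Stand-in for the output of "make": a tiny executable script.
printf '#!/bin/sh\necho "mytool 1.0"\n' > "$build_dir/mytool"
chmod +x "$build_dir/mytool"

# Create the destination and copy the executable, preserving its mode.
mkdir -p "$dest"
cp -p "$build_dir/mytool" "$dest/"

ls -l "$dest"
```

Placing the binary under a package/version/bin directory keeps it consistent with the layout used in the GDAL example, so a modulefile can add that bin directory to PATH.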