Stretch your FCA further using ht_helper for improved efficiency on Savio

September 14, 2017

The ht_helper utility is available to help researchers use Savio, the shared campus high performance computing (HPC) cluster, more efficiently. The utility can dramatically increase the amount of computation a researcher can perform under the Berkeley Research Computing Program’s free Faculty Computing Allowance.

The Savio cluster comprises over 300 nodes (servers), and each node provides up to 24 cores (CPUs). HPC environments such as Savio are best suited to decomposing large processing jobs into many small tasks and processing those tasks in parallel. When researchers submit a job to Savio, they specify the number of nodes to employ when processing the job, and it is their responsibility to use each node’s cores and memory efficiently to get the most out of their allocation. If only a fraction of the cores on a node are used during processing, the account may still be charged for the entire node.

Technologies used to parallelize data-processing code, such as OpenMP, can be complicated to use correctly and take significant time to learn. In many cases, however, researchers can let the ht_helper utility manage the allocation of resources and make the most efficient use of their Savio time without needing OpenMP or similar parallelization technologies. Since I joined the RIT team about a year ago, ht_helper has become invaluable for processing my data efficiently.

One of my engagements involves running OCR on large sets of PDF files. Initially I attempted to manage the multiprocessing logic in my Python code, but struggled to know whether I was using my compute resources effectively. Yong Qin, an HPC consultant at Lawrence Berkeley National Laboratory and the author of ht_helper, recommended this tool, which has simplified the processing logic significantly. ht_helper takes a ‘task file’ as input; the task file contains one command (a call to an application, for example) per line. ht_helper then distributes those tasks across the resources on the allocated node(s) and carries out the execution. See the diagram below for a schematic view of how the utility works.
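For example, a minimal task file is simply a list of independent commands, one per line (a hypothetical sketch; the program and file names below are placeholders):

myprogram input_001.dat > output_001.txt
myprogram input_002.dat > output_002.txt
myprogram input_003.dat > output_003.txt
myprogram input_004.dat > output_004.txt

ht_helper starts these tasks on the allocated cores and launches the next pending task as a core becomes free, so the allocation stays busy until the list is exhausted.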

Additionally, ht_helper works well with applications inside a Singularity container. The two applications used in my OCR processing, Ghostscript and Tesseract, are installed in a Singularity container, and the task file is a list of singularity exec commands that call those applications inside the container. I used the ht_helper '-L' flag to produce a log file for each command so that I could validate that all of the tasks were successful (caution: when a task file is very large, writing thousands of log files may degrade performance).
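After a job finishes, a quick sanity check of those logs might look like the following (a rough sketch; it assumes one .log file per task in the job’s working directory, which may differ in your setup):

# compare the number of tasks to the number of log files, then look for logs that mention an error
wc -l taskfile.sh
ls *.log | wc -l
grep -il "error" *.log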

My workflow logic includes a few lines of Python code to generate a task file containing one command for each data file. In my Slurm script I specify the number of nodes that I want to use, and ht_helper ensures that the tasks are distributed efficiently, maximizing use of the cores on the nodes I specified. This means the work will complete more quickly, and the charge to my allocation will be minimized.

Here’s an example. This snippet of Python code creates a task file from the set of png files in the specified folder on Savio scratch:

import os

scratchDataDirectory = '/global/scratch/username/test/'
TESSERACTCMD = 'tesseract --tessdata-dir /opt/tessdata "{}" "{}" -l eng'
SINGULARITYCMD = 'singularity exec -B {}:/scratch/ /global/scratch/groups/dh/tesseract4.img '

# the scratch data directory is bind-mounted at /scratch/ inside the container
scmd = SINGULARITYCMD.format(scratchDataDirectory)

with open('taskfile.sh', 'w') as taskfile:
    # walk the scratch data directory and write one command per png file
    for dirpath, dirnames, filenames in os.walk(scratchDataDirectory):
        for entry in filenames:
            if entry.endswith('.png'):
                fullpath = os.path.join(dirpath, entry)
                basename, file_extension = os.path.splitext(fullpath)
                # translate host paths into the paths seen inside the container
                relativepath1 = '/scratch/' + fullpath[len(scratchDataDirectory):]
                relativepath2 = '/scratch/' + basename[len(scratchDataDirectory):]
                tcmd = TESSERACTCMD.format(relativepath1, relativepath2)
                taskfile.write(scmd + tcmd + '\n')

Some of the lines from the task file produced by the code snippet are shown below. As you can see, the only difference between the tasks defined in each line is the name of the file to be processed:

singularity exec -B /global/scratch/mmanning/chench/test3/:/scratch/  /global/scratch/mmanning/tesseract4.img tesseract --tessdata-dir /opt/tessdata "/scratch/SDNY/605/226/Main Document-2.png" "/scratch/SDNY/605/226/Main Document-2"  -l eng
singularity exec -B /global/scratch/mmanning/chench/test3/:/scratch/  /global/scratch/mmanning/tesseract4.img tesseract --tessdata-dir /opt/tessdata "/scratch/SDNY/605/226/Main Document-1.png" "/scratch/SDNY/605/226/Main Document-1"  -l eng
singularity exec -B /global/scratch/mmanning/chench/test3/:/scratch/  /global/scratch/mmanning/tesseract4.img tesseract --tessdata-dir /opt/tessdata "/scratch/SDNY/605/49/Main Document-3.png" "/scratch/SDNY/605/49/Main Document-3"  -l eng
singularity exec -B /global/scratch/mmanning/chench/test3/:/scratch/  /global/scratch/mmanning/tesseract4.img tesseract --tessdata-dir /opt/tessdata "/scratch/SDNY/605/49/Main Document-6.png" "/scratch/SDNY/605/49/Main Document-6"  -l eng

If you need some pointers when adapting these examples to your code, write to the BRC consultants at brc@berkeley.edu, and we’ll do our best to help.

There are parameters that the user can adjust to better tune ht_helper to their use case. For example, ht_helper assumes by default that each task will run for a few minutes, and checks for task completion every 60 seconds; if your tasks run for less than two minutes each, you might consider setting the ‘-s’ parameter to a smaller value. You can test the execution time for one of your commands using srun on Savio. BRC consultants can provide assistance with this and other tuning questions.
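For example, a single task from the task file can be timed interactively before the full job is launched (a sketch only; the account, partition, and sample file names below are illustrative and should be replaced with your own):

# run one representative task on a compute node and time it
srun --partition=savio2 --account=ac_scsguest --qos=savio_normal --ntasks=1 --time=00:10:00 \
  bash -c 'time singularity exec -B /global/scratch/username/test/:/scratch/ /global/scratch/groups/dh/tesseract4.img tesseract --tessdata-dir /opt/tessdata /scratch/sample.png /scratch/sample -l eng'
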
My Slurm script for submitting the ht_helper processing:

#!/bin/bash -l  
# Job name: 
#SBATCH --job-name=test

# Account: 
#SBATCH --account=ac_scsguest 

# Partition: 
#SBATCH --partition=savio2 

## Scale by increasing the number of nodes 
#SBATCH --nodes=5  
#SBATCH --qos=savio_normal 

# Wall clock limit: 
#SBATCH --time=00:30:00 

## Command(s) to run: 
module load gcc openmpi  
/global/home/groups/allhands/bin/ht_helper.sh  -t /global/scratch/username/test/taskfile.sh -n1 -s1 -vL 
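
Assuming the script above is saved as, say, ht_ocr.sh (a placeholder name), it can be submitted and monitored in the usual way:

sbatch ht_ocr.sh
squeue -u $USER    # confirm that the job is queued or running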

If you happen to have a very large set of tasks, the job’s startup time can take a while. To minimize startup time, you can put all your tasks into a single task file but process them in sequential groups, using the “-i” option in your Slurm script with the following syntax:

ht_helper.sh -i 0-99999 ...
ht_helper.sh -i 100000-199999 ...
ht_helper.sh -i 200000-299999 ...

In the example above, the first job will process tasks 0-99999 of the task file, the next job will run the tasks on lines 100000-199999, and so on. The task file can also be executed multiple times using the “-r” option. For example, the following command will run the commands in the task file three times in succession:

ht_helper.sh -t taskfile.sh -r 3

For additional details and examples, please see the documentation on the Research IT site or the GitHub repository.