boomerang.py

SOURCE: boomerang.py

Boomerang is a simple program for submitting runs to a cluster of machines over SSH. The code is object oriented, so it can easily be incorporated into other programs, and it is also easy to drive from the command line. The program stores its information in a series of data files (described below). The basic workflow involves a few simple steps (a minimal Python sketch of this workflow follows the list):

  1. Queue jobs on your local machine for a given cluster.

  2. Determine the status of the cluster by finding out how many jobs are running on each machine. This is accomplished through a universally updated status file that covers all machines.

  3. Submit the jobs to machines that have enough free CPU and memory for your jobs.

  4. While the jobs are running, use boomerang's convenient functions to view the results.

  5. Finally, retrieve any runs that have finished.
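
The same workflow is available from Python through the class documented at the end of this page. Here is a minimal sketch, assuming boomerang.py is importable and that the no-argument method defaults act on the current directory, as the corresponding command-line flags do (error handling omitted):

    from boomerang import boomerang

    bb = boomerang('default')   # 'default' is the queue name (see Files below)
    bb.add_to_queue()           # 1. queue the current directory
    bb.query()                  # 2. find out how busy each machine is
    bb.scatter()                # 3. send queued jobs to available machines
    bb.cat()                    # 4. peek at OSZICAR in each running job
    bb.get()                    # 5. retrieve any runs that have finished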

Disk space: Effective August 2014, each of the compute nodes will go through every directory in /home/*/tmp_runs/* (where boomerang runs) and, if NO file has been touched in 31 days, delete the WAVECAR, CHGCAR, and vasprun.xml files. If you need one of these files, just rename it (e.g. gzip CHGCAR) and it will not be deleted.

Files

All data files pertaining to this program are stored in a hidden directory in your home directory .boomerang. “Global” files are stored at ~/.boomerang/file. “Local” files are stored at ~/.boomerang/default/file. Some files can be global or local; this means that if there exists a file at ~/.boomerang/default/file then it will be used; else the file at ~/.boomerang/file will be used. (“default” here can be any queue name, e.g. “dualsocket”.) We will discuss applications of this in a bit.
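
The lookup rule is simple enough to express directly. The following is only an illustration of the rule just described, not boomerang's own code (the function name is made up):

    import os

    def resolve_file(fname, name='default'):
        # Prefer the local copy under the queue directory; fall back to the global one.
        localpath  = os.path.expanduser('~/.boomerang/%s/%s' % (name, fname))
        globalpath = os.path.expanduser('~/.boomerang/%s' % fname)
        return localpath if os.path.exists(localpath) else globalpath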

Here are the files that boomerang uses:

  • environment (global) - This file contains the environment variable LD_LIBRARY_PATH (and any others) that need to be set on the remote machines. The format of this file is one line per machine, with the first word being the machine name and the following words being a bash command. The default value will be applied to all machines unless the machine is explicitly named. This file is actually not required; the default value is hard coded into the program.

  • queue (local) - Shows which directories on your local machine are queued to be submitted.

  • running (local) - Shows which directories are currently running on the remote machines.

  • finished (local) - This contains a list of all the jobs that have finished.

  • required (global or local) - A list of files which must be present in order to queue a job.

  • exclude (global or local) - Do not transfer these files back from the remote machine. (See “Disk Space” above for how long they are retained on the remote machine.)

  • machlist (global or local) - Contains a list of machines to submit to. These may be shortcuts defined in your SSH config file or full host names. List one machine per line. Lines beginning with # are ignored.

  • status (global) - The query method will look up how many jobs there are on each machine and write the result here.

  • script (global or local) - This is the default script that will be submitted unless a file named boomerang_submit.sh exists in the directory. An example script is shown below. The first step is to copy the vasp executable to the local directory under a distinct name so you know which executable corresponds to which job. boomerang will replace any instance of XXXX in your script with a four-digit random integer. A time command is included for convenience, though it is not strictly necessary. An if statement executes a postprocessing script if one is present in the running directory. Finally, the vasp executable is removed. Please note that this is only an example; the script can be whatever you want and is not tied to vasp in any particular way.

## SCRIPT DEFAULT
#BOOMMEM=10 # Memory expected in GB
#BOOMCPU=400 # should be 100*ncores
ncores=4
npar=2

# Find executables
vasploc=`which vasp5-intel-parallel`
if [ "$vasploc" = '' ]; then vasploc="/usr/local/bin/vasp5-intel-parallel"; fi
mpiloc=`which mpirun`
if [ "$mpiloc"  = '' ]; then mpiloc="/opt/intel/impi/4.1.3.048/intel64/bin/mpirun"; fi
cp $vasploc vaspexe.XXXX

# Setup and run
echo "NPAR=$npar" >> INCAR
ulimit -s unlimited
/usr/bin/time -a -o out -f "\n%PCPU \n%U user-sec \n%S system-sec \n%E elapsed-hr:min:sec \n%e elapsed-sec" $mpiloc -n $ncores ./vaspexe.XXXX

if [ -f postp.sh ]; then ./postp.sh ; fi

rm ./vaspexe.XXXX

If you’re using this script, notice a few things:

  • The executable location has a fallback default: boomerang does not source ~/.bashrc, so which may have trouble finding the executable in the PATH

  • We copy the executable to vaspexe.XXXX, which lets us see, e.g., whether the executable is still running (boomerang.py -check) or which directory is hogging all the memory (top)

  • NPAR and the number of cores (ncores) may depend on the queue (e.g. dual-socket machines), so the script writes NPAR straight into the INCAR

The software repository has a “dotboomerang” directory that contains information for a few different clusters and options; see the Setup section below for how to get started with this.

  • The default queue runs VASP 5 on single-socket machines (3770 and 3930)

  • The dual queue runs VASP 5 on dual-socket machines (mostly 2670v2). This has a modified machlist

  • The lowmem queue runs VASP 5 on single-socket machines, but is willing to scatter with only 3 GB of memory available. This has a modified script.

Multiple clusters: If you want a different cluster (e.g. high memory), we can make a new directory ~/.boomerang/highmem, which contains all the local files and an adjusted “machlist” file. If you want a different script (e.g. abinit instead of vasp), we can make a directory ~/.boomerang/abinit with all the local files and an adjusted “script” file. We do this as follows:

$ mkdir ~/.boomerang/highmem
$ touch ~/.boomerang/highmem/{queue,finished,running}
$ vi ~/.boomerang/highmem/machlist
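
Once the new directory exists, point boomerang at it with the name keyword (see “Identifying the cluster” under Command-line usage below), for example:

    $ boomerang.py name=highmem -queue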

Setup

To set up boomerang for the first time for VASP, just copy the software repository’s dotboomerang/* into ~/.boomerang:

$ mkdir ~/.boomerang
$ cp dotboomerang/* ~/.boomerang/

Make sure you have boomerang.py in your PATH and inputs.py in your PYTHONPATH. You can adjust these in ~/.bashrc.

Universal status

As of May 2014, boomerang can check for a status maintained across all machines in our cluster. This alleviates the need for everyone to query each machine. There is a script that automatically updates the universal status file every few minutes, and every time someone submits something, boomerang updates the status file accordingly. For more details, see the readme file in the boomerang_public directory (currently on columbuso; hopefully soon to be moved to grandcentral).

If one person uses boomerang without the universal status, there is a greater chance of multiple people submitting to one machine within a short time span, which will make jobs run more slowly. That said, if you want to bypass the universal status, here are a few ways to do it:

  1. Command-line usage: When using a command that needs querying (e.g. scatter, force, status), add the command “-queryall” to manually check each machine in machlist (alternatively: “-noremote”). If you queried it recently and want to use the saved status in ~/.boomerang/status, use the command “-noquery” instead. Note that your status may be out of date, and there is no way to figure that out without querying.

  2. Python usage: There is no automatic querying; just instantiate the class as bb=boomerang(noremote=True), and the query method won’t look for a universal status.

  3. Change boomerang.py: during __init__, change the default value of noremote to True.

Command-line usage

We will now discuss how to do things from the command line. For the most part, each task is associated with a given method. The methods themselves are documented in the code, and those docstrings are reproduced in the section below. The most important methods are: query, queue, scatter, get.

  • Identifying the cluster - The name keyword may be combined with all of the commands below. If name is not given, it is assumed to be default. You may want different clusters for different purposes; see above.

    $ boomerang.py name=dual
    
  • Preparing remote machines - Create the remote run directory (currently ~/tmp_runs) on each of the remote hosts. Each run will be placed at ~/tmp_runs/directorynameXXXX, where XXXX is a random four-digit number. This may already have been done when your profile was set up on the remote machines.

    $ boomerang.py -prep
    
  • Querying availability - Check all of the machines to see how many jobs are running on each. This updates the status file. There are a few ways to do this:

    • “boomerang.py -query” : Attempts to retrieve the status from the master (universal) status file; any failures or missing machines are updated “by hand”

    • “boomerang.py -query -quick” : Attempts to retrieve status from the master (universal) status file; any machines that are missing are NOT updated “by hand”

    • “boomerang.py -queryall” : Queries all machines manually, without consulting the universal status

    • “boomerang.py -noquery” : Reads the (local) status file that was last written, although it is very likely out of date. This is not recommended.

  • Queueing - Go to the directory which contains the files you want to submit and execute:

    $ boomerang.py -queue
    
  • Summary - Show which jobs are queued and which are running.

    $ boomerang.py -sum
    
  • Status - Show how many jobs are running on the cluster and by whom

    $ boomerang.py -status
    
  • Scatter queued jobs - Send queued jobs to available slots on remote machines. Unless you pass -noquery, it will query first to ensure the usage information is current.

    Other command-line flags exist for scatter: hogmax=XX tells boomerang to take no more than XX% (default 75) of the cluster, to leave space for other users; hogmin=XX tells boomerang it may take up to XX% (default 30) even if that leaves nothing for others; the script will take some amount in between these, depending on how much of the cluster is empty. To use all default values, just run:

    $ boomerang.py -query
    $ boomerang.py -scatter
    
  • Cat running jobs - Cat a given file for each running directory on the remote machine. If no file is given, the default is OSZICAR:

    $ boomerang.py -cat INCAR
    
  • Go to remote machine - Change to a directory whose job is running and execute boomerang with the go option; it will take you to the corresponding run directory on the remote machine:

    $ boomerang.py -go
    
  • Get finished jobs - Retrieve the results from all finished calculations:

    $ boomerang.py -get
    
  • Force - Force a job to start on a given remote machine regardless of its status. There is no need to queue the job in this case, and you should be in the directory of interest upon executing the command:

    $ boomerang.py force=gramercy
    
  • Kill - This will kill a job. You can provide the call with a filename, and this file should contain a list of all directories that should be killed. If no filename is given, it will try to kill the current directory:

    $ boomerang.py -kill [filename]
    
  • Force Get - This will force boomerang to try to get the files from the remote host for the current directory. It works even if they have already been retrieved; of course, it will fail if they have already been deleted.

    $ boomerang.py -forceget
    
  • Check - This will check each directory that is currently running to ensure the executable is still running, and returns the directories for which no executable is found (this assumes the executable has the four-digit XXXX in its name; see the example script above):
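
    $ boomerang.py -check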

  • Remove - This will remove the directory from the finished, queue, and running files.

Docstrings from boomerang.py

exception boomerang.TimeoutException
class boomerang.boomerang(name, noremote=False, debug=False)

This class is a simple queueing system which relies only on SSH. To use:

  1. Choose a script / machine list

  2a. Queue directory(ies) and scatter, or

  2b. Send (force) a directory to a chosen machine

  3. Get the runs back

add_to_queue(direc=None, post_proc=None)

Add a directory to the queue.

calc_users()

Uses self.nodes to get information about users, to be stored in self.users={username:totCPU}, summed only over nodes in the machlist. Returns the dictionary self.users

cat(file='')

Cat the results of the “file” for each running job. If no file is given, the file OSZICAR will be catted.

check(goto=False, timeout=6)

Checks all running directories to ensure that each job is still running. This works only if the script includes creation of an executable with XXXX in its name.

fetch_local()

Fetch and load file from local status. Does NOT update this status.

fetch_remote(timeout=10, ntries=3, updatewait=10, lockwait=10)

Fetch and load info from the remote status, as defined in self.ustatus['mach'] and self.ustatus['dir']. Files in that directory are status, lock, and update. Returns the list of machines successfully retrieved. The default wait time is 10 seconds; even after 3 tries, this is still shorter than the roughly 45 seconds it takes to query all machines.

find_resources(hogmin=25, hogmax=75, verbose=False)

Figure out which machines are available for jobs. Returns a list of machines to scatter to (including repeated elements when more than one job should be scattered to a machine). Hog specifies how much of the cluster to hog, in percent: if there is hogmin% (default 25) or less available, I will hog it all up. At the other extreme, if the cluster is entirely empty, I will take only hogmax% (default 75). The middle is interpolated.

This won’t work right if your login username isn’t the username on the cluster. So you should hardcode self.username above, e.g. add the line

if self.username == ‘localname’: self.username = ‘remotename’

Algorithm for equitable distribution: if myfrac = .75 (hogmax), I leave .25 (1-hogmax) free; if myfrac <= hogmin, I leave 0 free. Interpolating, with usage myfrac I leave freefrac = (1-hogmax)/(hogmax-hogmin) * (myfrac-hogmin), but if freefrac < 0, set freefrac = 0. Let mi, mf = myfrac initial (before) and final (after scattering), let hp, hm = hogmax, hogmin, and let a = fraction available, l = fraction I leave, t = fraction I take. The three equations we need are: 1) t = mf - mi; 2) a = t + l; 3) l = (mf-hm)*(1-hp)/(hp-hm). Eliminating l and mf and solving for t gives t = hp - mi - (hp-hm)*(1-a-mi)/(1-hm), i.e. takefrac = hogmax - myfrac - (hogmax-hogmin)*(1-availfrac-myfrac)/(1-hogmin), with the provision that 0 <= t <= a.
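
As a rough illustration of that formula, here is a short sketch working in fractions rather than percent (this is not boomerang’s own code; the function name is made up):

    def take_fraction(myfrac, availfrac, hogmin=0.25, hogmax=0.75):
        # takefrac = hogmax - myfrac - (hogmax-hogmin)*(1-availfrac-myfrac)/(1-hogmin)
        take = hogmax - myfrac - (hogmax - hogmin) * (1.0 - availfrac - myfrac) / (1.0 - hogmin)
        return max(0.0, min(take, availfrac))   # enforce 0 <= t <= a

    # Example: starting from nothing on an empty cluster (myfrac=0, availfrac=1),
    # take_fraction returns 0.75, i.e. hogmax.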

force(inp)

Force the job in the current directory to be submitted to machine inp.

force_get(gfname=None, ddir=None)

Force boomerang to get the files from the current run directory. It does not matter when this run was performed; an attempt will be made to transfer the files from the remote node. Of course, this will fail if they have already been deleted.

format_status(prevfile='', remote=False)

Method to format a status file for writing. Output is the string to be written to the file. If prevfile is given (it should be the file’s contents, either a string or a split string), any machine that does not exist in self.machlist is retained in the output file. The file structure is space-delimited; each line is: machname usedCPU freeCPU freeMEM totCPU user1 user1CPU user2 user2CPU … If remote is enabled, usages of zero are not included (formerly, “remote” also prohibited writing nodes that were not read from remote).

get(timeout=6)

Get all running jobs that have finished. This function checks whether running_boomerang_X_X still exists on the remote; if it does not, the job has finished and all files are copied back.

go(ddir='')

Go to the machine running the job in directory ddir, or in the current directory if ddir is empty. This will work even if the job has finished and been transferred back, as long as the remote directory has not been removed.

kill(inp='')

Kill all or a selected set of jobs. If inp is empty, it is assumed that the job in the current directory should be killed. Otherwise, inp can be a filename containing a list of directories to be killed. Note that a killed directory will not be retrieved with the “get” method, but “force_get” should still work.

loop_get_scatter(lpause=5, lsleep=300, hogmax=75)

This method will run boomerang in a continuous loop doing the following: scatter, pause, get, sleep. The inputs are seconds.

parse_script(script='', defaultcpu=400, defaultmem=13)

cpu is given in 100*ncores; memory is given in GB. Parse the script for information about run requirements. If no script argument is given, self.file['script'] is used. The script should include lines of the form #BOOMMEM=8 and #BOOMCPU=200 (spaces don’t matter).
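
For illustration, this is roughly what that parsing amounts to (a sketch only, not boomerang’s parse_script; the helper name and regular expressions here are made up):

    import re

    def read_boom_requirements(script_text, defaultcpu=400, defaultmem=13):
        # Pull #BOOMCPU / #BOOMMEM values out of a submit script, ignoring spaces.
        cpu, mem = defaultcpu, defaultmem
        for line in script_text.splitlines():
            compact = line.replace(' ', '')
            m = re.match(r'#BOOMCPU=(\d+)', compact)
            if m:
                cpu = int(m.group(1))
            m = re.match(r'#BOOMMEM=(\d+)', compact)
            if m:
                mem = int(m.group(1))
        return cpu, mem   # cpu in units of 100*ncores, mem in GB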

parse_status(stringorlist)

Method to read a status file. Info is loaded into memory. Input is either the file string or filestring.strip().splitlines(). The file structure is space-delimited; each line is: machname usedCPU freeCPU freeMEM totCPU user1 user1CPU user2 user2CPU … Returns the list of machines successfully parsed.
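
For illustration only, here is a minimal reader for one line of that layout, matching the [usedCPU, freeCPU, freeMEM, totCPU, userdic] node entry described under query_one below (not boomerang’s own parser; the example line and its numbers are hypothetical):

    def parse_status_line(line):
        # machname usedCPU freeCPU freeMEM totCPU user1 user1CPU user2 user2CPU ...
        fields = line.split()
        mach = fields[0]
        usedcpu, freecpu, freemem, totcpu = (float(x) for x in fields[1:5])
        users = dict(zip(fields[5::2], map(float, fields[6::2])))
        return mach, [usedcpu, freecpu, freemem, totcpu, users]

    # hypothetical line: parse_status_line("gramercy 400 400 45.0 800 alice 400")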

prep()

Create temporary run directory called tmp_runs on each remote machine in the cluster.

prepare_run(ddir, post_proc=None)

This method handles all tasks associated with creating the submission script.

print_status(allmachs=False, nomachs=False)

Prints information about usage of each node and each user. Argument can request all machines, even those not in self.machl

print_summary()

Print out a summary of the queued files, the running files, but not status of the cluster (use print_status() for that).

query(remotetimeout=10, onetimeout=1, minjp=60, minmem=15, quiet=False, quick=False)

Tries to fetch status remotely; if cannot, queries machines. If only gets some machines and quick==True, queries the rest.

query_all(timeout=4, minjp=60, minmem=15, quiet=False)

Find out how many jobs are running on each node of the cluster and who is running them.

query_one(mach, timeout=2, minjp=60, minmem=15, quiet=False)

Query one node. Returns either 1 (for error), or the self.nodes entry: [usedCPU,freeCPU,freeMEM,totCPU,userdic] where userdic={username:usedCPU} Also writes to self.nodes. Arguments: machine name (as in .ssh/config), timeout for connection, min cpu to count as a job, min memory to count as a job, and quiet mode

remove(inp='')

Removes a directory from queue, running, and finished files. If it is running on the remote computer, it IS NOT killed. If it is queued, running, or finished, the associated local files (queue_boomerang, running_boomerang, bsubmit.sh) are NOT deleted. Input can be a filename with a list of directories, or blank for the current directory

scatter(hogmin=25, hogmax=75)

Scatter the queued jobs out to all the available nodes on the cluster.

Hog specifies the amount to hog the cluster in percent: See find_resources for details

This won’t work right if your login username isn’t the username on the cluster. So you should hardcode self.username above, e.g. add the line

if self.username == ‘localname’: self.username = ‘remotename’

submit_run(ddir, dest)

This function handles all the tasks associated with submitting a job, including creating a finalized submission script, transferring files, and submitting the script. Please note that a final line is added to the submission script which removes the file running_boomerang_X_X. This is the only way that boomerang knows a job has finished.

write_local()

Writes status to local file (usually ~/.boomerang/status)

write_remote(timeout=10, ntries=3, overrideremote=False, selfunlock=True, lockwait=5)

Writes status to the remote location. Checks whether someone else has locked it. overrideremote = write to remote even though nothing was read from remote. selfunlock = write even if locked, provided it was locked by the same username.

boomerang.ssh1(mach, remotecommands, sshargs=[], timeout=30)

SSH into mach, with ssh arguments sshargs, and execute remotecommands. If the entire process takes longer than timeout, the process and its children are killed. Timeout here is the Python-governed timeout and has nothing to do with the SSH ConnectTimeout option.