Code Invocation

  • Depending on the details of your installation, the PyHST script sits somewhere in a directory. If you have installed PyHST as a Debian package, PyHST is in the PATH and you can run it as discussed below. Otherwise you might need to prepend the PyHST script directory path to form the full path, as in the example below.

    The details of the command line are handled by PyHST.py. The following documentation has been generated automatically from the comments found in the code.
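
    For example, assuming PyHST2 was installed in a hypothetical directory /opt/pyhst2/bin (adapt the path to your installation), the two ways of invoking it mentioned above would look like:

    # Debian package installation: the script is already in the PATH
    PyHST2_XX input.par

    # manual installation: prepend the (hypothetical) script directory
    /opt/pyhst2/bin/PyHST2_XX input.par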

CODE INVOCATION: There are two distinct cases, which are dealt with differently

  • You are in an OAR environment (OAR is a versatile resource and task manager)

  • You don't have a resource allocation system, but you know which hosts you can use

The details for both cases are given below. An important point to retain is that the resource granularity is one CPU with its possibly associated GPU.

  • Things are extremely easy within an OAR or SLURM environment:
    • PyHST2_XX input.par

      where input.par is the input file and XX is the version number

    • Resource allocation

      • OAR

        • provided that your OAR request asked for an integer number of CPUs, for example like this for an interactive job:

          oarsub -p "gpu='YES'" -l "nodes=1,walltime=2:20:00" -I

          or using more elaborate, possibly non-interactive, requests as thoroughly explained here: http://wikiserv.esrf.fr/software/index.php/Main_Page. In any case always request an integer number of CPUs (a CPU has several cores).
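
          As a rough sketch of a non-interactive submission (the job script name my_pyhst_job.sh is only an example), oarsub also accepts a command or script to run in batch mode:

          # submit a hypothetical batch script instead of opening an interactive shell
          oarsub -p "gpu='YES'" -l "nodes=1,walltime=2:20:00" ./my_pyhst_job.sh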

      • SLURM

        • INTERACTIVE: with the current configuration at ESRF this request takes a whole GPU node; because all nodes have 2 GPUs, we ask for 2 processes per node. In fact PyHST2 is organised in such a way that each (multithreaded) process uses one GPU.

          salloc  -p gpu --exclusive --nodes=1 --tasks-per-node=2 --time=200 --gres=gpu:2  srun --pty bash
          

          Then, once you are in the command shell, give the simple command PyHST2_XX input.par. Two processes will be run using the allocated GPUs. If you do not take the whole node, remember to ask for enough CPUs: you can add the SLURM directive

          --cpus-per-task=N
          

          with N greater than one. A task is a process; if you use GPUs, each allocated GPU will match a process.
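
          As a rough sketch of such a partial-node request (partition name, core count and time limit are only examples to adapt to your site):

          # hypothetical partial-node allocation: one task, 8 CPU cores, one GPU
          salloc -p gpu --nodes=1 --tasks-per-node=1 --cpus-per-task=8 --gres=gpu:1 --time=200 srun --pty bash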

        • BATCH: to obtain the same result as in the interactive example you can run sbatch myscript, where myscript is the following script:

          #!/bin/bash
          #SBATCH --nodes=1
          #SBATCH --exclusive
          #SBATCH --tasks-per-node=2
          #SBATCH --time=200
          #SBATCH --gres=gpu:2  
          #SBATCH -p gpu
          PyHST2_XXXX input.par
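
          Assuming the script above is saved as myscript, a typical submit-and-check sequence (the squeue call is standard SLURM, shown only as an example) would be:

          sbatch myscript
          squeue -u $USER    # verify that the job is queued or running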
          
  • Things are no more difficult when you have to manually specify the resources you want. If neither OAR nor SLURM exists on your machine, this is the way to go. Alternatively, if SLURM or OAR is installed, you can manually disable the OAR/SLURM features of PyHST2 by setting

    export PYHSTNOSCHEDULER=YES
    

    Then you can run PyHST:

    • PyHST2_XX input.par gpuA,1,0,gpuB,1,0

      in this example gpuA and gpuB are the two hosts on which you are running PyHST. Each host has two CPUs, the first of which will be associated with GPU number 1 and the second with GPU number 0.

    • PyHST2_XX input.par hostname_with_CPUonly,-1

      In this example the -1 flag indicates that no GPUs have to be used

    • You must have created beforehand a file named machinefile, the content of which will be, for the first example:

      gpuA slot=1
      gpuA slot=1
      gpuB slot=1
      gpuB slot=1
      

      in other words, the machinefile contains one instance of the hostname for each CPU. One process will be spawned per CPU.
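
      A minimal sketch for generating such a machinefile with a shell loop (the host names and the two-CPUs-per-host assumption are only an example) could be:

      # write one "hostname slot=1" line per CPU of each host
      for host in gpuA gpuB; do
          echo "$host slot=1"    # first CPU of $host
          echo "$host slot=1"    # second CPU of $host
      done > machinefile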

      • NOTICE for parallelism: on some clusters things may get complicated, for example when Docker is present, because it creates virtual network interfaces that must be avoided. To avoid an interface named docker0, use this variable in the input file (IF as InterFace):

        IF_EXCLUDE = docker0
        

        if instead you know the name of a reliable interface, for example eth1, use this:

        IF_INCLUDE = eth1