FAQs

Q: How do I run a job on the ANITI compute cluster?

A: SSH into the machine "isis.aniti.fr" (port 22) and submit a batch script with the "sbatch" command. See the Slurm documentation for details.
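
As an illustration, here is a minimal sketch of a batch script. The file name "hello.sh", the job name and the resource values are hypothetical, and any partition or account options this cluster may require are not shown. The fallback value "local" is only there so the script can be dry-run outside Slurm.

```shell
#!/bin/bash
#SBATCH --job-name=hello          # hypothetical job name
#SBATCH --ntasks=1                # one Task
#SBATCH --cpus-per-task=1         # one CPU for that Task
#SBATCH --time=00:05:00           # wall-clock limit (hypothetical value)
#SBATCH --output=hello_%j.out     # %j is replaced by the Job ID

# SLURM_JOB_ID is set by Slurm; "local" is a fallback for a dry run outside Slurm
job_id="${SLURM_JOB_ID:-local}"
node="$(uname -n)"
echo "Job ${job_id} running on ${node}"
```

Assuming the script is saved as "hello.sh", it would typically be submitted from the login machine with "sbatch hello.sh" and followed with "squeue" or "sacct".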

Q: My Job runs correctly and my data is processed correctly. Why does it appear as "Failed" in the end-of-job email and in the output of the sacct command?

A: One possible reason is:

The success or failure of any shell command (standard command, script, function, program...) is determined by its return code (exit code): zero (0) indicates success and any non-zero value indicates failure. In a shell script (such as a Slurm batch script), the script's exit code is the exit code of the last command it executed.

For example, if the end of a script is:

# Prints a message only if /some/file exists

[[ -f '/some/file' ]] && echo "=> Do something"

  • If the file exists, the last command is "echo", which returns 0/Success.

  • If it does not exist, the last command executed is the "-f" file test, which returns 1/Failed. Since no other command is evaluated after it, the whole script returns 1/Failed.

Note that an equivalent "if ...; then ...; fi" block does not have this problem: in Bash, an "if" statement whose condition is false and that has no "else" branch returns 0.

Solution: Add "exit 0" explicitly at the end of the script in cases where the script should "pass" even if the last test or command fails.
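
The effect of "exit 0" can be checked on any machine with Bash. The sketch below is illustrative only: the path is hypothetical and assumed not to exist, and each "bash -c" call stands in for a whole batch script that ends with a short-circuit test.

```shell
#!/bin/bash
# Hypothetical path, assumed not to exist on this machine
missing='/some/file/that/does/not/exist'

# Stand-in for a script that ends with a failing test:
# its exit code is that of the test, i.e. non-zero.
status_without_fix=0
bash -c "[[ -f '$missing' ]] && echo '=> Do something'" || status_without_fix=$?

# Same ending followed by an explicit "exit 0": the exit code is now 0.
status_with_fix=0
bash -c "[[ -f '$missing' ]] && echo '=> Do something'; exit 0" || status_with_fix=$?

echo "without exit 0: ${status_without_fix}"
echo "with exit 0:    ${status_with_fix}"
```

Run as is, this reports 1 for the first case and 0 for the second. Slurm would mark the first script as "Failed" and the second as completed.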

Q: I need to run many similar jobs. How do I do that?

A: The easiest way is to create a Job Array.
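
A minimal sketch of a Job Array batch script follows. The "--array" option and the SLURM_ARRAY_TASK_ID variable are standard Slurm. The job name, the array range and the input file naming scheme are hypothetical, and the fallback to 0 is only there so the script can be dry-run outside Slurm.

```shell
#!/bin/bash
#SBATCH --job-name=array-demo            # hypothetical job name
#SBATCH --array=0-9                      # 10 array elements, indices 0 to 9
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --output=array-demo_%A_%a.out    # %A = array Job ID, %a = array index

# Slurm sets SLURM_ARRAY_TASK_ID to a different index for each array element;
# default to 0 for a dry run outside Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-0}"

# Hypothetical naming scheme: one input file per array element
input_file="input_${task_id}.dat"
echo "Array task ${task_id} would process ${input_file}"
```

A single "sbatch" call on this script queues all ten elements. Each element runs the same script and uses its own index to select its input.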

Q: I requested N Tasks of C CPUs in my batch but my Job never uses more than C CPUs at the same time!

A: Have you created Job Steps? If more than one Task is requested but no Step is explicitly declared with "srun", the batch script itself runs as a single Task, so the Job WILL NOT USE MORE THAN 1 Task (C CPUs)!

Q: I am trying to run Job Steps in parallel but they run one after the other!

A: To run Job Steps in parallel, run each "srun" command in the background by adding '&' at the end of the line.

Q: I am trying to run Job Steps in parallel but they are not executed: the Batch stops immediately!

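A: Once the Job Steps have been launched in the background, it is essential to call the "wait" command so that the parent process (the Job) waits for its child processes (the Steps) to finish. Otherwise the batch script reaches its end and the Job terminates, taking its Steps with it.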
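
The last three answers can be combined into one batch script. This is a sketch under stated assumptions, not a reference implementation. The job name, the values of N and C, and the "sleep 1" workload are hypothetical. The "run_step" helper is also hypothetical: it calls "srun" when available and otherwise runs the command directly, so that the script can be dry-run on a machine without Slurm. Depending on the Slurm version, additional "srun" options may be needed to keep Steps from sharing CPUs; check the "srun" manual page on the cluster.

```shell
#!/bin/bash
#SBATCH --job-name=steps-demo     # hypothetical job name
#SBATCH --ntasks=4                # N Tasks (hypothetical value)
#SBATCH --cpus-per-task=2         # C CPUs per Task (hypothetical value)
#SBATCH --output=steps-demo_%j.out

# Hypothetical helper: one Job Step of 1 Task via srun inside Slurm,
# plain execution otherwise (dry run without Slurm).
run_step() {
    if command -v srun >/dev/null 2>&1; then
        srun --ntasks=1 --cpus-per-task="${SLURM_CPUS_PER_TASK:-1}" "$@"
    else
        "$@"
    fi
}

ntasks="${SLURM_NTASKS:-4}"
start="$(date +%s)"

# Launch one Step per Task; the trailing '&' sends each Step to the
# background so the loop does not wait for it before starting the next one.
launched=0
i=1
while [ "$i" -le "$ntasks" ]; do
    run_step sh -c "sleep 1; echo step $i done" &
    launched=$((launched + 1))
    i=$((i + 1))
done

# Block until every background Step has finished. Without this line the
# script would end here and the Job would stop before the Steps complete.
wait

elapsed=$(( $(date +%s) - start ))
echo "${launched} steps finished in about ${elapsed}s"
```

With the Steps running in parallel, the elapsed time stays close to that of a single Step instead of growing with the number of Steps.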

Q: In my "output" file, Slurm gives me the Warning: "srun: Warning: can't run 2 processes on 4 nodes, setting nnodes to 2."

A: When running a Job Step with "srun", the number of tasks "-n" (ntasks) must be greater than or equal to the number of nodes "-N" (nodes). Otherwise Slurm displays this warning and reduces -N to -n.
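
To avoid the warning, the node count can be checked before the "srun" line is built. The sketch below only mirrors the adjustment described above, using the values from the warning message. The variable names and the "hostname" workload are hypothetical, and the command is printed rather than executed.

```shell
#!/bin/bash
# Values taken from the warning above: 2 tasks requested on 4 nodes
ntasks=2
nodes=4

# A Step cannot spread over more nodes than it has tasks,
# so cap the node count at the task count.
if [ "$nodes" -gt "$ntasks" ]; then
    echo "nodes ($nodes) > ntasks ($ntasks): using $ntasks nodes" >&2
    nodes=$ntasks
fi

# Build the Step command (printed here, not executed)
step_cmd="srun -N $nodes -n $ntasks hostname"
echo "$step_cmd"
```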