tutorial:torque_administrator_and_operator_commands

Differences

This shows you the differences between two versions of the page.

tutorial:torque_administrator_and_operator_commands [2024/03/01 08:20]
mjm519
tutorial:torque_administrator_and_operator_commands [2024/04/10 14:08] (current)
mjm519 [Commands:]
Line 1: Line 1:
-====== Common Commands for managing PBS Torque useful for Managers and Operators ======
+====== Torque Common Commands useful for Managers and Operators ======
  
 +Information about command usage can be obtained using the links below or via the man or info commands.
 +Example:
 +<code>
 +info qdel
 +
 +man qdel
 +
 +</code>
  
 ===== Commands: =====
  
 == qmgr ==
 +[[http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qmgr.htm|Qmgr]] is the main application used to manage PBS Torque. It is used to create queues, modify queue settings, and print the configuration. Review the man page or the link for details about this command. Changes made with this command are permanent: they are written to the running configuration file.
 +
 +<code>
 +user@polyp1:~$ qmgr -c "print queue batch"
 +user@polyp1:~$ qmgr -c "p q batch"
 +#
 +# Create queues and set their attributes.
 +#
 +#
 +# Create and define queue batch
 +#
 +create queue batch
 +set queue batch queue_type = Execution
 +set queue batch resources_max.walltime = 01:00:00
 +set queue batch resources_default.mem = 2gb
 +set queue batch resources_default.ncpus = 1
 +set queue batch resources_default.nodes = 1
 +set queue batch resources_default.pmem = 2gb
 +set queue batch resources_default.vmem = 2gb
 +set queue batch resources_default.walltime = 01:00:00
 +set queue batch enabled = True
 +set queue batch started = True
 +</code>
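The directives printed by "print queue batch" are themselves valid qmgr input, so a similar queue can be scripted by generating the same kind of lines. The following is a minimal sketch under stated assumptions: the function name queue_directives, the queue name testq, and the walltime value are hypothetical, and the script only prints directives without changing the server.

```shell
#!/bin/sh
# Print qmgr directives for a simple execution queue.
# The directive forms mirror the "print queue batch" output shown earlier;
# the function name and the example arguments are hypothetical.
queue_directives() {
    queue="$1"
    walltime="$2"
    printf 'create queue %s\n' "$queue"
    printf 'set queue %s queue_type = Execution\n' "$queue"
    printf 'set queue %s resources_max.walltime = %s\n' "$queue" "$walltime"
    printf 'set queue %s enabled = True\n' "$queue"
    printf 'set queue %s started = True\n' "$queue"
}

# Review the generated directives before applying them.
queue_directives testq 00:30:00
```

After review, the output could be piped to qmgr, which reads directives from standard input (for example: queue_directives testq 00:30:00 | qmgr). This requires manager privileges on the server.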
 +
  
 == qstat ==
 +[[http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qstat.htm|qstat]] - Shows the status of jobs and queues in the queuing system.
 +
 +<code>
 +mjm519@polyp1:~$ qstat -q
 +
 +server: polyp1
 +
 +Queue            Memory CPU Time Walltime Node  Run Que Lm  State
 +---------------- ------ -------- -------- ----  --- --- --  -----
 +MOSEK              --      --       --      --    0   0 48   E R
 +AMPL               --      --       --      --    0    8   E R
 +long               --      --    72:00:00   --    0   0 30   E R
 +gpu                --      --       --      --    0    4   E R
 +verylong           --      --    240:00:  --    0   0 20   E R
 +medium             --      --    04:00:00   --    0   0 10   E R
 +coraverylong       --      --       --      --    0   0 --   E R
 +special            --      --    72:00:00   --    0   0 24   E R
 +batch              --      --    01:00:00   --    0   0 --   E R
 +short              --      --    02:00:00   --    0   0 --   E R
 +urgent             --      --       --      --    0   0 --   D S
 +background         --      --       --      --    0   0 --   E R
 +mediumlong         --      --    24:00:00   --    0   0 60   E R
 +                                               ----- -----
 +                                                       0
 +mjm519@polyp1:~$ qstat -Q
 +Queue              Max    Tot   Ena   Str   Que   Run   Hld   Wat   Trn   Ext T   Cpt
 +----------------   ---   ----    --    --   ---   ---   ---   ---   ---   --- -   ---
 +MOSEK               48      0   yes   yes                         0 E     0
 +AMPL                      0   yes   yes                         0 E     0
 +long                30      0   yes   yes                         0 E     0
 +gpu                  4      0   yes   yes                         0 E     0
 +verylong            20      0   yes   yes                         0 E     0
 +medium             100      0   yes   yes                         0 E     0
 +coraverylong              0    no    no                         0 E     0
 +special             24      0   yes   yes                         0 E     0
 +batch                0      0   yes   yes                         0 E     0
 +short                0      0   yes   yes                         0 E     0
 +urgent                    0    no    no                         0 E     0
 +background                0   yes   yes                         0 E     0
 +mediumlong          60      0   yes   yes                         0 E     0
 +</code>
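In the "qstat -q" listing, the State column holds two flags: E or D (enabled or disabled) and R or S (running or stopped). The sketch below is one possible way to pick out queues that are not in the normal "E R" state. The function name inactive_queues is hypothetical, and the filter assumes the state flags are the last two whitespace-separated fields, as in the listing above.

```shell
#!/bin/sh
# List queues from `qstat -q` output whose State is not "E R".
# Assumes the last two fields of each queue row are the state flags.
inactive_queues() {
    awk 'NF >= 3 && $(NF-1) ~ /^[ED]$/ && $NF ~ /^[RS]$/ &&
         !($(NF-1) == "E" && $NF == "R") { print $1 }'
}

# Sample rows copied from the listing above.
# On a live system: qstat -q | inactive_queues
inactive_queues <<'EOF'
Queue            Memory CPU Time Walltime Node  Run Que Lm  State
---------------- ------ -------- -------- ----  --- --- --  -----
batch              --      --    01:00:00   --    0   0 --   E R
urgent             --      --       --      --    0   0 --   D S
background         --      --       --      --    0   0 --   E R
EOF
```

Header, separator, and total lines are skipped because their last two fields are not state flags.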
  
 == qalter ==
 +[[http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qalter.htm|qalter]] - Alter the attributes of a queued (non-running) job.
 +
 +<code>
 +qalter <job number> <change to the queued job>
 +
 +The line below changes the node specification for job id 1398668. The original job submission did not specify a node; this command assigns one while keeping the same ppn (processors per node).
 +
 +qalter 1398668 -l nodes=polyp14:ppn=1
 +
 +</code>
  
 == qdel ==
 +[[http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qdel.htm|qdel]] - Delete a batch job
 +<code>
 +qdel <job id>
 +qdel 1186460
 +</code>
  
 == qhold ==
 +[[http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qhold.htm|qhold]]
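 +Place a non-running job in a held state so that it will not run. According to the qhold man page, if the job is running and checkpointing is supported, the job is checkpointed, its resources are released, and it is placed in the held state; if checkpointing is not supported, qhold only sets the hold attribute, which has no effect unless the job is rerun with qrerun.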
  
 == qmove ==
 +[[http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qmove.htm|qmove]]
 +Move a submitted job to another queue.
  
 == qrun ==
 +[[http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qrun.htm|qrun]]
 +Forces a non-running job to run. When I use this, I first look at the running jobs, then use the qalter command to specify the node I want the job to run on (based on available resources), and then run qrun.
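The qalter-then-qrun workflow described here could be wrapped in a small helper. This is only a sketch: the names run_cmd, pin_and_run, and DRY_RUN are hypothetical, the job id and node are taken from the qalter example earlier on this page, and the script defaults to printing the commands instead of executing them.

```shell
#!/bin/sh
# Pin a queued job to a node with qalter, then force it to run with qrun.
# DRY_RUN defaults to 1, so the commands are only printed (hypothetical switch).
DRY_RUN="${DRY_RUN:-1}"

run_cmd() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

pin_and_run() {
    job_id="$1"
    node="$2"
    ppn="$3"
    # Set the node specification on the queued job.
    run_cmd qalter -l "nodes=${node}:ppn=${ppn}" "$job_id"
    # Ask the server to run the job now.
    run_cmd qrun "$job_id"
}

pin_and_run 1398668 polyp14 1
```

Setting DRY_RUN=0 would execute the commands for real, which requires operator or manager privileges.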
  
 == qstart ==
 +[[https://www.mankier.com/8/qstart|qstart]] The qstart command directs the Torque server to start processing batch jobs. It can be applied to the entire server or to a single queue.
  
 == qstop ==
 +[[https://www.mankier.com/8/qstop|qstop]] The qstop command directs the Torque server to stop processing batch jobs. It can be used to stop the entire server or just a single queue.
 +
 +== qenable ==
 +[[https://www.mankier.com/8/qenable|qenable]] The qenable command directs the destination (server / queue) to accept jobs for processing.
 +
 +== qdisable ==
 +[[https://www.mankier.com/8/qdisable|qdisable]] The qdisable command directs the destination (server / queue) to stop accepting jobs for processing.
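The four commands above work in pairs: qenable and qdisable control whether a queue accepts new jobs, while qstart and qstop control whether queued jobs are processed. As a rough sketch, the helper below prints the command pair for closing or reopening one queue. The function name queue_maintenance and its close/open actions are hypothetical, and nothing is executed.

```shell
#!/bin/sh
# Print the commands that close or reopen a single queue.
# The function name and the close/open actions are hypothetical.
queue_maintenance() {
    action="$1"
    queue="$2"
    case "$action" in
        close)
            echo "qdisable $queue"   # stop accepting new jobs
            echo "qstop $queue"      # stop processing queued jobs
            ;;
        open)
            echo "qenable $queue"    # accept new jobs again
            echo "qstart $queue"     # resume processing queued jobs
            ;;
        *)
            echo "usage: queue_maintenance close|open <queue>" >&2
            return 1
            ;;
    esac
}

queue_maintenance close urgent
queue_maintenance open urgent
```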
 +
 +== pbsnodes ==
 +[[https://docs.adaptivecomputing.com/torque/4-1-3/Content/topics/commands/pbsnodes.htm|pbsnodes]]
 +This command can be used to list node states and to enable or disable nodes in the queuing system.
 +<code>
 +manager@polyp1:~$ pbsnodes -l
 +polyp5               down
 +polyp14              offline
 +polyp15              offline
 +polyp30              offline
 +
 +Take a node offline:
 +manager@polyp1:~$ pbsnodes -o polyp15
 +
 +Put a node back online:
 +manager@polyp1:~$ pbsnodes -c polyp15
 +
 +Current Normal Output from pbsnodes -l:
 +manager@polyp1:~$ pbsnodes -l
 +polyp5               down,offline
 +polyp30              offline
 +</code>
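The "pbsnodes -l" listing is easy to filter in a script. The sketch below extracts the names of nodes whose state includes "offline". The function name offline_nodes is hypothetical, and the filter assumes the two-column layout shown above (node name, then a comma-separated state list).

```shell
#!/bin/sh
# Print names of nodes whose state includes "offline" in `pbsnodes -l` output.
# Assumes two columns: node name, then a comma-separated list of states.
offline_nodes() {
    awk '$2 ~ /(^|,)offline(,|$)/ { print $1 }'
}

# Sample input copied from the first listing above.
# On a live system: pbsnodes -l | offline_nodes
offline_nodes <<'EOF'
polyp5               down
polyp14              offline
polyp15              offline
polyp30              offline
EOF
```

The resulting names could then be passed one at a time to "pbsnodes -c" to bring the nodes back online, as in the example above.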
 +
  
  
tutorial/torque_administrator_and_operator_commands.1709299259.txt.gz · Last modified: 2024/03/01 08:20 by mjm519