| Command | What it does |
| qstat | Show ALL jobs: yours and everyone else's. |
| qstat -n | Show all jobs and, for running jobs, the nodes used. |
| qstat -u <netid> | Show all jobs, including the number of nodes used, for the user identified by netid. |
| qstat -s <job_id> | Show whether the job identified by job_id is running and, if it is not, the reason. |
| qstat -f <job_id> | Show detailed information about the job identified by job_id. |
Example 1:
qstat
Explanation
Show the status of ALL jobs.
Sample output:
Job id Name User Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
20836.argo-new matgen3 jsmith 952:2 R batch
21026.argo-new sicl22_1.p jsmith 185:4 R batch
24779.argo-new sic123_1.p jsmith 0 Q batch
2627.argo-new runtest homa1 167:50:5 R batch
2628.argo-new runtest homa1 167:48:4 R batch
Four of the five jobs are running: those with an R in the second-to-last column
(the (S)tatus column). The remaining job, on line three of the
output, is queued, as indicated by the letter Q in the status column.
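The status column can also be read programmatically. The following is a minimal Python sketch, not part of qstat itself, that counts jobs per status letter in the sample output above. The names `SAMPLE` and `count_by_status` are illustrative, and the code assumes the status is always the second-to-last whitespace-separated column, as it is in the sample.

```python
from collections import Counter

# Sample output copied from Example 1 (illustrative input, not live data).
SAMPLE = """\
Job id Name User Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
20836.argo-new matgen3 jsmith 952:2 R batch
21026.argo-new sicl22_1.p jsmith 185:4 R batch
24779.argo-new sic123_1.p jsmith 0 Q batch
2627.argo-new runtest homa1 167:50:5 R batch
2628.argo-new runtest homa1 167:48:4 R batch
"""


def count_by_status(qstat_text):
    """Count jobs per status letter (R, Q, ...) in default qstat output."""
    counts = Counter()
    for line in qstat_text.splitlines():
        fields = line.split()
        # Job rows start with a numeric job id; this skips the header row,
        # the dashed separator, and blank lines.
        if len(fields) < 6 or not fields[0][0].isdigit():
            continue
        # Assumption: status is the second-to-last column.
        counts[fields[-2]] += 1
    return counts


print(count_by_status(SAMPLE))
```

To run it against live data, the text could be captured with `subprocess.run(["qstat"], capture_output=True, text=True).stdout` and passed to the same function.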
Example 2:
qstat -n
Explanation
Show the status of ALL jobs and, for the running ones, the nodes used.
Sample output:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
20836.argo-new jsmith batch matgen3 5702 1 -- -- 336:0 R 952:2
argo2-3/0
21026.argo-new jsmith batch sicl22_1.p -- 4 -- -- 336:0 R 185:4
argo8-1/0+argo8-2/0+argo8-3/0+argo8-4/0
24779.argo-new jsmith batch sic123_1.p 0 Q
2627.argo-new.c homa1 batch runtest 24749 1 -- -- 336:0 R 167:5
argo15-1/0
2628.argo-new.c homa1 batch runtest 19751 1 -- -- 336:0 R 167:5
argo15-2/0
Four of the five jobs are running: those with an R in the
(S)tatus column. The nodes used by each running job are shown on the line that follows it. The other job, 24779, is queued and
consequently has no nodes assigned (no second line).
The following table briefly discusses each of the eleven fields in the qstat -n output for a
running job:
| Field | What it means |
| 20836.argo-new | Job id. Disregard the argo-new suffix. |
| jsmith | Owner of the job. |
| batch | Queue. |
| matgen3 | Name of the job. |
| 5702 | Session id. Some jobs will not have a session id. |
| 1 | Number of nodes used. The particular nodes are identified in the following line. |
| -- | Disregard. |
| -- | Disregard. |
| 336:0 | Requested time (the Req'd Time column). Disregard. |
| R | Job status. R: running. Q: queued. E: exiting after run. |
| 952:2 | Elapsed time. |
The second line in the example contains two fields:
| Field | What it means |
| argo2-3 | The name of the node used. The number of node names matches the entry in field six of line one. Multiple node names are separated by plus signs. |
| 0 | Identifier of the job on the node. The first job is given identifier zero. If a second job begins execution while the first job is still running, the second job is given the identifier one. |
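The node line format described above (node names joined by plus signs, each followed by a slash and an identifier) can be split mechanically. This is a minimal Python sketch under that assumption; `parse_node_line` is a hypothetical helper name, not a qstat feature.

```python
def parse_node_line(line):
    """Split a node line such as 'argo8-1/0+argo8-2/0' into (node, identifier) pairs."""
    pairs = []
    for entry in line.strip().split("+"):
        # Each entry is assumed to look like '<node name>/<identifier>'.
        node, _, identifier = entry.partition("/")
        pairs.append((node, int(identifier)))
    return pairs


nodes = parse_node_line("argo8-1/0+argo8-2/0+argo8-3/0+argo8-4/0")
print(nodes)
# The number of pairs should match the NDS field on the job's first line.
print(len(nodes))
```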
Example 3:
qstat -u <netid>
For example: qstat -u jsmith
Explanation
Show all jobs for user jsmith. Replace jsmith with your netid.
Sample output:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
20836.argo-new. jsmith batch matgen3 -- 1 -- -- 336:0 R 149:1
21026.argo-new jsmith batch sick122_1.p -- 4 -- -- 336:0 R 63:35
24779.argo-new jsmith batch sick123_1.p 0 Q
Example 4:
qstat -s <job_id>
For example: qstat -s 20836
Explanation
Show whether job 20836 is running. If it is not, the reason it is not running
is given.
If the job status (the S column) is the letter R (for running), the output
includes, among other information, when the job was started.
Sample output:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
------ -------- ----- ------- ----- --- --- ------ ----- - -----
20836.argo-new jsmith batch matgen3 5702 1 -- -- 336:0 R 952:2
Job started on Thu Mar 31 at 23:26
If the job status field has the letter Q (for queued), the reason is
displayed.
Sample output:
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
20836.argo-new jsmith batch sick123_p.1 -- 4 -- -- 336:0 Q --
Not Running: Not enough of the right type of nodes are available
The job is requesting four nodes; one or more of them is currently not available. Run the
qnodes command to see why a node is not available:
qnodes | more
Sample output:
Node Total Free Physical 5 Min
Processes np By Zone Memory KB Memory KB Memory KB LoadAvg Status
--------- -- ------- --------- --------- --------- ------- ------------------
1 4 argo2-1 8262023 8048047 4067720 0.00 Accepting Jobs
1 4 argo2-2 8262023 8052751 4067720 0.00 Accepting Jobs
2 2 argo2-3 8262023 8050963 4067720 0.00 Queueing New Jobs
1 4 argo2-4 8262023 7532567 4067720 1.99 Accepting Jobs
1 4 argo3-1 8262023 7874687 4067720 0.99 Accepting Jobs
...
Any new job submission requesting argo2-3 is queued in a wait state
because the number of jobs currently running on the node, the value in
the Processes field (in this case, two), is equal to the maximum number
of jobs permitted to run on the node, the value in the np field.
Node
Processes np By Zone Status
--------- -- ------- ---------
2 2 argo2-3 Queueing New Jobs
Until one of the running jobs on argo2-3 completes, job 20836 will
not execute. If more than one node is requested, all must be available
for the job to execute; one unavailable node prevents the job from starting.
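The rule above (a node stops accepting jobs once its Processes value reaches np) can be checked against qnodes output with a short script. The following Python is a sketch only. It assumes the column order shown in the sample output (Processes, np, node name, three memory columns, load average, status), and the names `QNODES_ROWS`, `parse_qnodes_row` and `full_nodes` are illustrative.

```python
# Data rows copied from the qnodes sample output (illustrative input).
QNODES_ROWS = """\
1 4 argo2-1 8262023 8048047 4067720 0.00 Accepting Jobs
1 4 argo2-2 8262023 8052751 4067720 0.00 Accepting Jobs
2 2 argo2-3 8262023 8050963 4067720 0.00 Queueing New Jobs
1 4 argo2-4 8262023 7532567 4067720 1.99 Accepting Jobs
1 4 argo3-1 8262023 7874687 4067720 0.99 Accepting Jobs
"""


def parse_qnodes_row(line):
    """Parse one data row; the column order is assumed from the sample."""
    # Split into at most 8 parts so the status text keeps its spaces.
    fields = line.strip().split(None, 7)
    return {
        "processes": int(fields[0]),
        "np": int(fields[1]),
        "node": fields[2],
        "status": fields[7],
    }


def full_nodes(qnodes_text):
    """Return the names of nodes whose running-job count has reached np."""
    full = []
    for line in qnodes_text.splitlines():
        stripped = line.strip()
        # Data rows start with a digit; skip headers, separators, blanks.
        if not stripped[:1].isdigit():
            continue
        row = parse_qnodes_row(stripped)
        if row["processes"] >= row["np"]:
            full.append(row["node"])
    return full


print(full_nodes(QNODES_ROWS))
```

A job that requests any node in the returned list would wait until a job on that node finishes.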
Example 5:
qstat -f <job_id>
For example: qstat -f 20836
Explanation
Show detailed information about a job.
Sample output:
Job Id: 2210.argo-new.cc.uic.edu
Job_Name = test.pbs
Job_Owner = jsmith@argo-new.cc.uic.edu
resources_used.cput = 32:37:20
resources_used.mem = 408200kb
resources_used.vmem = 648680kb
resources_used.walltime = 19:36:24
job_state = R
queue = batch
server = argo-new.cc.uic.edu
Checkpoint = u
ctime = Thu Jan 19 14:22:12 2006
Error_Path = argo-new.cc.uic.edu:/home/homes52/jsmith/test.pbs.e2210
exec_host = argo10-4/0+argo10-1/0+argo10-2/0+argo10-3/0
Hold_Types = n
Join_Path = n
Keep_Files = n
Mail_Points = a
mtime = Thu Jan 19 14:22:16 2006
Output_Path = argo-new.cc.uic.edu:/home/homes52/jsmith/test.pbs.o2210
Priority = 0
qtime = Thu Jan 19 14:22:12 2006
Rerunable = True
Resource_List.neednodes = argo10-4+argo10-1+argo10-2+argo10-3
Resource_List.nodect = 4
Resource_List.nodes = argo10-4+argo10-1+argo10-2+argo10-3
Resource_List.walltime = 336:00:00
substate = 42
Variable_List = PBS_O_HOME=/home/homes50/jsmith,PBS_O_LANG=en_US.UTF-8,
PBS_O_LOGNAME=jsmith,
PBS_O_PATH=/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/usr/common/pgi
/linux86/6.0/bin:/usr/common/intel/cc/9.0/bin:/usr/common/intel/fc/9.0/
bin:/usr/common/intel/idb/9.0/bin:/usr/common/g03/bsd:/usr/common/g03/l
ocal:/usr/common/g03/extras:/usr/common/g03:/usr/common/g03/bsd:/usr/co
mmon/g03/local:/usr/common/g03/extras:/usr/common/g03,
PBS_O_MAIL=/var/mail/jsmith,PBS_O_SHELL=/bin/csh,
PBS_O_HOST=argo-new.cc.uic.edu,PBS_O_WORKDIR=/home/homes52/jsmith,
USER=jsmith,LOGNAME=jsmith,HOME=/home/homes52/jsmith,
PATH=/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/usr/common/pgi/linux
86/6.0/bin:/usr/common/intel/cc/9.0/bin:/usr/common/intel/fc/9.0/bin:/u
sr/common/intel/idb/9.0/bin:/usr/common/g03/bsd:/usr/common/g03/local:/
usr/common/g03/extras:/usr/common/g03:/usr/common/g03/bsd:/usr/common/g
03/local:/usr/common/g03/extras:/usr/common/g03,MAIL=/var/mail/jsmith,
SHELL=/bin/csh,SSH_CLIENT=128.248.155.80 55913 22,
SSH_CONNECTION=128.248.155.80 55913 128.248.121.63 22,
SSH_TTY=/dev/pts/13,TERM=xterm,HOSTTYPE=i386-linux,VENDOR=intel,
OSTYPE=linux,MACHTYPE=i386,SHLVL=2,PWD=/home/homes52/jsmith,
GROUP=student,HOST=argo-new,REMOTEHOST=icarus.cc.uic.edu,
LS_COLORS=no=00:fi=00:di=00;34:ln=00;36:pi=40;33:so=00;35:bd=40;33;01:
cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=00;32:*.cmd=00;32:*.exe=00
;32:*.com=00;32:*.btm=00;32:*.bat=00;32:*.sh=00;32:*.csh=00;32:*.tar=00
;31:*.tgz=00;31:*.arj=00;31:*.taz=00;31:*.lzh=00;31:*.zip=00;31:*.z=00;
31:*.Z=00;31:*.gz=00;31:*.bz2=00;31:*.bz=00;31:*.tz=00;31:*.rpm=00;31:*
.cpio=00;31:*.jpg=00;35:*.gif=00;35:*.bmp=00;35:*.xbm=00;35:*.xpm=00;35
:*.png=00;35:*.tif=00;35:,G_BROKEN_FILENAMES=1,
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass,LANG=en_US.UTF-8,
SUPPORTED=en_US.UTF-8:en_US:en,LESSOPEN=|/usr/bin/lesspipe.sh %s,
QTDIR=/usr/lib/qt-3.1,
LD_LIBRARY_PATH=/usr/common/g03/bsd:/usr/common/g03/local:/usr/common/
g03/extras:/usr/common/g03:/usr/common/gv/lib:/usr/common/g03/bsd:/usr/
common/g03/local:/usr/common/g03/extras:/usr/common/g03:/usr/common/gv/
lib:/lib:/usr/lib:/usr/common/pgi/linux86/6.0/lib:/usr/common/pgi/linux
86/6.0/liblf:/usr/local/lib:/usr/common/intel/cc/9.0/lib:/usr/common/in
tel/fc/9.0/lib:/usr/common/g03:/usr/common/g03,PGI=/usr/common/pgi,
LM_LICENSE_FILE=/usr/common/pgi/license.dat,INTEL=/usr/common/intel,
MANPATH=/usr/common/g03/bsd:/usr/common/g03/bsd:/usr/share/man:/usr/lo
cal/man:/usr/common/pgi/linux86/6.0/man:/usr/common/MPICH/man:/usr/comm
on/intel/fc/9.0/man:/usr/common/intel/cc/9.0/man:/usr/common/intel/idb/
9.0/man:/usr/common/torque/doc/man,HOSTNAME=argo-new,
g03root=/usr/common,GAUSS_SCRDIR=/tmp,
GAUSS_EXEDIR=/usr/common/g03/bsd:/usr/common/g03/local:/usr/common/g03
/extras:/usr/common/g03,GAUSS_ARCHDIR=/usr/common/g03/arch,
GV_DIR=/usr/common/gv,PGIDIR=/usr/common/pgi/linux86/5.1,
PERLLIB=/usr/common/g03/bsd:/usr/common/g03/bsd,_DSM_BARRIER=SHM,
_RLD_ARGS=-log /dev/null,GDVBASIS=/usr/common/g03/basis,
MAKEFLAGS= CSIZE=1048576 CSIZEW=128 MACHTY=p6 VECTOR4=\,
prefetch PGISSELIB=,LINDA_FORTRAN=pgf77,POSTFL_FORTRAN=pgf77,
LINDA_FORTRAN_LINK=pgf77,
TRAP_FPE=OVERFL=ABORT;DIVZERO=ABORT;INT_OVERFL=ABORT,
MP_STACK_OVERFLOW=OFF,KMP_STACKSIZE=10485760,KMP_DUPLICATE_LIB_OK=TRUE,
X1_LOCAL_HEAP_SIZE=0xbff7000000,PBS_O_QUEUE=batch
euser = jsmith
egroup = student
hashname = 2210.argo-n
queue_rank = 1866
queue_type = E
comment = Job started on Thu Jan 19 at 14:22
etime = Thu Jan 19 14:22:12 2006
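Because the qstat -f output is long, it can help to load it into a dictionary and look up single attributes. This is a minimal Python sketch, assuming that every attribute starts on a line of the form `name = value` and that any other line (such as the wrapped Variable_List lines) continues the previous value. `parse_qstat_f` and `EXCERPT` are hypothetical names, and the excerpt is an abbreviated copy of the sample above.

```python
import re

# An attribute line looks like 'name = value'; the name may contain dots.
ATTR = re.compile(r"^\s*([A-Za-z_][\w.]*) = (.*)$")

# Abbreviated excerpt of the sample output above (illustrative input).
EXCERPT = """\
Job Id: 2210.argo-new.cc.uic.edu
Job_Name = test.pbs
job_state = R
exec_host = argo10-4/0+argo10-1/0+argo10-2/0+argo10-3/0
Resource_List.walltime = 336:00:00
Variable_List = PBS_O_HOME=/home/homes50/jsmith,PBS_O_LANG=en_US.UTF-8,
PBS_O_QUEUE=batch
euser = jsmith
"""


def parse_qstat_f(text):
    """Return a dict of attribute name to value from qstat -f output."""
    attrs = {}
    last = None
    for line in text.splitlines():
        match = ATTR.match(line)
        if match:
            last = match.group(1)
            attrs[last] = match.group(2).strip()
        elif last is not None and line.strip():
            # Continuation of a wrapped value: append without the line break.
            attrs[last] += line.strip()
        # Lines before the first attribute (the 'Job Id:' line) are ignored.
    return attrs


job = parse_qstat_f(EXCERPT)
print(job["job_state"])
print(job["exec_host"].split("+"))
```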