Doing situps for MO'AB(s)
Job Wrangling
Details on a job
% checkjob <jobid>
Or for details on "Why isn't my job running on node X":
% checkjob -v <jobid>
You can also do:
% mdiag -j <jobid>
Node Wrangling
Detailed node status:
% mdiag -n <nodeid>
Even more detailed node status:
% mdiag -n -v <nodeid>
Why is a node in the "Busy w/No Job" state?
See Soft Errors for how to diagnose this condition and correct it.
Node Draining and "Undraining"
Telling a node to drain:
% mnodectl compute-3-2.local -m state=Draining
state on node compute-3-2.local updated
Return it to the pool:
% mnodectl compute-3-2.local -m state=Idle
state on node compute-3-2.local updated
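When several nodes need the same state change, the command above can be wrapped in a small loop. The following is only a sketch: the function name set_node_state and the second node name are made up for illustration, and the function prints each mnodectl command rather than running it, so it can be tried safely on a machine without Moab.

```shell
# Dry-run helper (hypothetical name): print one mnodectl command per node.
# Remove the "echo" to actually run the commands on a Moab host.
set_node_state() {
    state=$1
    shift
    for node in "$@"; do
        echo mnodectl "$node" -m "state=$state"
    done
}

# Example node names; compute-3-3.local is an assumption for illustration.
set_node_state Draining compute-3-2.local compute-3-3.local
```

Calling it with Idle instead of Draining prints the commands that would return the same nodes to the pool.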
Setting/Changing/Removing Reservations
Make a reservation for the "mops" queue on nodes compute-6-[1-4], starting at 1pm on Sep 8, 2008 and lasting 24 hours. The host expression is a pattern, so the bare term "compute-6-1" also matches "compute-6-10", "compute-6-11", etc. For an exact match, append ".local" so that only a single digit of the node number can match, and prefix "^" so that the hostname must start with "compute...":
% mrsvctl -c -a CLASS=mops -s 13:00:00_09/08/08 -d 24:00:00 -h '^compute-6-[1-4].local'
NOTE: reservation mops.1278 created
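The effect of the anchored pattern can be checked offline before creating a reservation. The sketch below assumes that Moab's host expression behaves like an extended regular expression as implemented by grep -E, which is an approximation and not something this page confirms. The node list and the function name match_nodes are made up for illustration.

```shell
# Hypothetical node names, used only to illustrate the pattern.
nodes='compute-6-1.local
compute-6-4.local
compute-6-5.local
compute-6-10.local
compute-6-11.local'

# Print the node names that match the given pattern (grep -E semantics).
match_nodes() {
    printf '%s\n' "$nodes" | grep -E "$1"
}

# Anchored pattern from the reservation above: only nodes 1 and 4 match here.
match_nodes '^compute-6-[1-4].local'

# Bare prefix: also picks up compute-6-10 and compute-6-11.
match_nodes 'compute-6-1'
```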
Make a reservation for the "mops" queue on 16 nodes selected within compute-6-*:
% mrsvctl -c -a CLASS=mops -s 8:00:00_09/15/08 -d 3:00:00:00 -h 'compute-6-*' -t 16
NOTE: reservation mops.1278 created
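The two -d values seen so far, 24:00:00 and 3:00:00:00, differ in the number of fields. Assuming the duration format is [[[DD:]HH:]MM:]SS (an assumption not spelled out on this page), the second one means three days. A small helper, with the hypothetical name duration_to_seconds, can make such values easier to sanity-check:

```shell
# Convert a duration written as [[[DD:]HH:]MM:]SS into seconds.
# The field layout is an assumption; check it against your Moab version.
duration_to_seconds() {
    echo "$1" | awk -F: '{
        split("1 60 3600 86400", mult, " ")   # seconds, minutes, hours, days
        total = 0
        # Walk the fields from right to left, so the last field is seconds.
        for (i = NF; i >= 1; i--) total += $i * mult[NF - i + 1]
        print total
    }'
}

duration_to_seconds 24:00:00     # one day
duration_to_seconds 3:00:00:00   # three days
```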
Create a reservation for system downtime:
% mrsvctl -c -s 8:00:00_09/22/08 -d 9:00:00 -h ALL
NOTE: reservation system.1279 created
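The -s values above follow the pattern HH:MM:SS_MM/DD/YY. As a convenience, a start time in that form can be generated rather than typed by hand. This sketch assumes GNU date (for the -d option); the function name moab_start_time is made up. It zero-pads the hour (08 rather than 8), which is assumed to be accepted as well.

```shell
# Format a timestamp in the HH:MM:SS_MM/DD/YY form used with "mrsvctl -s".
# Requires GNU date; the function name is hypothetical.
moab_start_time() {
    date -d "$1" '+%H:%M:%S_%m/%d/%y'
}

moab_start_time '2008-09-08 13:00:00'   # prints 13:00:00_09/08/08
```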
Release an existing reservation:
% mrsvctl -r mops.1283
reservation mops.1283 successfully released
Modify the start time of a reservation (here, one hour later). The duration stays the same, so the entire reservation time slot moves. Note the use of "--flags=force":
% mrsvctl -m starttime+=1:00:00 --flags=force mops.2814
successfully changed starttime for rsv mops.2814
Modify duration of reservation:
% mrsvctl -m duration=3:00:00 --flags=force mops.2814
successfully changed duration for rsv mops.2814
Examining current reservations:
% showres
Restarting MOAB
Tell MOAB to re-read config file:
% mschedctl -R
Force MOAB to completely rebuild its state (stop the daemon, delete its checkpoint files, and start it again):
% /etc/rc.d/init.d/moab stop; rm /opt/moab/.moab.ck*; /etc/rc.d/init.d/moab start