Environmental Monitoring and Emergency Shutdown

The system athena9 monitors the current environmental state and initiates a controlled emergency shutdown when conditions exceed preset limits.

Emergency Shutdown Triggers

Currently, any of the following events triggers a controlled shutdown:

  • UPS runs on battery for more than 60 seconds, other than during a self-test
  • Any of the six rack-front temp sensors exceeds 35°C (95°F)
  • Any of the six rack-rear temp sensors exceeds 65°C (about 150°F) [Drop this? --WRS]
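
The trigger rules above can be sketched as a single check. This is only an illustration, not the code running on athena9: the function name, its parameters, and the reason strings are hypothetical, and it assumes the temperatures are reported in °C and that "exceeds" means strictly greater than the limit.

```python
from typing import List, Sequence

# Limits copied from the trigger list above.
UPS_ON_BATTERY_LIMIT_S = 60.0
FRONT_TEMP_LIMIT_C = 35.0
REAR_TEMP_LIMIT_C = 65.0


def shutdown_reasons(
    ups_on_battery_s: float,
    ups_self_test: bool,
    front_temps_c: Sequence[float],
    rear_temps_c: Sequence[float],
) -> List[str]:
    """Return the triggers that currently fire (hypothetical helper).

    ups_on_battery_s is how long the UPS has been on battery, in seconds
    (0 when on mains). An empty result means no shutdown is needed.
    """
    reasons = []
    # On battery for too long, but not because of a UPS self-test.
    if not ups_self_test and ups_on_battery_s > UPS_ON_BATTERY_LIMIT_S:
        reasons.append("ups_on_battery")
    # Any single rack-front sensor over its limit is enough.
    if any(t > FRONT_TEMP_LIMIT_C for t in front_temps_c):
        reasons.append("front_temp")
    # Any single rack-rear sensor over its limit is enough.
    if any(t > REAR_TEMP_LIMIT_C for t in rear_temps_c):
        reasons.append("rear_temp")
    return reasons
```

A monitoring loop would call this with each fresh set of readings and start the shutdown sequence as soon as the returned list is non-empty.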

Emergency Shutdown Sequence

A controlled emergency shutdown via athena9 consists of the following steps:

  • Turn off compute node PDUs
  • Issue 'shutdown' commands to all polyserv nodes
  • Wait 120 seconds for polyserv shutdowns
  • Issue 'shutdown' command to cluster head node
  • Wait 60 seconds for head node shutdown to initiate
  • Issue 'graceful shutdown' (??? second delay) to UPS
  • Shut down self
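
The ordering and waits above can be sketched as a small step table plus a driver. This is an assumption-laden illustration rather than athena9's actual script: the step names and the `perform` callback are hypothetical, and the UPS delay is left out because this page does not yet specify it.

```python
import time
from typing import Callable, List, Tuple

# (step name, seconds to wait after issuing it), in the order listed above.
# The step names are hypothetical labels, not real commands.
SHUTDOWN_STEPS: List[Tuple[str, int]] = [
    ("compute_pdus_off", 0),
    ("shutdown_polyserv_nodes", 120),  # wait for polyserv shutdowns
    ("shutdown_head_node", 60),        # wait for head node shutdown to initiate
    ("ups_graceful_shutdown", 0),      # UPS delay is unspecified on this page
    ("shutdown_self", 0),
]


def run_shutdown(
    perform: Callable[[str], None],
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Issue each step in order, pausing where the sequence calls for it.

    perform(name) is a caller-supplied hook that actually carries out a step
    (for example by running a command); sleep is injectable for dry runs.
    """
    for name, wait_s in SHUTDOWN_STEPS:
        perform(name)
        if wait_s:
            sleep(wait_s)
```

For a dry run, `run_shutdown(print, sleep=lambda s: None)` lists the steps in order without waiting or touching any hardware.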

Environmental Monitoring Webpages

http://athena9.npl.washington.edu/Graphs/