Historical Node Restarts
Date | Time | Node | Action | Reason/notes | |
---|---|---|---|---|---|
3/19/08 | 18:00 | 3-21 | powercord reboot; | possibly due to power-strip testing | |
3/20/08 | 11:35 | 6-25 | powercord reboot; | appeared hung, tapping power button didn't affect it. | |
3/25/08 | 11:00 | 6-32 | node-power reboot; | showstate reported "down, with job". Unresponsive. | |
3/26/08 | 16:40 | 6-28 | powercord reboot; | E2110 MBE Error: DIMM 3 and 4 (died at 1am) | |
4/01/08 | 11:49 | 3-21 | powercord reboot; | E1410 CPU 2 IERR, E1410 CPU 1 IERR, E2119 | |
4/01/08 | 22:00 | 6-29 | powercord reboot 4/2; | E1410 CPU 2 IERR, E1410 CPU 1 IERR, E2119 | |
4/03/08 | 13:00 | 3-32 | powercord reboot; | probable overcommit_memory issue - set to 2 after reboot | |
4/03/08 | 14:00 | 3-24 | powercord reboot; | probable overcommit_memory issue - set to 2 after reboot | |
4/03/08 | 14:00 | 3-21 | powercord reboot; | probable overcommit_memory issue - cluster set to 2 after reboot | |
4/03/08 | 14:50 | 6-23 | powercord reboot; | kernel panic upon attaching USB keyboard | |
4/07/08 | 10:00 | 6-20 | powercord reboot; | E1422 CPU Machine Check, VGA screen displays nmi sync error. | |
4/08/08 | 11:15 | 6-26 | powercord reboot; | no obvious reason, died at 04:00 | |
4/08/08 | 13:25 | 6-28 | powercord reboot; | E1410 CPU 2 IERR, E1410 CPU 1 IERR, E2119 | |
4/14/08 | 13:25 | 6-20 | powercord reboot; | 6-20, 6-26, 3-21 between 4-09 and 4-14 | |
4/16/08 | 17:00 | Poly1 | controlled reboot; | E1211 ROMB Batt ... diagnostic repair attempt | |
5/05/08 | 09:00 | many | powercord reboot; | 3-21,4-7,4-9,4-11,4-14,4-17,4-18,4-21,4-25,5-25,5-26,5-27,5-28,6-20 | |
5/05/08 | 17:00 | 3-25 | powercord reboot; | | |
5/07/08 | 13:25 | two | powercord reboot; | 5-30, 6-20 | |
5/08/08 | 13:25 | 11:00 | powercord reboot; | 5-30, 5-32, 4-25 (5-30 was running same jobs as yesterday) | |
5/12/08 | 16:30 | 3-2 | powercord reboot; | no trouble indication on front panel, no job stuck running. | |
5/19/08 | 10:30 | 3-19,6-32 | powercord reboot; | 3-19 had usual messages, 6-32 was happily blue. | |
5/19/08 | 10:30 | 3-22 | dead disk; | awaiting replacement, replace and insert-ethers --replace at 11am 5/20 | |
5/19/08 | 13:30 | 6-31 | powercord reboot; | no trouble indication on front panel. | |
5/20/08 | 11:00 | 3-10,3-27,4-32 | powercord reboot; | no trouble indication on front panel, simultaneous with restoring 3-22. | |
5/20/08 | 13:40 | 6-2,6-4 | powercord reboot; | 6-4 showed E2119 SBE (single-bit-errors) on front panel. | |
5/20/08 | 16:00 | 5-30 | powercord reboot; | nothing on front panel. rebooted at 17:30 | |
5/20/08 | 18:00 | 4-30,5-30,6-4 | powercord reboot; | 6-4 showing SBE after power cycle. | |
5/21/08 | 10:00 | 3-12,6-3 | powercord reboot; | 3-12 went down circa midnight, 6-3 circa 9am | |
5/27/08 | 10:55 | 6-28 | powercord reboot; | E1410 CPU 1 (and 2) IERR on panel, but suspiciously synchronous with a job finishing. | |
5/28/08 | 15:00 | 6-20 | powercord reboot; | E2119, E1410 CPU 1 (and 2) IERR on panel | |
5/29/08 | 02:00 | 6-29 | powercord at 11am; | E2119 SBC Mem, E1410 CPU 1 and 2 IERR on panel | |
6/02/08 | 12:01am | 3-1,3-3 | powercord at 11am; | no errors on panels | |
6/03/08 | 10:00pm | 3-30,3-32 | remote power drop | software testing (3-30 rebooted 6/4/08) | |
6/03/08 | 10:54pm | athena0 | powercord at 10:30am June 4; | E2119 SBC Mem, E1410 CPU 1 and 2 IERR on panel | |
6/05/08 | 10:00pm | 3-11,3-24,6-16 | powercord at 8:30am June 6; | Jeff Gardner testing | |
6/20/08 | 10:00am | 3-21 | powercord soft reboot. | Died doing job 189855, "usual" E2119,E1410 errs on panel | |
6/23/08 | 10:00am | 6-28 | powercord soft reboot. | "usual" E2119,E1410 errs on panel | |
6/23/08 | 10:00am | various | remote reboots | Torque maintenance | |
7/3/08 | 10:00am | 3-6 | soft reboot | done by Duncan, no data | |
7/7/08 | 11:00am | 6-20 | soft reboot | "usual" E2119,E1410 errs on panel | |
7/23/08 | 17:15 | 6-26 | hard reboot | just hung, no amber displays
7/24/08 | 14:30 | 6-28 | hard reboot | just hung, no amber displays
8/4/08 | 13:00 | 6-20 | hard reboot | "usual" E2119,E1410 errs on panel (died 8/3)
8/4/08 | 13:00 | 3-21 | hard reboot | "usual" E2119,E1410 errs on panel (died 7/31)
8/25/08 | 11:00am | 5-2,5-3,6-20 | hard reboot | 6-20 had "usual" E2119,E1410 errs on panel (died 8/22) | |
8/26/08 | 12:00pm | 3-7 | hard reboot | just hung, no amber displays | |
8/27/08 | 10:00am | 3-2,6-20 | hard reboot | 6-20 had "usual" E2119,E1410 errs on panel | |
8/28/08 | 10:00am | 3-1 | hard reboot | no errors shown, had died circa 10pm 8/27 | |
8/28/08 | 14:30 | 3-8,3-9 | hard reboot | no errors shown, had died circa 13:30 8/28
8/31/08 | 18:30 | 3-5 | hard reboot | E2119 only |