public:stopdayactivities_5jun2018

Differences

This shows you the differences between two versions of the page.

public:stopdayactivities_5jun2018 [2018-05-31 07:22] – [Central Services] Reinoud Bokhorst
public:stopdayactivities_5jun2018 [2018-06-05 11:51] – [Aartfaac] Reinoud Bokhorst
Line 18: Line 18:
 ==== Cobalt ====
 
-  * Reboots and idrac reboots. (Hopko/Robin)
+  * ✔ Reboots and idrac reboots. (Hopko)
  
  
 ==== CEP3 ====
 
-  * Block access at 08:00 (Teun)
+  * ✔ Block access at 08:00 (Teun)
-  * All nodes: file system check and reboot. (Hopko, Robin)
+  * ✔ All nodes: file system check and reboot. (Kees)
  
  
 ==== CEP4 ====
 
-  * Reboot (Hopko/Robin
+  * ✔ Reboot (Hopko)
-  * Recreate Docker thinpool
+  * ✔ Recreate Docker thinpools on CPU nodes
+  * ✔ Recabling of Infiniband, details in Jira ticket
+  * Performance tests after recabling
  
 ==== LEXARS ====
 
-  *  Reboot (Hopko/Robin)
+  *  ✔ Reboot (Hopko/Robin)
  
  
Line 43: Line 45:
 ==== Central Services ====
 
-  * Restart qpidd@ccu001 (ref. https://support.astron.nl/jira/browse/ROADMT-99)
+  * ✔ Restart qpidd@ccu001 (ref. https://support.astron.nl/jira/browse/ROADMT-99)
-  * Test DMZ KVM Failover  (DMZ KVM Hypervisor hosts DMZ services (portal,dns server,smtp,proxy etc))
+  * ✔ Test DMZ KVM Failover  (DMZ KVM Hypervisor hosts DMZ services (portal,dns server,smtp,proxy etc))
-  * OS upgrade and reboot
+  * ✔ OS upgrade and reboot
 ==== LTA ====
 
-  * Update and reboot (Reinoud)
+  * ✔ Update and reboot (Reinoud)
-  * Migration of Oracle DB to new hardware (Andrey Tsyganov)
+  * ✔ Migration of Oracle DB to new hardware (Andrey Tsyganov)
 ==== Aartfaac ====
 
-  * Check for broken disks **Fail**: ais007 had a degraded RAID1, but a controller firmware update helped.
+  * ✔ Check for broken disks **Fail**: ais007 had a degraded RAID1, but a controller firmware update helped.
  
  
Line 58: Line 60:
 ==== Core switches ====
 
-  * Warm reset PD0, RD0 and RD1 (Arjen)
+  * ✔ Warm reset PD0, RD0 and RD1 (Arjen)
  
  
Line 75: Line 77:
  
   * none
+
+==== LCU ====
+
+  * synchronize Python packages, see list in ticket
+  * ✔ umask change for foreign stations
  
 ==== CEP4 ====
 
   * Rollout Docker images
+  * ✘ SLURM upgrade  (postponed)
 ==== Aartfaac ====
  
Line 94: Line 101:
 ===== In the field =====
 
-  * none
+  *
  
  
  
  • Last modified: 2018-06-05 11:52
  • by Reinoud Bokhorst