public:stopdayactivities_5jun2018

Differences

This shows you the differences between two versions of the page.

Old revision: public:stopdayactivities_5jun2018 [2018-05-14 08:00] – [Next stopday] Reinoud Bokhorst
New revision: public:stopdayactivities_5jun2018 [2018-06-05 11:51] – [Aartfaac] Reinoud Bokhorst
Line 4: Line 4:
  
 ^ Coordinator | Reinoud Bokhorst | roadmin@astron.nl |
-^ Software Support | | softwaresupport@astron.nl |
-^ Science, Operations and Support |  | sos@astron.nl |
-^ Observer | | observer@astron.nl |
+^ Software Support | Arno Schoenmakers | softwaresupport@astron.nl |
+^ Science, Operations and Support | Matthijs van der Wiel | sos@astron.nl |
+^ Observer | Henk Mulder | observer@astron.nl |
  
  
Line 18: Line 18:
 ==== Cobalt ====
 
-  * Reboots and idrac reboots. (Hopko/Robin)
+  * ✔ Reboots and idrac reboots. (Hopko)
 
 
 ==== CEP3 ====
 
-  * Block access at 08:00 (Teun)
-  * All nodes: file system check and reboot. (Hopko, Robin)
+  * ✔ Block access at 08:00 (Teun)
+  * ✔ All nodes: file system check and reboot. (Kees)
 
 
 ==== CEP4 ====
 
-  * Reboot (Hopko/Robin
+  * ✔ Reboot (Hopko)
+  * ✔ Recreate Docker thinpools on CPU nodes
+  * ✔ Recabling of Infiniband, details in Jira ticket
+  * Performance tests after recabling
 
 ==== LEXARS ====
 
-  *  Reboot (Hopko/Robin)
+  *  ✔ Reboot (Hopko/Robin)
  
  
Line 43: Line 45:
 ==== Central Services ====
 
-  *
-
+  * ✔ Restart qpidd@ccu001 (ref. https://support.astron.nl/jira/browse/ROADMT-99)
+  * ✔ Test DMZ KVM Failover  (DMZ KVM Hypervisor hosts DMZ services (portal,dns server,smtp,proxy etc))
+  * ✔ OS upgrade and reboot
 ==== LTA ====
 
-  * Update and reboot (Reinoud)
+  * ✔ Update and reboot (Reinoud)
+  * ✔ Migration of Oracle DB to new hardware (Andrey Tsyganov)
 ==== Aartfaac ====
 
-  * Check for broken disks **Fail**: ais007 has a degraded RAID1  !!
+  * ✔ Check for broken disks **Fail**: ais007 had a degraded RAID1, but a controller firmware update helped.
  
  
Line 58: Line 60:
 ==== Core switches ====
 
-  * Warm reset PD0, RD0 and RD1 (Arjen)
+  * ✔ Warm reset PD0, RD0 and RD1 (Arjen)
  
  
Line 76: Line 78:
   * none
 
-==== CEP4 ====
+==== LCU ====
 
-  * none
+  * synchronize Python packages, see list in ticket
+  * ✔ umask change for foreign stations
+
+==== CEP4 ====
 
+  * Rollout Docker images
+  * ✘ SLURM upgrade  (postponed)
 ==== Aartfaac ====
  
Line 94: Line 101:
 ===== In the field =====
 
-  * none
+  *
  
  
  
  • Last modified: 2018-06-05 11:52
  • by Reinoud Bokhorst