====== Stop-day activities June 5-6, 2018 ======
\\
^ Coordinator | Reinoud Bokhorst | roadmin@astron.nl |
^ Software Support | Arno Schoenmakers | softwaresupport@astron.nl |
^ Science, Operations and Support | Pietro Zucca | sos@astron.nl |
^ Observer | Henk Mulder | observer@astron.nl |

⇒ [[engineering:stop_day_procedures|Description of stopday procedures]]\\
⇒ [[https://docs.google.com/spreadsheets/d/e/2PACX-1vSEbxNss-nmOofKDXJRmgACwMDB9zeekBLRl39krswsVGIigfvzD_EdnlKJ_2TF-IGgoX2IXvc2YlXL/pubhtml|LOFAR Schedule cycle 10]]\\
⇒ The next stopday is scheduled for August 7.

===== Systems =====

==== Cobalt ====
  * ✔ Reboots and idrac reboots. (Hopko)

==== CEP3 ====
  * ✔ Block access at 08:00 (Teun)
  * ✔ All nodes: file system check and reboot. (Kees)

==== CEP4 ====
  * ✔ Reboot (Hopko)
  * ✔ Recreate Docker thinpools on CPU nodes
  * ✔ Recabling of Infiniband, details in Jira ticket
  * Performance tests after recabling

==== LEXARS ====
  * ✔ Reboot (Hopko/Robin)

==== LCU ====
  *

==== Central Services ====
  * ✔ Restart qpidd@ccu001 (ref. https://support.astron.nl/jira/browse/ROADMT-99)
  * ✔ Test DMZ KVM Failover (DMZ KVM Hypervisor hosts DMZ services (portal, dns server, smtp, proxy etc))
  * ✔ OS upgrade and reboot

==== LTA ====
  * ✔ Update and reboot (Reinoud)
  * ✔ Migration of Oracle DB to new hardware (Andrey Tsyganov)

==== Aartfaac ====
  * ✔ Check for broken disks **Fail**: ais007 had a degraded RAID1, but a controller firmware update helped.

==== Core switches ====
  * ✔ Warm reset PD0, RD0 and RD1 (Arjen)

===== Software updates =====

==== MoM and related ====
  * none

==== MAC/SAS ====
  * none

==== CEP3 ====
  * none

==== LCU ====
  * synchronize Python packages, see list in ticket
  * ✔ umask change for foreign stations

==== CEP4 ====
  * Rollout Docker images
  * ✘ SLURM upgrade (postponed)

==== Aartfaac ====
  * none

==== COBALT ====
  * none

==== LTA ====
  * none

===== In the field =====
  *