Differences
This shows you the differences between two versions of the page.
| public:stopdayactivities_5jun2018 [2018-05-14 07:52] – created Reinoud Bokhorst | public:stopdayactivities_5jun2018 [2018-06-05 11:52] (current) – [Stop-day activities June 5-6, 2018] Reinoud Bokhorst | ||
|---|---|---|---|
| Line 4: | Line 4: | ||
| ^ Coordinator | Reinoud Bokhorst | roadmin@astron.nl | | ^ Coordinator | Reinoud Bokhorst | roadmin@astron.nl | | ||
| - | ^ Software Support | | softwaresupport@astron.nl | | + | ^ Software Support | Arno Schoenmakers |
| - | ^ Science, Operations and Support | | sos@astron.nl | | + | ^ Science, Operations and Support | Pietro Zucca | sos@astron.nl | |
| - | ^ Observer | | observer@astron.nl | | + | ^ Observer | Henk Mulder |
| - | [[engineering: | + | ⇒ [[engineering: |
| - | + | ⇒ [[https:// | |
| - | [[https:// | + | ⇒ The next stopday is scheduled for August 7. |
| ===== Systems ===== | ===== Systems ===== | ||
| Line 19: | Line 18: | ||
| ==== Cobalt ==== | ==== Cobalt ==== | ||
| - | * ✔ Reboots and iDRAC reboots. (Hopko/Robin) | + | * ✔ Reboots and iDRAC reboots. (Hopko) |
| - | * ✔ CBM010 will be present before the stopday | + | |
| ==== CEP3 ==== | ==== CEP3 ==== | ||
| * ✔ Block access at 08:00 (Teun) | * ✔ Block access at 08:00 (Teun) | ||
| - | | + | * ✔ All nodes: file system check and reboot. (Kees) |
| - | | + | |
| - | * ✔ Was broken: Check/debug persistence of Slurm reservations (Reinoud) | + | |
| - | * ✔ NFS mounts for CEP3 from all the lof-nodes are using the control network! | + | |
| ==== CEP4 ==== | ==== CEP4 ==== | ||
| - | * ✔ Connect DAC cable (Hopko) | + | * ✔ Reboot |
| + | * ✔ Recreate Docker thinpools on CPU nodes | ||
| + | * ✔ Recabling of InfiniBand, details in Jira ticket | ||
| + | * Performance tests after recabling | ||
| ==== LEXARS ==== | ==== LEXARS ==== | ||
| - | * ✔ lexar003 reboot using XCAT (148 days up) (Was done by Reinoud 9-4-2018) | + | * ✔ Reboot |
| ==== LCU ==== | ==== LCU ==== | ||
| - | * | + | * |
| ==== Central Services ==== | ==== Central Services ==== | ||
| - | * ✔ Update portals to CentOS/KVM | + | * ✔ Restart qpidd@ccu001 (ref. https://support.astron.nl/ |
| - | * ✔ Update and reboot lcs020 | + | * ✔ Test DMZ KVM Failover |
| - | * ✔ Remove sas001 and sas099 | + | * ✔ OS upgrade |
| - | * ✔ Update & reboot NFS server | + | |
| - | * ✔ Almost all NFS mounts on the lcs115 NFS server go over the control network. Only lexar003, lexar004, and lhd002 correctly use the offline network. We should force CEP3 to use the offline network! | + | |
| - | * ✔ Mainly MAC/SAS, LCU' | + | |
| - | * Check resolv.conf settings; see https:// | + | |
| ==== LTA ==== | ==== LTA ==== | ||
| - | * ✔ Update and reboot | + | * ✔ Update and reboot (Reinoud) |
| + | * ✔ Migration of Oracle DB to new hardware (Andrey Tsyganov) | ||
| ==== Aartfaac ==== | ==== Aartfaac ==== | ||
| - | * ✘ Check for broken disks **Fail**: ais007 | + | * ✔ Check for broken disks **Fail**: ais007 |
| - | * **For review**: Add Jasmin & Reinoud to all nodes as admin | + | |
| + | |||
| ==== Core switches ==== | ==== Core switches ==== | ||
| - | * none (probably June) | + | * ✔ Warm reset PD0, RD0 and RD1 (Arjen) |
| - | ==== Communication issues ==== | ||
| - | **For review**: At the end of the first day, Software Support needs to report status to the coordinator | ||
| ===== Software updates ===== | ===== Software updates ===== | ||
| ==== MoM and related ==== | ==== MoM and related ==== | ||
| - | * ? | + | * none |
| ==== MAC/SAS ==== | ==== MAC/SAS ==== | ||
| - | * ? | + | * none |
| ==== CEP3 ==== | ==== CEP3 ==== | ||
| - | * ✔ Reboot / fs checks | + | * none |
| - | * ✔ Make AOFlagger 2.10 the default version (already installed) | + | |
| - | * ✔ Make LOFAR-Release-3_0_14 the default version (linked against AOFlagger 2.10) | + | |
| - | * ✔ Make WSClean 2.5 the default version (already installed) | + | |
| - | ==== CEP4 ==== | + | ==== LCU ==== |
| + | * Synchronize Python packages, see list in ticket |
| + | * ✔ umask change for foreign stations | ||
| + | ==== CEP4 ==== | ||
| + | |||
| + | * Roll out Docker images | ||
| + | * ✘ SLURM upgrade | ||
| ==== Aartfaac ==== | ==== Aartfaac ==== | ||
| Line 102: | Line 101: | ||
| ===== In the field ===== | ===== In the field ===== | ||
| - | * none | + | * |
| - | + | ||
| - | + | ||
| - | ===== External ===== | + | |
| - | * ? | ||
| - | ==== Next stopday ==== | ||
| - | The next stopday is TBD | ||