Stop-day activities December 4-5, 2018
Role | Name | Email |
---|---|---|
Coordinator | Teun Grit | roadmin@astron.nl |
Software Support | Arno Schoenmakers | softwaresupport@astron.nl |
Science, Operations and Support | Matthijs, Pietro | sos@astron.nl |
Observer | Henk Mulder | observer@astron.nl |
⇒ Description of stopday procedures
⇒ LOFAR Schedule cycle 11
⇒ Stop-day progress sheet (2 tabs!)
Systems
Cobalt
- √ Reboots and iDrac reboots. (Hopko)
CEP3
- √ Block access at 08:00 (Teun)
- √ Repair memory bank of lof021 (see ticket CIT-25)
- √ Debug lost Slurm reservations (Reinoud starts reboots, Robin waits for Reinoud)
CEP4
- Switch IPoIB to connected mode on head and gpu nodes, see https://support.astron.nl/jira/browse/ROADMT-186
- The Slurm disk performance tests are planned for after the stop-day. (Volume is now at 75%.)
- mgmt01.cep4 to CentOS 7.5
- Robinhood tests
LEXARS
- √ Disable Supervisor on both lexars at 07:45 (Teun)
- √ Powerdrain of iDracs (Hopko/Robin)
- √ Update iDrac firmware (Hopko/Robin)
- √ The lexars stay on CentOS 7.2
- √ Some investigation is needed on the iDracs (Hopko)
LCU
- Test reboot script on 1 LCU
- √ Reboot cn001 (not announced)
- Some remote stations need a reboot
- Install WinCC 3.16 on 1 LCU. Jasmin: This can't be done on CentOS 7.2; we need a 7.5 system. There is a spare LCU available in the Dwingeloo digital lab (RS511). Eventually we need WinCC 3.16 on all LCUs.
Portals
- √ Update & reboot
- √ Check High Availability of portal2
Central Services lcs020 .. lcs030
- √ OS upgrade and reboot (the SLES11_SP4 update contains ~65 packages, incl. a new kernel)
Other Central Services
- √ Stop and disable supervisor at 07:45 (Teun)
- OS upgrade and reboot
- Remove Zabbix-agent version 2.2 from scu001
- Start Postgres replication ldb003 → lcs119 and database split (Reinoud)
- Recabling network interfaces and p2p ldb003 / lcs119 (Arjen, please inform Reinoud)
LTA
- None
Dragnet
- √ Connect IB switch to CEP4 spine switches, instead of the Cobalt switch. The cables are already in. (Hopko)
Aartfaac
- √ Update & reboot ais001-007, aartfaac-lcu
Core switches
- √ Warm reset at 11:00 (Arjen)
Software updates
MoM and related
- None
MAC/SAS
- None
CEP3
LCU
- None
CEP4
- Slurm update needs testing first. GPU04 is available for testing.
- Jasmin: Is there a repo available? Hopko will check.
Aartfaac
- None
Cobalt
- None
LTA
- None