public:stopdayactivities_6feb2018

public:stopdayactivities_6feb2018 [2018-01-18 12:28] – [Stop-day activities December 5, 2017] Reinoud Bokhorst
public:stopdayactivities_6feb2018 [2018-02-08 09:31] (current) – [Next stopday] Reinoud Bokhorst
Line 4: Line 4:
  
 ^ Coordinator | Reinoud Bokhorst | roadmin@astron.nl | ^ Coordinator | Reinoud Bokhorst | roadmin@astron.nl |
-^ Software Support |  | softwaresupport@astron.nl | +^ Software Support | Arno Schoenmakers  | softwaresupport@astron.nl | 
-^ Science, Operations and Support |  | sos@astron.nl | +^ Science, Operations and Support | Matthijs vd Wiel | sos@astron.nl | 
-^ Observer | | observer@astron.nl | +^ Observer | Richard Blaauw | observer@astron.nl |
  
  
Line 19: Line 19:
 ==== Cobalt ==== ==== Cobalt ====
  
-  *  ✔ Reboots and idrac reboots. (Hopko/Robin) +  *  √ Reboots and idrac reboots. (Hopko/Robin)
-  *  ✔ Restart the Infiniband Switches (Hopko/Robin) +
-  *  ✔ Install new iDrac firmware BIOS and network firmware +
 ==== CEP3 ==== ==== CEP3 ====
  
-  *  ✔ All nodes: file system check and reboot. (Hopko, Robin) +  * √ Block access at 08:00 (Teun) 
 +  * √ All nodes: file system check and reboot. (Hopko, Robin)
 +  * √ Check/debug persistence of Slurm reservations (Reinoud)
  
  
 ==== CEP4 ==== ==== CEP4 ====
  
-  * ✘ Fix hosts file (when properly prepared) (Hopko) +  * √ Reinstall and upgrade of head nodes (Hopko, Reinoud)
-  * ✔ Performance tests: ib perftests, obdfilter-survey, lnet self-test, mdtest (partly done) (Reinoud) +  * √ Reinstall and upgrade of mgmt01, including Robinhood (Hopko, Kees) 
-  * Reinstall cpu49, look at cpu27 (Hopko/Robin) +  * √ Update hosts file (Hopko)
 +  * √ Upgrade Robinhood client on all CEP4 nodes (Hopko)
  
  
 ==== LEXARS ==== ==== LEXARS ====
  
-  *  ✔ no actions+  *  √ no actions
  
 ==== LCU ==== ==== LCU ====
  
-  *  ✔ Reboots all done (Teun) +  *  √ Reboots skipped. All up for at most 90 days.
 ==== Central Services ==== ==== Central Services ====
    
-  * ✔ Reboot lcs120 (391 days up): Active VMs: mcu006 lcs158 scu099 scu001 ccu001 mcu001 smu001 mcu005 kdcprod (Teun/Jasmin) +  * √ Clean Zabbix history tables (lcs104)
-  * ✔ Reboot lcs121 (391 days up): Active VMs: lcs156 lcs157 ccu199 kdctest mcu199 smu199 sas199 lcs153 lcs155 (Teun/Jasmin) +  * √ Upgrade some portals from Ubuntu 14 to Ubuntu 16 LTS 
-  * ✔ Reboot lcs122 (293 days up): Active VMs: gsm ldb199 zabbix3 tmysql perfsonar ansje (Teun/Jasmin) +  * ✘ Test performance of lcs104 with new kernel update (by using a clone)
-  * ✔ Reboot lcs102, lcs103, lcs104 (Teun/Jasmin) +
-  * ✔ ldb003: increase diskspace Postgres (/pgdata) (Jasmin) +
- +
-lcs120 had a dead disk in RAID1 configuration. Dell has been notified.+
  
  
Line 56: Line 52:
 ==== LTA ==== ==== LTA ====
  
-  *  ✔ Update OS (Reinoud) +  * √ Update OS, reboot when required (Reinoud) 
 +  * √ Expand disk space 
 +  * √ Shut down old service at http://lofar.target.rug.nl/
 ==== Aartfaac ==== ==== Aartfaac ====
-  * ✔ Check disks (1 broken 4TB disk in ais007)+  * √ Add access for user Shulevski
 ==== Core switches ==== ==== Core switches ====
  
-  * none+  * √ none
  
  
  
 ===== Software updates ===== ===== Software updates =====
 +
 +==== MoM and related ====
 +
 +  * Update MoM version on dop303, lcs023, lcs029 (version 3.0.11) [AS]
 +  * Migrate MoM-OTDB-Adapter from sas001 to ccu001 [JK, AS]
  
 ==== MAC/SAS ==== ==== MAC/SAS ====
  
-  * ✔ Install LOFAR Release 3.0.6 on SCU001 after reboot [AS]+  * √  Tune postgres @ ldb003 (Reinoud)
  
  
 ==== CEP3 ==== ==== CEP3 ====
  
-  * ✔ PyBDSF to 1.8.13 by default (also adapt module lofar!) [AS] +  * √ Reboot / fs checks
-  * ✔ Dysco 1.1 [AS] +
-  * ✔ WSClean 2.5 installed but not default [AS] +
-  * ✔ Install astroquery python module [RB] +
-  * ✔ Fix SLURM reservations persistence [RB] +
-  * ✔ Install virtualenv for Python 2 and 3 [RB] +
-  * ✔ h5py for Python3 [RB]+
  
 ==== CEP4 ==== ==== CEP4 ====
  
-  * none +  * Rebuild of Docker images from Jenkins (related to head nodes upgrade, SwS). See [[cep4:deployment#tools_dependencies]] (not the bootstrap, just the rebuild part [JDM])
  
 ==== Aartfaac ==== ==== Aartfaac ====
Line 97: Line 93:
 ==== LTA ==== ==== LTA ====
  
-  * ✔ Update Ingest XML/RPC service to support ID Service. (Reinoud) +  * √ Update AstroWise common (Reinoud) 
  
 ===== In the field ===== ===== In the field =====
Line 104: Line 100:
  
  
-===== SVN =====+===== External ===== 
 + 
 +  * SURFsara dCache (LTA) maintenance between 09:00 and 17:00. 
 +  * Oracle LTA database: a standby db will be created on new hardware (Tsyganov, CIT).  
 + 
 + 
 +==== Next stopday ====
  
-  * <del>replace DNS alias for https://svn.astron.nl/: dop803 instead of dop241</del> Not on a stop-day!  +The next stopday is April 3+4 or 10+11 (TBD).
-  +
  • Last modified: 2018-01-18 12:28
  • by Reinoud Bokhorst