public:stopdayactivities_6feb2018

Differences

This shows you the differences between two versions of the page.

public:stopdayactivities_6feb2018 [2018-01-18 12:28] – [Stop-day activities December 5, 2017] Reinoud Bokhorst public:stopdayactivities_6feb2018 [2018-02-08 09:31] – [Central Services] Reinoud Bokhorst
Line 4: Line 4:
  
 ^ Coordinator | Reinoud Bokhorst | roadmin@astron.nl | ^ Coordinator | Reinoud Bokhorst | roadmin@astron.nl |
-^ Software Support |  | softwaresupport@astron.nl | +^ Software Support | Arno Schoenmakers  | softwaresupport@astron.nl | 
-^ Science, Operations and Support |  | sos@astron.nl | +^ Science, Operations and Support | Matthijs vd Wiel | sos@astron.nl | 
-^ Observer | | observer@astron.nl |+^ Observer | Richard Blaauw | observer@astron.nl |
  
  
Line 19: Line 19:
 ==== Cobalt ==== ==== Cobalt ====
  
-  *  ✔ Reboots and idrac reboots. (Hopko/Robin) +  *  √ Reboots and idrac reboots. (Hopko/Robin)
-  *  ✔ Restart the Infiniband Switches (Hopko/Robin) +
-  *  ✔ Install new iDrac firmware, BIOS and network firmware +
 ==== CEP3 ==== ==== CEP3 ====
  
-  *  ✔ All nodes: file system check and reboot. (Hopko, Robin) +  * √ Block access at 08:00 (Teun) 
 +  * √ All nodes: file system check and reboot. (Hopko, Robin)
 +  * √ Check/debug persistence of Slurm reservations (Reinoud)
  
  
 ==== CEP4 ==== ==== CEP4 ====
  
-  * ✘ Fix hosts file (when properly prepared) (Hopko) +  * √ Reinstall and upgrade of head nodes (Hopko, Reinoud)
-  * ✔ Performance tests: ib perftest's, obdfilter-survey, lnet self-test, mdtest (partly done) (Reinoud) +  * √ Reinstall and upgrade of mgmt01, including Robinhood (Hopko, Kees) 
-  * Reinstall cpu49, look at cpu27 (Hopko/Robin) +  * √ Update hosts file (Hopko)
 +  * √ Upgrade Robinhood client on all CEP4 nodes (Hopko)
  
  
 ==== LEXARS ==== ==== LEXARS ====
  
-  *  ✔ no actions+  *  √ no actions
  
 ==== LCU ==== ==== LCU ====
  
-  *  ✔ Reboots all done (Teun) +  *  √ Reboots skipped. All up for at most 90 days.
 ==== Central Services ==== ==== Central Services ====
    
-  * ✔ Reboot lcs120 (391 days up): Active VM's: mcu006 lcs158 scu099 scu001 ccu001 mcu001 smu001 mcu005 kdcprod (Teun/Jasmin) +  * √ Clean Zabbix history tables (lcs104)
-  * ✔ Reboot lcs121 (391 days up): Active VM's: lcs156 lcs157 ccu199 kdctest mcu199 smu199 sas199 lcs153 lcs155 (Teun/Jasmin) +  * √ Upgrade some portals from Ubuntu 14 to Ubuntu 16 LTS 
-  * ✔ Reboot lcs122 (293 days up): Active VM's: gsm ldb199 zabbix3 tmysql perfsonar ansje (Teun/Jasmin) +  * ✘ Test performance of lcs104 with new kernel update (by using a clone)
-  * ✔ Reboot lcs102, lcs103, lcs104 (Teun/Jasmin) +
-  * ✔ ldb003: increase diskspace Postgres (/pgdata) (Jasmin) +
- +
-lcs120 had a dead disk in RAID1 configuration. Dell has been notified.+
  
  
Line 56: Line 52:
 ==== LTA ==== ==== LTA ====
  
-  *  ✔ Update OS (Reinoud) +  * √ Update OS, reboot when required (Reinoud) 
 +  * √ Expand disk space 
 +  * √ Shut down the old service at http://lofar.target.rug.nl/
 ==== Aartfaac ==== ==== Aartfaac ====
-  * ✔ Check disks (1 broken 4TB disk in ais007)+  * √ Add access for user Shulevski
 ==== Core switches ==== ==== Core switches ====
  
-  * none+  * √ none
  
  
  
 ===== Software updates ===== ===== Software updates =====
 +
 +==== MoM and related ====
 +
 +  * Update MoM version on dop303, lcs023, lcs029 (version 3.0.11) [AS]
 +  * Migrate MoM-OTDB-Adapter from sas001 to ccu001 [JK, AS]
  
 ==== MAC/SAS ==== ==== MAC/SAS ====
  
-  * ✔ Install LOFAR Release 3.0.6 on SCU001 after reboot [AS]+  * √  Tune postgres @ ldb003 (Reinoud)
  
  
 ==== CEP3 ==== ==== CEP3 ====
  
-  * ✔ PyBDSF to 1.8.13 by default (also adapt module lofar!) [AS] +  * √ Reboot / fs checks
-  * ✔ Dysco 1.1 [AS] +
-  * ✔ WSClean 2.5 installed but not default [AS] +
-  * ✔ Install astroquery python module [RB] +
-  * ✔ Fix SLURM reservations persistence [RB] +
-  * ✔ Install virtualenv for Python 2 and 3 [RB] +
-  * ✔ h5py for Python3 [RB]+
  
 ==== CEP4 ==== ==== CEP4 ====
  
-  * none+  * Rebuild of Docker images from Jenkins (related to head nodes upgrade, SwS) See [[cep4:deployment#tools_dependencies]] (not the bootstrap, just the rebuild part [JDM])
  
 ==== Aartfaac ==== ==== Aartfaac ====
Line 97: Line 93:
 ==== LTA ==== ==== LTA ====
  
-  * ✔ Update Ingest XML/RPC service to support ID Service. (Reinoud) +  * √ Update AstroWise common (Reinoud) 
  
 ===== In the field ===== ===== In the field =====
Line 104: Line 100:
  
  
-===== SVN =====+===== External ===== 
 + 
 +  * SURFsara dCache (LTA) maintenance between 09:00 and 17:00. 
 +  * Oracle LTA database: a standby db will be created on new hardware (Tsyganov, CIT).  
 + 
 + 
 +==== Next stopday ====
  
-  * <del>replace DNS alias for https://svn.astron.nl/ instead dop241 to dop803</del> Not on a stop-day!  +The next stopday is April 3+4 (Easter Tuesday)
-  +
  • Last modified: 2018-02-08 09:31
  • by Reinoud Bokhorst