Daily Image

11-10-2018

The Big Wipe

Submitter: Reinoud Bokhorst
Description: The LOFAR post-processing cluster "CEP4" uses the Lustre file system for its main data storage. Lustre is a distributed, shared file system capable of massively parallel reads and writes of large data files, and it is used by the largest HPC clusters around the world. In terms of hardware, the CEP4 file system consists of 20 data servers and 19 disk arrays with a total of 1104 spinning disks, yielding a net capacity of 3.5 PB. In July, it was decided to upgrade it from v2.7 to v2.10 to mitigate known problems in the older version.
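
As a rough illustration of the figures above, the short Python sketch below works out the average net capacity that each disk contributes. It assumes decimal units (1 PB = 1000 TB); the function name is made up for this example. The result is only an average over all disks and says nothing about the raw disk size or the redundancy scheme, neither of which is stated here.

```python
# Figures quoted in the description above.
NUM_DISKS = 1104
NET_CAPACITY_PB = 3.5

# Assumption: decimal units, 1 PB = 1000 TB.
TB_PER_PB = 1000


def net_tb_per_disk(net_capacity_pb: float, num_disks: int) -> float:
    """Average net (usable) capacity contributed by each disk, in TB."""
    return net_capacity_pb * TB_PER_PB / num_disks


if __name__ == "__main__":
    per_disk = net_tb_per_disk(NET_CAPACITY_PB, NUM_DISKS)
    print(f"{per_disk:.2f} TB net per disk")
```

Under these assumptions, each disk accounts for a little over 3 TB of usable space.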

Since all data would be wiped in the process, some weeks of careful preparation followed. About 2 PB of data, representing many observing hours, had to be moved to the tape archives (LTA) and to local hard disks by various people in the RO, while new data was still being added daily. Production never stops!

Installing a file system on such a large system can be a daunting task, but this time it wasn't. All hardware was already in place and, since v2.10, Lustre includes the Integrated Manager for Lustre (IML) tool. After a few clicks, you can sit back and let auto-discovery and auto-configuration do the work for you.

In the picture, our colleagues Hopko Meijering and Robin Teeninga from CIT at the University of Groningen are working in IML just moments before wiping the old data. A slightly nervous moment. After the file system was created, some cables were physically pulled to test the high availability of the various redundant components.

CEP4 is now operational again and well equipped for a few more years of processing.
Copyright: ASTRON
 