Daily Image

01-05-2018

The Artefact

Submitter: Joeri van Leeuwen & Daniël van der Schuur
Description: Every Tuesday at 14:00 the ARTS weekly meeting takes place in the Fish Bowl (Minnaert) room. We try to use most of the allotted hour brainstorming our way out of the problems we encounter on the way towards a shiny time-domain survey with Apertif. Using our varying expertise, and our collective memory of problems solved in the past, we usually manage to Sherlock Holmes our way out. But in October 2017 Leon Oostrum noticed block-like periodic signals (left-most subpanel of the image) and determined that they occurred at a 1.024 s period. Such signals are a serious problem, as they hamper the search for the periodic signals of new pulsars in the data.

As 1024 ms is an oft-used number in the data packetisation, as early as in the Apertif Front End beam-former, we initially concluded the problem must originate elsewhere, upstream from our systems, or be caused by the data capture software we used on the GPU cluster. But after these were ruled out it became apparent we were creating The Artefact, as it was now known, ourselves. Somewhere. The literate reader may know that Sherlock Holmes excels in solving other people's problems, but is not actually all that self-aware. And that, precisely, was what solving our problem now required. And thus, week after week, Jonathan Hargreaves would produce new plots, like the second subpanel, showing the incorrect patterns the data would sometimes make. Poring over numbers of missing bits here, and jumps in the data there, he was able over the course of two months to pinpoint the problem first to a single Uniboard (third subpanel), then to a specific FPGA. With the offending FPGA identified (there are hundreds in the whole system), a trip to Westerbork confirmed his skills surpass Sherlock's -- the problem was of our own making: the FPGA had a memory module of the wrong type installed, and we had not checked. The firmware is optimized for a specific memory type to achieve the maximum throughput. This wrong module affected both imaging and time-domain Apertif data. After banishing and replacing the memory, the data looked clean.

Case closed.

Lessons learned: 1) Budget significant time for debugging your complex system. 2) Exploit time-domain data for telescope verification.
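
To illustrate lesson 2, here is a minimal sketch of how a periodic artefact can be picked out of time-domain data by folding a time series at a few trial periods. This is not the ARTS pipeline: the 8 ms sampling time, the block shape, the noise level, the trial periods, and all function names are assumptions made up for this example. Only the 1.024 s period comes from the story above.

```python
import random


def simulate_artefact(n_samples, dt, period, duty=0.25, amplitude=1.0,
                      noise=0.5, seed=0):
    """Hypothetical test signal: a block-like periodic signal plus Gaussian noise."""
    rng = random.Random(seed)
    samples_per_period = round(period / dt)
    on_samples = int(duty * samples_per_period)
    return [
        (amplitude if (i % samples_per_period) < on_samples else 0.0)
        + rng.gauss(0.0, noise)
        for i in range(n_samples)
    ]


def fold_profile(samples, dt, period, nbins=16):
    """Fold the time series at a trial period; return the mean per phase bin."""
    sums = [0.0] * nbins
    counts = [0] * nbins
    for i, value in enumerate(samples):
        phase = (i * dt / period) % 1.0
        b = min(int(phase * nbins), nbins - 1)
        sums[b] += value
        counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def profile_contrast(profile):
    """Crude significance measure: spread between brightest and faintest bin."""
    return max(profile) - min(profile)


def best_period(samples, dt, trial_periods, nbins=16):
    """Return the trial period whose folded profile has the highest contrast."""
    return max(
        trial_periods,
        key=lambda p: profile_contrast(fold_profile(samples, dt, p, nbins)),
    )


# Assumed setup: 8 ms samples, two minutes of data, a handful of trial periods.
dt = 0.008
data = simulate_artefact(n_samples=15000, dt=dt, period=1.024)
trials = [0.9, 1.0, 1.024, 1.1, 1.2]
detected = best_period(data, dt, trials)
print(detected)
```

Folding at the true period stacks the blocks on top of each other, so the profile keeps its full contrast, while folding at a wrong period smears the signal across all phase bins. A real search would scan a fine grid of periods, or use a Fourier transform, rather than five hand-picked trials.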
Copyright: ASTRON
 