
ASTRON Hackathon addresses big data challenges

Published by the editorial team, 3 May 2018

Radio astronomy is entering a new era that involves some of the most pressing data challenges in IT. With massive radio telescopes such as the Square Kilometre Array (SKA) coming online, we need new ways to store, transport, explore, and access massive amounts of data. During a two-day Hackathon (23-24 April 2018) at ASTRON, 34 data experts from academia and industry gathered to address some of these challenges.

The Hackathon opened on 23 April with a talk by Michael Wise, the head of ASTRON’s Astronomy Group. He stressed that astronomy is becoming increasingly data-intensive and that the only way to cope is to go beyond the standard pipelines employed at current radio telescopes. He highlighted the need for clever classification of astronomical objects and for updated archives, so that the data continue to yield rich science.

Next, Rob Lyon, a postdoctoral researcher in the SKA group at the University of Manchester, talked about the current SKA design and the challenges expected for each design component. He used the example of pulsar searches at exascale (10^18 bytes) to illustrate how the SKA era will push the frontiers of machine learning, given a pressing class imbalance issue. Both these talks can be accessed here:

During the Hackathon, participants were given complete independence to work on a project of their choosing. The projects were proposed in advance, with each proposer outlining a brief idea, the required skills, and so on. This set the stage for collaborative teams to form on the day.

Teams of 4-5 people worked together on several different projects, on topics such as machine learning classifiers, gold standard test vectors in Singularity, cloud computing and scalability for astronomy datasets, and Apertif/LOFAR machine learning for transients.

This event served as the first platform to bring together the astronomy community and industry experts to identify and address the data issues expected in the intensely data-driven era of radio astronomy. Participants have started thinking about alternatives to the current state-of-the-art GPU-driven computing. For example, Gema Parreno and her team successfully demonstrated the scalability of Google Cloud Platform for storing, accessing, analyzing and effectively visualising radio astronomy data.

Throughout the two-day event, participants discussed various machine learning classifiers and how to choose the ones that would suit our unique needs. Another interesting project was led by Bart Scheers from Centrum Wiskunde & Informatica and Matthieu Marseille from Thales Group. They used SQL queries integrated with Python (healpy) to rapidly scour large astronomy datasets. They observed that combining the queries in this way reduced the computation time drastically, to only 1.3%.
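The idea of pairing SQL with a sky-pixel index can be sketched in a few lines. The article does not describe the team's actual queries or database, so everything below is an assumption made for illustration: the sketch uses Python's built-in sqlite3 module, a random synthetic catalogue, and a simple equal-angle grid as a stand-in for the HEALPix pixel index that healpy would normally provide. All function, table and column names are hypothetical. The point is only that an indexed pixel column lets the database discard most of the sky before any exact distance test is run.

```python
import math
import random
import sqlite3

CELL_DEG = 1.0                   # hypothetical grid cell size, in degrees
N_RA = int(360 / CELL_DEG)       # number of cells along right ascension
N_DEC = int(180 / CELL_DEG)      # number of cells along declination


def pixel_of(ra, dec):
    """Map a position (degrees) to a coarse grid cell id.

    This equal-angle grid is only a stand-in for a HEALPix pixel index.
    """
    ra_idx = int(ra // CELL_DEG) % N_RA
    dec_idx = min(int((dec + 90.0) // CELL_DEG), N_DEC - 1)
    return dec_idx * N_RA + ra_idx


def separation(ra1, dec1, ra2, dec2):
    """Angular separation in degrees between two positions given in degrees."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    c = (math.sin(d1) * math.sin(d2)
         + math.cos(d1) * math.cos(d2) * math.cos(r1 - r2))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


def cone_pixels(ra0, dec0, radius):
    """Cell ids overlapping the bounding box of a cone.

    Assumes the cone does not reach a celestial pole (|dec0| + radius < 90).
    """
    dra = math.degrees(math.asin(
        math.sin(math.radians(radius)) / math.cos(math.radians(dec0))))
    ra_lo = math.floor((ra0 - dra) / CELL_DEG)
    ra_hi = math.floor((ra0 + dra) / CELL_DEG)
    dec_lo = max(0, math.floor((dec0 - radius + 90.0) / CELL_DEG))
    dec_hi = min(N_DEC - 1, math.floor((dec0 + radius + 90.0) / CELL_DEG))
    return [d * N_RA + (r % N_RA)          # modulo handles RA wrap-around
            for d in range(dec_lo, dec_hi + 1)
            for r in range(ra_lo, ra_hi + 1)]


def build_catalogue(n, seed=0):
    """Create an in-memory table of n random sources with an indexed pixel column."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sources "
                 "(id INTEGER PRIMARY KEY, ra REAL, dec REAL, pixel INTEGER)")
    rows = []
    for i in range(n):
        ra = rng.uniform(0.0, 360.0)
        dec = math.degrees(math.asin(rng.uniform(-1.0, 1.0)))  # uniform on the sphere
        rows.append((i, ra, dec, pixel_of(ra, dec)))
    conn.executemany("INSERT INTO sources VALUES (?, ?, ?, ?)", rows)
    conn.execute("CREATE INDEX idx_pixel ON sources (pixel)")
    return conn


def cone_search_indexed(conn, ra0, dec0, radius):
    """Pre-select candidates by pixel in SQL, then apply the exact distance test."""
    pixels = cone_pixels(ra0, dec0, radius)
    marks = ",".join("?" * len(pixels))
    rows = conn.execute(
        f"SELECT id, ra, dec FROM sources WHERE pixel IN ({marks})", pixels)
    return sorted(i for i, ra, dec in rows
                  if separation(ra0, dec0, ra, dec) <= radius)


def cone_search_full_scan(conn, ra0, dec0, radius):
    """Baseline: apply the exact distance test to every row in the table."""
    rows = conn.execute("SELECT id, ra, dec FROM sources")
    return sorted(i for i, ra, dec in rows
                  if separation(ra0, dec0, ra, dec) <= radius)


if __name__ == "__main__":
    conn = build_catalogue(50000)
    fast = cone_search_indexed(conn, 120.0, 30.0, 3.0)
    slow = cone_search_full_scan(conn, 120.0, 30.0, 3.0)
    print(len(fast), "sources found; results identical:", fast == slow)
```

Both searches return the same sources, but the indexed version only fetches rows from the handful of cells around the target position. Any speed-up depends on the catalogue size, the cell size and the database engine, so this sketch makes no claim about reproducing the figure reported by the team.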

The final results of the ASTRON Hackathon projects are outlined in a collaborative Google document which can be accessed here:

The final presentations from some of the teams can be found here:

Another way to follow up on developments during the session is to search Twitter for #astronhackathon.


