Cloudify CASA with Jupyter

Submitter: Aard Keimpema
Description: The size of astronomical datasets has increased dramatically over the years; terabyte-sized datasets are no longer an exception. This trend will only accelerate; the SKA is expected to produce nearly 1 TB of archived data each day. It will therefore no longer be feasible for astronomers to download these huge datasets and perform the data reduction on their own machines, as is currently the practice. Instead, the data reduction is likely to be done close to where the data is archived, in central data processing centres, with the astronomer operating on the data remotely.

One way of facilitating this is through Jupyter notebooks. Jupyter is a web-based application that allows users to create interactive notebooks, which can include annotated text and graphics as well as executable code. Currently Jupyter supports more than 40 different programming languages, including Python, R, and Matlab. Jupyter is designed to be extended, and it is easy to add support for additional languages.
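To illustrate how a new language is registered with Jupyter, the sketch below builds the contents of a kernel spec file (kernel.json), which tells Jupyter how to launch a kernel. The module name example_casa_kernel and the display name are placeholders chosen for illustration; they are not the names used by the kernel described here.

```python
import json

# Hypothetical kernel spec. "example_casa_kernel" is a placeholder module
# name, not the actual name of the CASA kernel.
kernel_spec = {
    # Command Jupyter runs to start the kernel; Jupyter substitutes
    # {connection_file} with the path to the connection file.
    "argv": ["python", "-m", "example_casa_kernel", "-f", "{connection_file}"],
    # Name shown in the notebook's kernel selection menu.
    "display_name": "CASA (example)",
    # Language the kernel executes; CASA tasks are driven from Python.
    "language": "python",
}

# This text would be saved as kernel.json inside a kernel directory.
kernel_json = json.dumps(kernel_spec, indent=2)
print(kernel_json)
```

A directory containing such a file can typically be registered with the `jupyter kernelspec install` command, after which the kernel appears in the notebook interface.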

As part of the OBELICS work package of the EC-funded ASTERICS project, we have created a Jupyter kernel for CASA, a widely used software package for processing astronomical data. The kernel allows all CASA tasks to be run from inside a Jupyter notebook, albeit non-interactively. Tasks that normally spawn a GUI window are wrapped so that their output is saved to an image instead, which is then displayed inside the notebook. The Jupyter kernel requires a custom build of CASA, which we will distribute together with the kernel in a Docker image.
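The wrapping idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the kernel's actual code: the decorator headless, the task fake_plot_task, and its figfile parameter are all hypothetical, and a one-pixel PNG stands in for a real plot. The returned dictionary follows the layout of the content of a Jupyter display_data message, which carries a MIME bundle under the "data" key.

```python
import base64
import functools
import os
import struct
import tempfile
import zlib


def _png_chunk(tag, data):
    """Build one PNG chunk: length, type, data, CRC over type and data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def tiny_png():
    """Return a valid 1x1 8-bit grayscale PNG, standing in for a real plot."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    scanline = b"\x00\x00"  # filter byte followed by one black pixel
    return (b"\x89PNG\r\n\x1a\n"
            + _png_chunk(b"IHDR", ihdr)
            + _png_chunk(b"IDAT", zlib.compress(scanline))
            + _png_chunk(b"IEND", b""))


def headless(task):
    """Hypothetical wrapper: run a task with an image file as its output
    target and return the image as display_data-style content."""
    @functools.wraps(task)
    def wrapper(*args, **kwargs):
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "output.png")
            # Assumption: the task accepts a keyword naming the output file.
            task(*args, figfile=path, **kwargs)
            with open(path, "rb") as fh:
                png = fh.read()
        # Binary image data is base64-encoded inside the MIME bundle.
        return {"data": {"image/png": base64.b64encode(png).decode("ascii")},
                "metadata": {}}
    return wrapper


@headless
def fake_plot_task(vis, figfile):
    """Stand-in for a plotting task; writes a placeholder image to figfile."""
    with open(figfile, "wb") as fh:
        fh.write(tiny_png())


payload = fake_plot_task("demo.ms")
print(sorted(payload["data"]))
```

In a kernel built on ipykernel, content of this shape could be passed to the kernel's send_response method as a display_data message, so that the image appears in the notebook in place of the GUI window.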

The notebook format also has the great advantage that all steps of the data reduction are preserved inside the notebook. This means that the whole data reduction process is self-documenting and fully repeatable. It also allows users to very easily make changes to their pipeline and then rerun the pipeline steps affected.

The figure shows a Jupyter notebook of the CASA 3C391 VLA continuum tutorial running inside a web browser on a tablet.

ASTERICS is a project supported by the European Commission Framework Programme Horizon 2020 Research and Innovation action under grant agreement n. 653477.
Copyright: Aard Keimpema, Des Small
