The SKA will generate more data than has ever been processed and analysed before. Making this possible requires innovation in hardware, software and expertise. This drives the need for a Science Data Centre (SDC) in (the north of) the Netherlands, facilitated by a public-private partnership of science, government and industry.
The main goal of the data centre is to provide a portal to data archives and international high-performance computer systems that will result in groundbreaking science. In line with the vision for a European Open Science Cloud, the centre should also support the generation and sharing of high-quality scientific results in an accessible and open manner.
A distinguishing characteristic of radio astronomical research is the scale at which data is generated and processed. Developments in network and computational technology are opening up new opportunities for international and cross-domain exchange of data and expertise.
At the same time, these technological developments lead to an exponential increase in the data volumes produced by radio telescopes and to the emergence of a generation of internationally distributed facilities such as the International LOFAR Telescope and the SKA.
ASTRON addresses these challenges through international partnerships with both public and private organisations. Since the start of LOFAR operations, ASTRON has worked with ICT infrastructure partners such as SURFsara in the Netherlands, the Forschungszentrum Jülich in Germany, and the Poznan Supercomputing and Networking Center in Poland, to develop a distributed data archive that is astronomical both in content and scale.
Currently, ASTRON has a significant role in several EC projects that aim to make research infrastructures interoperable, open and sustainable in the era of the exabyte-scale data archives expected from the SKA.
- EOSCpilot and EOSC-hub
ASTRON participates in the European Open Science Cloud (EOSC) initiatives to apply best practices developed within astronomy towards the development of a European-scale research data centre that builds on the vast expertise and capabilities of the joint research community and supports cross-domain data and service sharing.
Given the scale of data from instruments like LOFAR and SKA, and the complexity of the processing required to generate science-ready data products, many researchers will benefit from access to high-performance data infrastructures and high-throughput processing capabilities without having to organise resources or set up complex software installations.
To enable easy and portable deployment of software, ASTRON is developing container-based software installations as well as application images for data analysis pipelines that are distributed across connected infrastructure. For the most data-intensive workflows, a user workspace is being realised for temporary storage of data, for example while waiting for computational resources to become available for the next steps in the processing workflow, or to assess quality before ingesting data into an archive or a science data repository.
Services are made accessible either openly or through a federated authentication and authorisation infrastructure. The federated route will allow users to gain access to any service with their home institute login account or, where that is not an option, through a social or special-purpose ‘Single Sign On’ account. The objective is to minimise the number of user-created accounts and to provide a single mechanism for authentication across internationally distributed services.
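The order of preference between login routes can be illustrated with a minimal Python sketch. It is not the actual infrastructure: the names `LoginOptions`, `choose_login_route` and the example identifiers are hypothetical, and a real federation would rely on an identity protocol rather than a simple set lookup.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class LoginOptions:
    """Accounts a user already holds (hypothetical model for illustration)."""
    home_institute: Optional[str] = None   # e.g. an institute identifier
    social_provider: Optional[str] = None  # e.g. a social login provider


def choose_login_route(options: LoginOptions,
                       federated_institutes: Set[str]) -> Tuple[str, str]:
    """Pick a login route in the order of preference described in the text."""
    # 1. Prefer the home institute account when the institute is federated.
    if options.home_institute in federated_institutes:
        return ("home-institute", options.home_institute)
    # 2. Otherwise fall back to a social account, if the user has one.
    if options.social_provider is not None:
        return ("social", options.social_provider)
    # 3. Last resort: a special-purpose Single Sign On account.
    return ("special-purpose", "sso")


federation = {"institute-a.example", "institute-b.example"}
print(choose_login_route(LoginOptions(home_institute="institute-a.example"), federation))
print(choose_login_route(LoginOptions(social_provider="some-social-idp"), federation))
print(choose_login_route(LoginOptions(), federation))
```

In this sketch a new account is only needed in the third case, which reflects the stated aim of minimising user-created accounts.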
A workflow management system in conjunction with a standardised pipeline definition language such as the Common Workflow Language (CWL) will be utilised to develop standard data analysis pipelines that are made available for local deployment, or for data processing on connected compute clusters through the user portal.
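To give an impression of what a standardised pipeline definition makes possible, the sketch below holds a pipeline as a Python dictionary shaped like a CWL `Workflow` document and derives a valid execution order from the step dependencies. This is an illustration under assumptions: the step names and the `.cwl` file names are hypothetical, the dictionary is not validated against the CWL specification, and a real workflow management system would do considerably more than order steps.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline, shaped like a CWL Workflow document.
# Step inputs of the form "step/output" refer to another step's output.
workflow = {
    "cwlVersion": "v1.2",
    "class": "Workflow",
    "inputs": {"raw_visibilities": "File"},
    "outputs": {
        "report": {"type": "File", "outputSource": "quality_check/report"},
    },
    "steps": {
        "quality_check": {
            "run": "quality_check.cwl",
            "in": {"image": "image/image"},
            "out": ["report"],
        },
        "image": {
            "run": "image.cwl",
            "in": {"data": "calibrate/calibrated"},
            "out": ["image"],
        },
        "calibrate": {
            "run": "calibrate.cwl",
            "in": {"data": "raw_visibilities"},
            "out": ["calibrated"],
        },
    },
}


def execution_order(wf: dict) -> list:
    """Return the step names in an order that respects data dependencies."""
    graph = {}
    for name, step in wf["steps"].items():
        # A source containing "/" points at "<step>/<output>";
        # anything else is a workflow-level input with no step dependency.
        graph[name] = {
            source.split("/", 1)[0]
            for source in step["in"].values()
            if "/" in source
        }
    return list(TopologicalSorter(graph).static_order())


print(execution_order(workflow))
```

Because the dependencies are stated in the definition itself, the same description can in principle be handed to a local runner or to a connected compute cluster without rewriting the pipeline.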
As part of our commitment to community-provided services and software, ASTRON is hosting the KERN repository of astronomical software packages.