Table of Contents

Self-Calibration documentation page

This page is concecrated to the self-calibration tool for HBA observation with LOFAR. Here is explained what is 'the self-calibration' strategy, how it works, which tools are used and what are their parameters.

Self-Calibration Global Strategy/Context/Ideas

The self-calibration process goals to improve the final quality of an image by an iterative process (N cycles). In our case (LOFAR data), the amplitude calibration by using a calibrator (target whose the flux is well known) is done automatically by the Radio Observatory (RO) in the imaging pipeline (see target and calibrator pipeline). By consequence, the self-calibration with LOFAR data is phase calibration ONLY.
At each cycle, self-calibration process improves the phase calibration of the data. To improve phase's data, we need at each cycle a more accurate sky model than at the previous cycle. To do that, the strategy is to image data at low resolution and extract a sky model, then calibrate data with this extracted skymodel. At each cycle, the image resolution increase with the goal to extract a more complete sky model for calibrating data with this new sky model at the next cycle.
This process is repeated until the image resolution obtained is the best resolution possible with the data (α longuest baseline)

Release Notes (Installed version only on CEP1 (DOWN after the 24/11/2014) , CEP2, Flits, NOT yet on CEP3)

:!: Asap a new version with peeling option will be provided.

:!: Another one is checkNames but not activated yet. It will allow to use mergeSB.py with files using a different nomenclature as intermeadiate or final data nomenclature (but the user will have to provide some informations which cannot be extracted from nomenclature).

:!: at this day: 11/2014; Problems to implement it as a pipeline. Code is ready, but implementation must be done at least for LOFAR cycle 4: 04/2014.

Self-Calibration in the Radio Observatory imaging pipeline

The self-calibration is currently being implemented in the pipeline of the radio observatory. For testing purposes, a stand-alone tool is available. ………………………………………………………………in processing

Self-Calibration Standalone Tool

The self-calibration standalone tool is installed on CEP1, CEP2 and on Flits (the ASTRON Astronomy group server).

To use it on CEP, you can run selfcal by running

/opt/cep/selfcal/selfcal.py

You can load an initialization file to avoid typing the path: source /home/vilchez/selfcal/selfcal_init.sh

On Flits, just type

selfcal.py

Launching 'selfcal.py' without arguments will print short usage info:

MISSING Parameters
Usage: selfcal.py --obsDir= --outputDir= --nbCycle=
selfcal.py parameters

Before Launching selfcal, a pre-processing step

As mentioned before, time chunks can be merged in frequency. A tool named mergeSB.py exists in the selfcal package to facilitate the merging process. To use it on CEP, launch it as

/opt/cep/selfcal/mergeSB.py

or type source /home/vilchez/selfcal/selfcal_init.sh to avoid typing the path.

On Flits, just type

mergedSB.py

Launching mergeSB.py without arguments will print some usage information:

MISSING Parameters
Usage: mergeSB.py --obsDir= --outputDir= --column2Merge=
mergeSB.py parameters

Usually, one or several MSs are missing. It does not matter, NDPPP (use for merging subbands) will fill up with flagged values for missing measurements set. (Just one condition for missing subbands, the first and the last subbands for each time chunks must exist.) As for selfcal.py, the path of obsDir must be an absolute path (no use of ~/mypath/)

Naming scheme

mergeSB.py accepts intermediate or final data from the Radio Observatory, which follows specific naming conventions. So the nomenclature is specific like: * Intermediate data: L123456_SAP123_SB123_uv.MS.dppp * Final data: L123456_SB123_uv.dppp.MS If the name of your data is different, you have to change manually the name of its data using one of these nomenclature. In addition, you have to be be careful that the measurement sets will be ordered alphabetically (so use 00 instead of '0' if you have a number greater than 10 for example).

For selfcal.py the name of the frequency merged MS does not need to follow any convention (except that alphabetical order must correspond to the chronological order), but it will always work on the column CORRECTED_DATA, which is created by mergeSB.py.

So it is highly recommended to use these 2 tools together (except if you want to work on only one subband).

Self-Calibration global strategy

The self-Calibration process consists in a unitary tasks, which is looped, and at each cycle (iteration) the image resolution is increased to improve the sky model, allowing a better phase calibration to be performed at the next step.
The unitary task is defined as:

The scheme below shows an example of the self-calibration process. At each cycle, each time chunk is calibrated separately using (BBS), then each of them is flagged using NDPPP. It is necessary to calibrate and flag them separately because it is not possible (at the moment) to calibrate non continuous time concatenated MS using BBS.

After this step (which is parrallelized on 8 cores), the MSs are concatenated in time using msconcat command from pyrap.tables module, then AWimager image the time concatenated MS with specific parameters. To finish, a sky model is extracted using pybdsm on the new image.

The goal is to increase the resolution at each cycle, that means selfcal must include more long baselines at each cycle to have a beam size coherent with the pixel size (at the moment the beam resolution is equal to 2.5 time the pixel size, but it can change in the future). Because at each cycle, the code includes more long baselines, robust parameter changes also to give weights to long baselines. It starts at robust=1 (quite natural weighting) at low resolution (15 times the best resolution) and ends robust=-2 (uniform weighting) after including all baselines available.

On the scheme, an example of the self-calibrated process with:


Self-calibration starts: determination of parameters

Several checks happens when the code is launched like: number of subbands, frequency, number of channel/subbands of each time chunks etc … Each time chunks must have the same structure, if not, the code stops and write an error message in the log (or in the terminal). If checks are alright, selfcal.py will use the central frequency and the longest baseline to determine the best possible beam resolution using: Beam_Resolution = λ/Max(Baseline) ; where λ is the wavelength (calculated with the frequency).

After the determination of the best beam resolution of the data, then the best pixel size is determined by: Beam resolution = 2.5xPixel size (can be change in the future). So, the code determine the beam resolution (and the pixel size) for cycle 0 (iteration 0) by degrading it like: beam resolution cycle 0= 15x best beam resolution.

With the start and end (cycle 0 and cycle N) pixel size and beam resolution, the code estimates for each cycle its major parameters:

After the determination of major parameters for each cycle, data will be copy in the OUTPUT_DIR/Iter0 directory and start with cycle 0.

Then target coordinates are red in the MS to generate a sky model (GSM skymodel based on NVSSS catalogue) using the gsm.py command. The GSM sky model will be used for calibration at cycle 0.

N.B: A memory criterion is computed to avoid to use too much memory. Calibration, Imaging and source extraction process have a huge memory consumption. To avoid to crash, if data are too voluminous, the best beam resolution is degraded and adapted to have a reasonable memory use. On CEP it is suggested to use (for avoiding degradation of the beam resolution) for a typical observation of 10 hours maximum and 1 subband or 5 hours 2 subband etc. This is due to the memory of lce nodes (only 16Gb). On flits, 10 hours with 20 subbands is alright (memory ~260 Gb).

Self-calibration performances

Then the self-calibration runs and looping extracting a more complete sky model at each cycle, improving the phase calibration of the data, and the quality of the final image.
A comparison between a GSM phase calibration (cycle 0) and a complete self-calibration, at the best beam resolution, it appears an improvment of the noise about a factor between 3 to 5. And the final noise level is about 2-3 times the thermal noise.

For more complete information about parameters used, or computed inside see:self-cal in details