Working with MSs
The first thing to know is that the MeqTimba kernel itself does not deal directly with AIPS++ MSs. Instead, the kernel deals with visibility streams, which consist of:
- a VisHeader record, describing the data layout, and carrying auxiliary information (phase centers, antenna positions, etc.) normally found in MS subtables.
- a number of VisTiles. Each VisTile is an N timeslots by M channels by K correlations chunk of data and flags for one interferometer. All data is tiled in the time direction only.
- a VisFooter record, indicating end of data.
The source of an input stream or the destination of an output stream may be a real MS, a flat file, or another process (possibly across the network). The kernel itself does not know or care about this. The mapping between visibility streams and the actual data source is done by objects called event channels.
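To make the header/tiles/footer ordering concrete, here is a minimal standalone Python sketch of a visibility stream. It is only an illustration: the class fields and the make_stream and count_tiles helpers are hypothetical and are not part of the MeqTimba API.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, Tuple, Union


@dataclass
class VisHeader:
    # Hypothetical container for layout and auxiliary information.
    info: dict = field(default_factory=dict)


@dataclass
class VisTile:
    # One interferometer's chunk: N timeslots x M channels x K correlations.
    antennas: Tuple[int, int]
    ntime: int
    nchan: int
    ncorr: int


@dataclass
class VisFooter:
    # Marks the end of the data.
    pass


StreamItem = Union[VisHeader, VisTile, VisFooter]


def make_stream(info: dict, tiles: Iterable[VisTile]) -> Iterator[StreamItem]:
    """Yield a header, then every tile, then a footer."""
    yield VisHeader(info)
    for tile in tiles:
        yield tile
    yield VisFooter()


def count_tiles(stream: Iterable[StreamItem]) -> int:
    """Consume a stream, checking that it starts with a header and ends with a footer."""
    it = iter(stream)
    if not isinstance(next(it), VisHeader):
        raise ValueError("stream must start with a VisHeader")
    ntiles = 0
    for item in it:
        if isinstance(item, VisFooter):
            return ntiles
        ntiles += 1
    raise ValueError("stream ended without a VisFooter")
```

A consumer written this way never needs to know whether the items came from an MS, a flat file, or another process, which is the point of putting event channels between the kernel and the data source.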
Columns And Columns
VisTiles contain columns of data (generally, one or more of DATA, PREDICT and RESIDUALS for correlations, FLAGS for flags, etc.). These columns are distinct from AIPS++ MS columns, so I will try to refer to them as tile columns and MS columns to avoid confusion.
When reading/writing an MS, the system in effect does double mapping on each side of the kernel:
- input mapping
- output mapping
The inner mapping, i.e. between VisTiles and VellSets, is determined by the state records of MeqSink and MeqSpigot nodes. These can be configured to work with any column of the tile (DATA, PREDICT, RESIDUALS), and to treat correlations and flags in various interesting ways. See MeqSink/MeqSpigot for details. This page only deals with the outer mapping of VisTiles to/from MSs.
Event Channels
The MeqServer process maintains two objects called an input channel and an output channel. These objects control where the MeqSpigot and MeqSink nodes get and put their data. The channels are initialized with an input record and an output record as follows:
# Glish example (Python is similar)
mqs.init([output_col="PREDICT"],input=[...],output=[...]);
Each time an init() call is made with an input argument, the input channel is reinitialized with the given record, and begins shooting out a visibility stream -- provided, that is, the record is correct. If no input argument is supplied, the output channel is reconfigured if so specified, but no data is read. The usual mode of operation is to supply both input and output records, and watch the sparks fly.
NB: the output_col field in the first record is a temporary Ugly Kludge(TM) that tells the server what tile columns to initialize in the output. If MeqSinks are told to write to tile column(s) that do not exist in the input tiles, the column(s) must be specified here as a string or a vector of strings containing "DATA", "PREDICT" or "RESIDUALS".
Note that the most recent values of the input and output records are stored in forest state, as the stream sub-record.
Configuring Input Channels
There are three types of input channels: ms_in for reading measurement sets, boio for reading flat files (from the DMI BOIO class -- Block Object Input/Output), and octopussy for publish/subscribe streams. The channel type is selected via the sink_type field of the input record.
MS Input Channels
An MS input channel record has the following general structure:
inputrec := [
sink_type = 'ms_in', # sink type
ms_name = 'test.ms', # MS filename
data_column_name = 'DATA', # which MS column to read
tile_size = 5, # tile size, in time slots
selection = [=], # selection sub-record
python_init = 'read_msvis_header.py', # optional init script
record_input = 'test.ms.boio' # optional, boio file to record input stream to
];
The data_column_name field specifies which MS column is mapped to the DATA column of the tiles. Other tile columns cannot be populated at this time. Note that a typical tree will only be reading one column anyway.
The tile_size field determines the tile size (and therefore snippet domain size), in number of timeslots.
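As a rough illustration of what tiling in the time direction means, the sketch below splits a run of timeslots into consecutive tiles of tile_size slots. The function name is hypothetical, and emitting a shorter final tile when the slots do not divide evenly is an assumption of this sketch, not documented behaviour.

```python
from typing import List, Tuple


def tile_ranges(num_timeslots: int, tile_size: int) -> List[Tuple[int, int]]:
    """Return half-open (start, stop) timeslot ranges, one per tile.

    Assumption: the last tile is simply shorter if num_timeslots is not
    a multiple of tile_size.
    """
    if tile_size < 1:
        raise ValueError("tile_size must be at least 1")
    return [
        (start, min(start + tile_size, num_timeslots))
        for start in range(0, num_timeslots, tile_size)
    ]


# 12 timeslots with tile_size = 5 give two full tiles and one short tile.
print(tile_ranges(12, 5))  # [(0, 5), (5, 10), (10, 12)]
```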
The selection record can be used to extract a subset of the MS. It may contain the following fields:
- channel_start_index: starting channel (default is 0)
- channel_end_index: ending channel (default is -1, for last channel)
- ddid_index: DATA_DESCRIPTION_ID (default is 0)
- field_index: FIELD_ID (default is 0)
- selection_string: any TaQL string for additional selection (default is none)
Only one DATA_DESCRIPTION_ID and FIELD_ID at a time is read in at the moment; multiple ddids and fields have to be represented by separate streams. Note: as customary throughout the system, fields ending with "_index" are 1-based in Glish and 0-based everywhere else, with automatic adjustment done.
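The channel defaults can be read as in the following standalone Python sketch, which works in the 0-based convention used outside Glish. The function name is hypothetical, and treating channel_end_index as inclusive is an assumption based on the phrase "ending channel".

```python
from typing import Tuple


def resolve_channels(num_channels: int, start: int = 0, end: int = -1) -> Tuple[int, int]:
    """Turn channel_start_index/channel_end_index into a concrete 0-based range.

    Assumptions: both indices are 0-based, the range is inclusive, and
    -1 stands for the last channel.
    """
    if end == -1:
        end = num_channels - 1
    if not 0 <= start <= end < num_channels:
        raise ValueError("channel selection is outside the available channels")
    return start, end


print(resolve_channels(64))         # (0, 63): the defaults select every channel
print(resolve_channels(64, 8, 15))  # (8, 15)
```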
The python_init field is described in MeasurementSetHeaders.
The record_input field may be specified to capture a copy of the input visibility stream to the named file, as a flat (BOIO) file. Large MSs take a long time to read and tile properly; reading a BOIO file can be up to 4-5x faster. The captured file may be read in later by using a BOIO input sink. Which brings us to...
BOIO Input Sinks
A BOIO input sink record looks like this:
input := [ sink_type = 'boio', # sink type
boio_file_name = 'test.ms.boio' # input file
];
The input file must be a flat BOIO file containing a visibility stream. Such files are produced either by capturing the input of another sink via record_input, or by writing the output to a BOIO sink (see below).
Configuring Output Sinks
Likewise, there are three types of output sinks, ms_out for writing to a measurement set, boio for writing flat files, and octopussy for publish/subscribe streams. The sink type is selected via the sink_type field of the output record.
MS Output Sinks
At this time, MS output sinks cannot create new MSs of their own. Instead, they dump their data to the same MS that the input stream came from (this information is contained in the VisHeader). If a BOIO sink is providing the input stream, the MS that was used to capture the original stream is used. If the MS does not exist (or does not match the layout of the output stream), an error is reported.
An MS sink output record looks like this:
output := [ sink_type = 'ms_out', # sink type
# data_column = 'DATA', # optional tile column mappings
# predict_column = 'MODEL_DATA',
# residuals_column = 'CORRECTED_DATA'
flag_mask = 0 # output flag mask, 0 for none
];
The _column fields specify where to put the tiles' output columns. Typically, MeqSinks will be configured to dump data into the PREDICT or RESIDUALS column of a tile; these columns are then mapped to MS columns according to the _column fields. If a _column field is missing, then that tile column is ignored. Note that if the named MS column does not exist, the sink will create it; however, many AIPS++ tools (imager, etc.) only support the three standard columns named here, so this is of limited use.
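The mapping rule can be sketched in a few lines of standalone Python. This is an illustration only: the record is modelled as a plain dict, and the ms_column_map helper is hypothetical rather than part of MeqTimba.

```python
from typing import Dict

# Tile column name -> name of the output record field that maps it.
TILE_COLUMN_FIELDS = {
    "DATA": "data_column",
    "PREDICT": "predict_column",
    "RESIDUALS": "residuals_column",
}


def ms_column_map(output_rec: dict) -> Dict[str, str]:
    """Return {tile column: MS column} for the _column fields that are present.

    Tile columns whose _column field is missing are left out, i.e. ignored.
    """
    return {
        tile_col: output_rec[field_name]
        for tile_col, field_name in TILE_COLUMN_FIELDS.items()
        if field_name in output_rec
    }


rec = {"sink_type": "ms_out", "predict_column": "MODEL_DATA", "flag_mask": 0}
print(ms_column_map(rec))  # {'PREDICT': 'MODEL_DATA'}
```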
The flag_mask field controls the writing of data flags. Tile flags are 32-bit masks, while AIPS++ flags are boolean. Tile flags are bitwise-ANDed with the mask to yield a boolean. If the mask is 0, then tile flags are not written out.
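The flag conversion amounts to a bitwise AND followed by a test against zero, as in this standalone Python sketch. The function name is hypothetical, and returning None to mean "flags are not written" is a convention of this sketch only.

```python
from typing import List, Optional


def tile_flags_to_ms_flags(tile_flags: List[int], flag_mask: int) -> Optional[List[bool]]:
    """Collapse integer tile flag masks into boolean MS flags.

    A flag_mask of 0 means no flags are written, signalled here by None.
    """
    if flag_mask == 0:
        return None
    return [(flags & flag_mask) != 0 for flags in tile_flags]


# With mask 0b10, only tile flags that have bit 1 set become True.
print(tile_flags_to_ms_flags([0, 1, 2, 3], 0b10))  # [False, False, True, True]
```

Using a mask lets the output side decide which of the 32 tile flag bits should count as "flagged" in the MS, without the tree having to know about it.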
An MS output sink with no _columns specified and a flag_mask of 0 is the sink equivalent of /dev/null.
BOIO Output Sinks
BOIO output is an order of magnitude faster than writing to an MS, so it may be good to use it when you're just experimenting. A separate command-line tool exists (NB: almost...) to write a BOIO file into an MS.
A BOIO output sink record looks like this:
output := [ sink_type = 'boio', # sink type
boio_file_name = 'test.ms.boio', # output file
boio_file_mode = 'W' # W to write new file, A to append
];
The sink simply dumps the stream to a flat BOIO file. Append mode may be used to concatenate multiple streams, which BOIO input sinks can automatically read in one by one.
An Example
Here's a Glish example of making event sinks, MeqSpigots and MeqSinks work together:
# initialize meqserver
mqs.init([output_col="PREDICT"],
output=[predict_column='MODEL_DATA',flag_mask=0]);
# no stream is read yet since no input record is specified
# ...
# create spigot
spigrec1 := meqnode('MeqSpigot','spigot1');
spigrec1.input_col := 'DATA';
spigrec1.station_1_index := 1;
spigrec1.station_2_index := 2;
mqs.meq('Create.Node',spigrec1);
#
# ...
# create sink
sinkrec := meqnode('MeqSink','sink1',children="compare");
sinkrec.output_col := 'PREDICT';
sinkrec.station_1_index := 1;
sinkrec.station_2_index := 2;
mqs.meq('Create.Node',sinkrec);
# ...
inputrec := [ sink_type='ms_in',
ms_name = 'test.ms',
data_column_name = 'DATA',
tile_size=5,
selection = [=] ];
# this starts the input stream
mqs.init(input=inputrec);
This example maps the MS "DATA" column via the tile DATA column to the input of a MeqSpigot, and maps the output of a MeqSink via the tile PREDICT column to MS column "MODEL_DATA".
