Version 1.9 Build 803
Next: Summary of changes
Up: MeasurementSet definition version 2.0
Previous: Summary
The MeasurementSet (MS) defines the format in which visibility and
single-dish data are stored in AIPS++ (Wieringa and Cornwell
1996). The format has been chosen to accommodate synthesis and
single-dish data from a variety of instruments in as broad a framework
as possible. In addition, it has been designed to be compatible with
the requirements of the measurement equation formalism (Noordam 1996),
which has been adopted to model instrumental errors in the AIPS++
calibration system.
The original definition will be referred to as version 1.0 in what
follows. The reasons for the modifications proposed here are as
follows:
- VLBI data reduction: Some changes are necessary in v1.0 to
accommodate the requirements of VLBI data processing. An extension of
the basic synthesis MeasurementSet format is presented to provide a
common VLBI data format.
- Synthesis calibration: General synthesis development
currently in progress requires some modifications to the existing
format, particularly in support of cross-calibration. Some changes
are also proposed to meet more recent requirements in areas such as mosaicing.
- Single-dish processing: Some changes have been proposed to
enhance support for single-dish processing and to ensure greater
compatibility between single-dish and synthesis data reduction.
- Accumulated changes: Several changes of a general nature
have been proposed since v1.0, and this is a good time to evaluate
and incorporate these as appropriate.
This is an opportune time to revise the MeasurementSet, as an
increased level of synthesis development is underway. Later revisions
will have a broader impact on existing code.
The design philosophy underlying this MS definition is summarized in
terms of the following objectives:
- Incremental change: The changes proposed here are designed
to be as incremental as possible and no extensive re-design has been
attempted. The scientific benefit of each modification has been weighed
against the scope of the proposed change to the MS design.
- Compatibility: Compatibility between single-dish and
synthesis data has been retained within one basic MS definition. The
proposed VLBI extensions are constrained to be compatible with both
the basic synthesis format and across existing VLBI networks and
correlators.
- Separation of information: A fundamental distinction is
made between a priori information which is known at the time of
observation or shortly thereafter, and calibration information
subsequently derived in post-processing. The MS definition is
primarily designed to encompass a priori information. The format of
calibration tables is given elsewhere.
- Storage: A future document will define standard Data
Managers for each column, which are recommended but not
required. Attention has been paid to the physical file sizes implied
by the MS definition, particularly for larger datasets, and storage
managers for a standard compressed MS format will be discussed in the
same document.
- Combining measurement sets: The MS definition has been
revised to facilitate the combining of diverse observational datasets
within one MeasurementSet if required, in a manner that is compatible
with the calibration system. This does not imply that observations to
be processed jointly must first have their underlying MeasurementSets
combined, since applications will support the ability to process
groups of MeasurementSets. However, the MeasurementSet design needs to
provide the facility to combine data.
The combining of different observations within one MeasurementSet
is permitted subject to the conventions of Section 3.3, which
separates data primarily by OBSERVATION_ID, but also by
PROCESSOR_ID and ARRAY_ID. Cross-calibration between observations
is, however, subject to the capabilities of the calibration
system. Specialized inter-conversion of calibration information may be
necessary if the calibration in separate observations is sufficiently
disjoint. For example, transferring calibration between frequencies
sufficiently far apart will likely be performed in an intermediate
step, using external utilities.
An application is envisaged to combine data from one or more
separate MeasurementSets, creating a new output MS by copying the
input data row by row, and renumbering the indices and interleaving
data as required. No sort order is prescribed for a MeasurementSet in
general, but it is expected that data sorted in time order will be
most useful to the broadest range of applications.
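The row-by-row copy with index renumbering described above can be sketched as follows. This is only an illustration under stated assumptions: each MeasurementSet is modelled as a plain Python dictionary of lists (a hypothetical layout, not the AIPS++ table interface), only the ANTENNA and OBSERVATION sub-tables are carried, and the function name `combine` is invented for this example. The output is sorted in time order because that is the ordering the text expects to be most broadly useful, not because it is required.

```python
def combine(measurement_sets):
    """Combine toy MS dictionaries into one, renumbering sub-table indices.

    Each input is assumed (for illustration only) to look like
    {"ANTENNA": [...], "OBSERVATION": [...], "MAIN": [row, ...]},
    where every MAIN row is a dict holding TIME, ANTENNA1, ANTENNA2
    and OBSERVATION_ID.
    """
    out = {"ANTENNA": [], "OBSERVATION": [], "MAIN": []}
    for ms in measurement_sets:
        # Rows appended to a sub-table shift every index that points into it.
        ant_offset = len(out["ANTENNA"])
        obs_offset = len(out["OBSERVATION"])
        out["ANTENNA"].extend(ms["ANTENNA"])
        out["OBSERVATION"].extend(ms["OBSERVATION"])
        for row in ms["MAIN"]:
            new_row = dict(row)  # copy, so the input MS is left untouched
            new_row["ANTENNA1"] = row["ANTENNA1"] + ant_offset
            new_row["ANTENNA2"] = row["ANTENNA2"] + ant_offset
            new_row["OBSERVATION_ID"] = row["OBSERVATION_ID"] + obs_offset
            out["MAIN"].append(new_row)
    # Interleave the copied rows in time order (stable sort).
    out["MAIN"].sort(key=lambda r: r["TIME"])
    return out


ms_a = {
    "ANTENNA": ["A0", "A1"],
    "OBSERVATION": ["obsA"],
    "MAIN": [
        {"TIME": 10.0, "ANTENNA1": 0, "ANTENNA2": 1, "OBSERVATION_ID": 0},
        {"TIME": 30.0, "ANTENNA1": 0, "ANTENNA2": 1, "OBSERVATION_ID": 0},
    ],
}
ms_b = {
    "ANTENNA": ["B0", "B1"],
    "OBSERVATION": ["obsB"],
    "MAIN": [
        {"TIME": 20.0, "ANTENNA1": 0, "ANTENNA2": 1, "OBSERVATION_ID": 0},
    ],
}
combined = combine([ms_a, ms_b])
```

In this sketch the rows of the second input keep a distinct OBSERVATION_ID after renumbering, so the observations remain separable within the combined MS. A real combining application would need to renumber every index column and sub-table, not just the two shown here.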
Specific principles adopted in the design are given below:
- Signal path: The MeasurementSet provides a format to
represent data from a generic radio-telescope or interferometer. Along
with the basic observed data, for which a limited set of accepted
types are specified in the MAIN table section, there are associated
data characterizing the state of the instrument as a whole. These
include: i) abstracted antenna properties of components in a generic
telescope, such as feeds and spectral windows, which serve also to
label the output data; ii) external information, such as flagging,
history or weather data; and iii) instrument-specific back-end data
which may be difficult to represent in a completely generic form. The
final category includes intermediate data and state information from
devices such as correlators, radiometers, spectrometers or pulsar
timers, amongst others, where they do not overlap with abstracted
antenna properties defined elsewhere. This state information may be
used in computing or initializing calibration corrections. Thus the MS
represents the signal path and state of the instrument, using as
generic an interface as possible, but allowing specialization where
appropriate. This conforms closely with the overall calibration
model.
- Use of Measures: Some columns in the MeasurementSet
require coordinate and unit specification. This is done in a manner
compatible with the AIPS++ Measures system. Measure frame information
is implicit in the underlying MS data. Row-based measures are avoided
wherever possible due to the overhead this would often impose on the
data reduction system through frequent coordinate conversion to a
common reference frame. Thus, column-based measures are the default
unless otherwise noted. The only place in which a row-based Measure
reference is currently allowed is the frequency axis in the
SPECTRAL_WINDOW sub-table, where it is supplied in order to allow an
efficient representation of Doppler tracking. Column-based Measures of
a specific type (e.g. EPOCH) should have a common reference across
the MS as a whole, unless there is a compelling reason otherwise. This
requirement is necessary for efficiency and for minimizing
transformations when combining diverse MS datasets. No standard
reference is enforced for any given Measure type; these should be
chosen prudently. Units are also required to be column-based, unless a
row-based Measure is allowed for a given column. Recommended units are
specified for each Measure column. Access to the MS is assumed to take
place through the MS access classes. TableMeasures will be used
wherever practical in the MS access classes.
- Relative indexing: All indices for antennas, feeds,
spectral windows, or related quantities are assumed to start at zero.
Thus for direct indices into sub-tables, such as SPECTRAL_WINDOW_ID,
a value ID=n maps to row=n. MS indexing is zero-based in C++, and
one-based in Glish, as before.
- Sub-table completeness: Not all data represented in
associated sub-tables are assumed to be present in MAIN, although this
is encouraged. For example, extra antennas may appear in the ANTENNA
sub-table, even if they have no associated data in MAIN.
- Blanking and defaults: Measured data are blanked by
setting associated Boolean flag data, and magic value blanking is not
used. For integer values constrained to be non-negative, or sub-table
indices, however, a value of -1 will generally denote an unset
value. All required columns should be filled with suitable defaults if
not actually used.
- Times and intervals: For quantities which may vary with
time, an INTERVAL of zero implies a constant value with no
time-dependence, while a negative INTERVAL implies that the value is
valid until re-defined. The latter case accommodates values that are
time-stamped but have an undefined or unknown period of validity.
- Non-standard columns: The naming convention for
non-standard columns remains the same as in MS definition v1.0,
namely: i) general, non-supported columns start with the prefix
NS_$SITE_, as in NS_NRAO_WHATEVER; and ii) columns supported by a
consortium site start with the prefix $SITE_, as in
NRAO_WHATEVER. Standard columns, such as those described in this
definition, are not permitted to use either of the above two prefix
forms. Non-standard columns may not be defined to store data already
stored in standard columns. In addition, applications are not required
to support non-standard columns.
- Archiving: MeasurementSet data are assumed to be archived
in an external format such as FITS or HDF (Hierarchical Data Format),
in as lossless a format as possible. This format will be
specified separately. Changes to the MS format will not be made
without a compelling scientific reason, but the format will, of
necessity, evolve over time. The archiving application has the
responsibility to restore the data to the latest format. It is deemed
too complex for the underlying MS access code, and data reduction
code, to be made capable of recognizing multiple MS revisions at the
C++ level. A utility is envisaged, however, to convert MS v1.0 data to
the MS v2.0 format.
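Several of the conventions listed above (relative indexing, blanking and defaults, times and intervals, and non-standard column names) are simple enough to express as small checks. The following is a minimal sketch, not the MS access classes: all function names are hypothetical, sub-tables are modelled as Python lists, and the list of site names is an invented example. Two points go beyond what the text states and are assumptions of this sketch: a positive INTERVAL is taken to be centred on the time stamp, and a negative INTERVAL is taken to mean validity from the time stamp onward.

```python
UNSET = -1  # conventional "unset" value for sub-table indices


def lookup(subtable, index):
    """Return the sub-table row for a zero-based ID, or None if unset."""
    if index == UNSET:
        return None
    if index < 0:
        raise ValueError(f"invalid sub-table index: {index}")
    return subtable[index]  # ID = n maps directly to row n


def unflagged(values, flags):
    """Keep only values whose Boolean flag is not set (no magic values)."""
    return [v for v, flagged in zip(values, flags) if not flagged]


def is_valid_at(t, time, interval):
    """Decide whether a time-stamped value applies at time t.

    interval == 0: constant value, no time-dependence.
    interval <  0: valid until re-defined; assumed here to start at `time`.
    interval >  0: assumed here to be centred on `time`.
    """
    if interval == 0:
        return True
    if interval < 0:
        return t >= time
    half = interval / 2.0
    return time - half <= t <= time + half


def column_kind(name, sites=("NRAO",)):
    """Classify a column name by its prefix; `sites` is an example list."""
    for site in sites:
        if name.startswith(f"NS_{site}_"):
            return "non-standard, unsupported"
        if name.startswith(f"{site}_"):
            return "non-standard, site-supported"
    return "standard"
```

For example, `lookup(spectral_windows, row["SPECTRAL_WINDOW_ID"])` would return the matching sub-table row directly, and `column_kind("NS_NRAO_WHATEVER")` would report a non-standard, unsupported column. The classification needs an explicit site list because a bare prefix test cannot tell a site name from the first word of a standard column name.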
Suggestions regarding this revision of the MS definition have been
contributed broadly from within the AIPS++ project. No specific
attribution is given for each change but the accompanying text
reflects the rationale behind the modification and includes relevant
points that were raised in public discussion.
Please send questions or comments about AIPS++ to aips2-request@nrao.edu.
Copyright © 1995-2000 Associated Universities Inc.,
Washington, D.C.
2004-08-28