User Software :: CR-Tools :: DataReader
Overview
The DataReader class implements a processing framework that is applied to the data before they enter further processing.
Fields in the Header Record
This is the list of (mandatory) fields in the header record (as accessed with dr.header()) of a DataReader object. The mandatory fields have to be set by all child classes of the DataReader in order to be usable by the (upcoming) standard tools.
(* Mandatory)
The field names are case sensitive and must be put into the record exactly as they are written here.
Filestream positions
The DataReader handles progression through the data volume via a set of DataIterator objects, providing N positions for N data streams. These stream- and position pointers allow a variety of access schemes:
- Access to the same segment in multiple streams, e.g. when reading raw data recorded with the LOFAR ITS.
- Access to different segments in multiple streams
- Access to different segments within a single stream, e.g. when reading data from a LopesEvent file.
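The relation between a per-stream position and a byte offset in the data file can be sketched as follows. This is only an illustration and not the actual DataIterator class: the struct StreamPosition, its members and the helper sameSegment() are hypothetical names. Only the notions of a data start offset and a step width are taken from the setStreams() example further down this page.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one stream's position pointer (not the real DataIterator)
struct StreamPosition {
  std::size_t dataStart = 0;  // bytes of header preceding the data
  std::size_t stepWidth = 1;  // bytes per sample, e.g. sizeof(short)
  std::size_t block     = 0;  // index of the current block within this stream

  // Byte offset of the first sample of the current block
  std::size_t offset (std::size_t blocksize) const {
    assert(stepWidth > 0);
    return dataStart + block * blocksize * stepWidth;
  }
};

// N streams get N independent positions; here all of them point to the same block
std::vector<StreamPosition> sameSegment (std::size_t nofStreams,
                                         std::size_t stepWidth,
                                         std::size_t block)
{
  std::vector<StreamPosition> positions (nofStreams);
  for (StreamPosition &p : positions) {
    p.stepWidth = stepWidth;
    p.block     = block;
  }
  return positions;
}
```

Because every stream keeps its own position, the second and third access scheme follow from changing the block member of individual entries after the call to sameSegment().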
Data flow
The figure below illustrates the data flow inside the DataReader:
There is the option to insert a Hanning Filter step before performing the Fourier transform; this can be used to reduce the sidelobes in the frequency domain, originating from cutting out a block of data (which is equivalent to the multiplication of the data with a box function).
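The windowing step can be sketched as follows. This is a minimal illustration under the assumption of the common symmetric definition w[n] = 0.5 (1 - cos(2 pi n / (N-1))); the function names hanningWindow() and applyWindow() are hypothetical, and the filter actually used inside the DataReader may be defined differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hanning window weights, w[n] = 0.5 * (1 - cos(2*pi*n/(N-1))), for a block of N samples
std::vector<double> hanningWindow (std::size_t blocksize)
{
  assert(blocksize > 1);
  const double pi = std::acos(-1.0);
  std::vector<double> w (blocksize);
  for (std::size_t n = 0; n < blocksize; ++n) {
    w[n] = 0.5 * (1.0 - std::cos(2.0 * pi * n / (blocksize - 1)));
  }
  return w;
}

// Multiply one block of time-series samples with the window, in place
void applyWindow (std::vector<double> &block, const std::vector<double> &window)
{
  assert(block.size() == window.size());
  for (std::size_t n = 0; n < block.size(); ++n) {
    block[n] *= window[n];
  }
}
```

The weights fall to zero at both ends of the block, which removes the sharp edges of the box function that cause the sidelobes.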
Performance
A clear trend can be seen when going towards smaller blocksizes, i.e. smaller units in which the data are read from disk. One possible approach for tuning the performance would be to read multiple blocks from disk at once and then dispatch them one after another to the requesting routine; this of course requires some intelligence to be built into the data reading code in order to do the bookkeeping.
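A minimal sketch of such a read-ahead scheme is given below. It is not part of the DataReader: the class BufferedBlockReader and all of its members are hypothetical, and the sketch assumes a single stream of short integer samples without a header.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this out in memory
#include <vector>

// Hypothetical helper: fetches several blocks per disk access and hands them out one by one
class BufferedBlockReader {
  std::istream &stream_p;
  std::size_t blocksize_p;      // samples per block
  std::size_t blocksPerRead_p;  // blocks fetched per disk access
  std::vector<short> buffer_p;  // samples currently held in memory
  std::size_t next_p;           // index of the next unread sample in buffer_p

  // Bookkeeping: replace the buffer contents with the next chunk from the stream
  void refill () {
    buffer_p.resize (blocksize_p * blocksPerRead_p);
    stream_p.read (reinterpret_cast<char*>(buffer_p.data()),
                   static_cast<std::streamsize>(buffer_p.size() * sizeof(short)));
    // keep only the samples that were actually read
    buffer_p.resize (static_cast<std::size_t>(stream_p.gcount()) / sizeof(short));
    next_p = 0;
  }

public:
  BufferedBlockReader (std::istream &stream,
                       std::size_t blocksize,
                       std::size_t blocksPerRead)
    : stream_p(stream), blocksize_p(blocksize),
      blocksPerRead_p(blocksPerRead), next_p(0)
  {
    assert(blocksize > 0 && blocksPerRead > 0);
  }

  // Returns the next block; the result is empty once no complete block is left
  std::vector<short> nextBlock () {
    if (next_p + blocksize_p > buffer_p.size()) {
      refill ();
    }
    if (next_p + blocksize_p > buffer_p.size()) {
      return std::vector<short>();
    }
    std::vector<short> block (buffer_p.begin() + next_p,
                              buffer_p.begin() + next_p + blocksize_p);
    next_p += blocksize_p;
    return block;
  }
};
```

The requesting routine still sees one block per call, while the number of disk accesses drops by the factor blocksPerRead. A trailing incomplete block is silently dropped in this sketch.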
Development
Adding a new data format
The DataReader framework has been set up in such a way that adding the capability to read data from new data formats is kept as simple as possible:
- DataReader works as base class, from which all data type specific classes are derived; this way the internal data processing framework is kept.
- Only reimplement the function performing the actual input from the data file, returning a standard product to the internal pipeline.
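The pattern behind these two points can be sketched with a heavily reduced stand-in for the base class. All names below (MiniReader, ConstantFormat, voltage()) are hypothetical and only illustrate the division of labour; the real DataReader interface is considerably larger and uses different types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily reduced stand-in for the base class
class MiniReader {
  std::vector<double> adc2voltage_p;  // one conversion weight per antenna
public:
  explicit MiniReader (const std::vector<double> &adc2voltage)
    : adc2voltage_p(adc2voltage) {}
  virtual ~MiniReader () {}

  // Format-specific input: raw ADC values, indexed as [antenna][sample]
  virtual std::vector< std::vector<double> > fx () = 0;

  // Shared processing step, inherited unchanged by every format: ADC counts to voltages
  std::vector< std::vector<double> > voltage () {
    std::vector< std::vector<double> > data (fx());
    assert(data.size() == adc2voltage_p.size());
    for (std::size_t ant = 0; ant < data.size(); ++ant) {
      for (std::size_t n = 0; n < data[ant].size(); ++n) {
        data[ant][n] *= adc2voltage_p[ant];
      }
    }
    return data;
  }
};

// A new data format only supplies the input function
class ConstantFormat : public MiniReader {
public:
  ConstantFormat () : MiniReader (std::vector<double>(2, 0.5)) {}

  std::vector< std::vector<double> > fx () {
    // Placeholder for reading from disk: two antennas, three samples each
    return std::vector< std::vector<double> >(2, std::vector<double>(3, 4.0));
  }
};
```

Code that works on a MiniReader reference never needs to know which format is behind it, which is the property the framework relies on.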
At the present time, the following classes are part of the data input framework:
Example
- In the header file of the new class (here: ITSBeam.h) define a private variable datatype_p, which is of the type in which the data are stored in the data file.
  class ITSBeam : public DataReader {
    //! Information contained in experiment.meta are stored in their own object
    ITSMetadata metadata_p;
    //! Type as which the data are stored in the data file
    float datatype_p;
  public:
    //! Get the raw time series after ADC conversion
    Matrix<Float> fx ();
  protected:
    //! Connect the data streams used for reading in the data
    Bool setStreams ();
  };
  The two methods are reimplemented from the DataReader class; a detailed description is given below.
- In the implementation file (here: ITSBeam.cc) we need to reimplement two functions, which are already defined as virtual functions in the DataReader class:
  - setStreams(): connect the data streams used for reading in the data from disk
    Bool ITSBeam::setStreams ()
    {
      bool status (true);
      uint blocksize (blocksize_p);
      Vector<uint> antennas (metadata_p.antennas());
      Vector<Float> adc2voltage (DataReader::adc2voltage());
      Matrix<Complex> fft2calfft (DataReader::fft2calfft());
      Vector<String> filenames (metadata_p.datafiles(true));
      DataIterator *iterator;

      /*
        Configure the DataIterator objects: for ITSBeam data, the values are
        stored as short integer without any header information preceding the
        data within the data file.
      */
      uint nofStreams (filenames.nelements());
      iterator = new DataIterator[nofStreams];

      for (uint file (0); file<nofStreams; file++) {
        // data are stored as short integer
        iterator[file].setStepWidth(sizeof(datatype_p));
        // no header preceding data
        iterator[file].setDataStart(0);
      }

      /* Setup of the conversion arrays */
      uint nofAntennas (antennas.nelements());
      uint fftLength (blocksize/2+1);
      IPosition shape (fft2calfft.shape());

      if (adc2voltage.nelements() != nofAntennas) {
        double weight (adc2voltage(0));
        adc2voltage.resize (nofAntennas);
        adc2voltage = weight;
      }

      if (uint(shape(0)) != fftLength || uint(shape(1)) != nofAntennas) {
        fft2calfft.resize (fftLength,nofAntennas);
        fft2calfft = 1.0;
      }

      // -- call to DataReader::init(...) ------------------------------------------
      DataReader::init (blocksize, antennas, adc2voltage, fft2calfft, filenames, iterator);
      DataReader::setNyquistZone (1);

      return status;
    }
    Even though the setup of the DataIterator can be quite different for your specific data format, the single most important instruction, which needs to be issued before calling DataReader::init, is
    iterator[file].setStepWidth(sizeof(datatype_p));
    which takes care of adjusting the width by which the stepping through the data volume is done. Once all parameters have been set up correctly, they are passed to the base class.
  - fx(): reads in the data and formats them into one of the standard products within the processing chain internal to the DataReader. Keep in mind that your data may be something other than the raw time series after ADC, in which case you will need to reimplement another method (e.g. fft()).
Internal data initialization
Usage with C/C++ code
- Creation of a new DataReader object:
  #include <lopes/IO/DataReader.h>
  #include <lopes/Data/LopesEvent.h>

  DataReader *dr;
  LopesEvent *le = new LopesEvent (eventfile,     // location of the LopesEvent file
                                   blocksize,     // nof. samples per block of data
                                   adc2voltage,   // conversion weights [optional]
                                   fft2calfft);   // calibration weights [optional]
  dr = le;
  Since the conversion arrays are optional, you can even use the simpler construction method:
  LopesEvent *le = new LopesEvent (eventfile,    // location of the LopesEvent file
                                   blocksize);   // nof. samples per block of data
- Reading in data for processing:
  for (int block(0); block<nofBlocks; block++) {
    fft = dr->fft();
  }
- Data selection: Selection of frequency channels and antennas can be performed directly within the DataReader, e.g.
  dr->setSelectedChannels (selection);
  where selection is an array of booleans of length fftLength.
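How such a selection array relates to the blocksize can be sketched as follows. The helper functions fftLength(), channelRange() and selectChannels() are hypothetical and use standard library containers instead of the array classes of the DataReader. The relation fftLength = blocksize/2+1 is taken from the setStreams() example above, while dropping the deselected channels (rather than e.g. setting them to zero) is an assumption of this sketch.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Number of frequency channels for one block, as computed in setStreams()
std::size_t fftLength (std::size_t blocksize)
{
  return blocksize / 2 + 1;
}

// Boolean selection covering the channels first..last (inclusive)
std::vector<bool> channelRange (std::size_t blocksize,
                                std::size_t first,
                                std::size_t last)
{
  const std::size_t n = fftLength (blocksize);
  assert(first <= last && last < n);
  std::vector<bool> selection (n, false);
  for (std::size_t k = first; k <= last; ++k) {
    selection[k] = true;
  }
  return selection;
}

// Keep only the selected channels of one antenna's spectrum
std::vector< std::complex<double> >
selectChannels (const std::vector< std::complex<double> > &spectrum,
                const std::vector<bool> &selection)
{
  assert(spectrum.size() == selection.size());
  std::vector< std::complex<double> > selected;
  for (std::size_t k = 0; k < spectrum.size(); ++k) {
    if (selection[k]) {
      selected.push_back (spectrum[k]);
    }
  }
  return selected;
}
```

For a blocksize of 8 the selection array has 5 entries, and channelRange(8, 1, 3) marks the three inner channels 1 to 3.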