3C147_spw1
Flagging
For starters, I flagged channels 0:3, 59:63, and 5,7,9,... The reason for flagging odd channels is that we use Hanning tapering, and then calibrate on even channels only. However, the imager is unable to use channel stepping (due to a bug -- confirmed by Kumar Golap), so it's easier to just flag the odd channels so that the imager ignores them. A kludge, but it works.
Then I used autoflag to flag RFI and such:
include 'autoflag.g' af:=autoflag('3C147_spw1.MS') af.settimemed(fignore=T) af.setuvbin(plotchan=T,nbins=100,thr=.01) af.setdata() af.run(plotdev=[2,2])Antenna C had lots of RFI, but autoflag dealt with it wonderfully.
Idea: should make a separate "lightweight autoflagger" and put it into casacore. In the meantime, can make a TDL interface to the sucker (which would currently generate glish code on-the-fly), and incorporate the flag transfer job into it, which sits oddly in view-ms.
Idea: eventually need to flag on solutions (i.e. flag ewhere |G| goes wild). I can build a tree for this...
Calibration
18/09/08:
- Using 10 sources from NEWSTAR model, need to check if that is sufficient.
- First calibrate Gdiag, then Bdiag (NOT the other way around)
- Only use XX/YY, as one of the cross-corrs still has mysterious signal
- Source coherency decomposition actually makes things slower when solving (due to less efficient use of cache), so disabled it.
- Antenna 7 has garbage. C and D are their usual misbehaved self.
- Made G/B solutions without antennas 7,C,D. Residuals are wonderful. Should now try a 7,C,D solution while keeping the rest fixed.
Idea: make a "job manager". Every time a TDL job is run, save the current options (and the job name) into a binary file. Provide an interface to save this as a "job profile" with a descriptive name. Then, we can run job profiles with a single click of the mouse...
19/09/08:
Implemented ParmGroup.Subgroup, to make it easier to switch between parameter sets
- Use 20 sources in sky model, solve G-B again for antennas 0:6,8:11. Residuals within 0.1.
Now solving for ants 7,C,D only. For some reason, residuals on other baselines are now fucked up!!! This seems impossible, gotta be a bug. Somehow, the predict vor visibility:0:8 changes sign depending on whether we build the tree for a subset of antennas or for all antennas! OK, this was a bug in IfrArray.py, the station indices for the UVW spigots were wrong when dealing with subsets of stations.
- OK, this bug has cleared up a lot of problems. Doing a separate G/B solution for 7,C,D produces good residuals, with typical residuals on baselines involving 7,C,D within 0.5. Residuals also show clear outliers, so the obvious next step is to flag on those.
- But before, let's try a simultaneous solution on all baselines, and compare residual levels. Just to see if 7/C/D will ruin the solutions of the other antennas. Answer: residuals look roughly the same, so I'll abandon the separate solutions gig for now. Note that antenna C/D residuals are now quite good, it's only 7 that's screwing things up.
- flagged residuals with:
af:=autoflag('3C147_spw1.MS') af.settimemed(column='CORR') af.setuvbin(plotchan=T,nbins=100,thr=.01,column='CORR') af.setdata() af.run(plotdev=[2,2])
- next problem: view-ms does not show the same CORRECTED_DATA column as calico-wsrt did. Solution: disable hanning tapering, moron! No point tapering CORRECTED_DATA when every other channel is not written in the first place!
Idea: make a MEP table manager, launch as a widget from within the browser. Allow to make backups of all MEP tables, with descriptive comments, restore backups, and view contents using some kind of plotter.
Idea: add antenna selection to imaging options. We want to try imaging with and without the antenna 7 baselines.
20/09/08:
- Residual flagging above flagged too much (esp. on shorter baselines), but ignored the outliers on baselines with ant 7. Tried some more conservative flagging like this:
af:=autoflag('3C147_spw1.MS') af.setdata() af.setuvbin(plotchan=T,nbins=100,thr=.01,column='CORR') af.run(plotdev=[2,2]) af:=autoflag('3C147_spw1.MS') af.settimemed(column='CORR',hw=20,rowhw=20) af.setdata() af.run(plotdev=[2,2])Flags saves as RESIDUAL1 flagset.
- Made images without antenna 7, and with antenna 7:
.
- Looks like we need to flag all baselines with antenna 7, and be done with it...
- Tony suggests throwing out short baselines during calibration, since the model only contains point sources. Makes sense to me, now what's a sensible UV-distance for this threshold?
Idea: add antenna subset selector to imaging control. Currently done with TaQL string: ANTENNA1 NOT IN [7] && ANTENNA2 NOT IN [7].
Idea: add UV-distance selector to MSUtils. Can be done with TaQL string such as sumsqr(UVW) > x
21/09/08:
- Repeated G/B solutions with flagged data. Residuals look nice, incl. short baselines, so I won't be throwing them out.
- Tony pointed out the spokes in the map above. Made B a 1-st degree polynomial in time, and the spokes are gone. Next step is to solve for off-diagonals.
Note a semi-regular interference pattern over the field. Could this correspond to the crap we see on C/D baselines in the residual plot below?
.
22/09/08:
- repeated flagging on residuals, and also killed antenna 7:
af:=autoflag('3C147_spw1.MS') af.setdata() af.setuvbin(plotchan=T,nbins=100,thr=.01,column='CORR') af.settimemed(column='CORR',hw=20,rowhw=20) af.setselect(ant=['RT7']) af.run(plotdev=[2,2])Flags saved as RESIDUAL2 flagset.
- This seems to have removed the interference pattern over the field, and improved residuals on C/D baselines. See residuals and image:
Idea: integrate the backup of parameter sets, residual images and flag reports. Keep them in subdirectories in the MS. Automatically generate an HTML log. What's needed then is automatic generation of residual plots from the collector.
- Here's the same image with an LSM overlay. It appears we need to solve for some source parameters (e.g. NEWS7, NEWS1000, NEWS1009)
- Note pitfall: we NEED to use a custom selection for imaging, due to the channel selection bug pointed out under Flagging at the top of this page. So, to image every even channel from 4 to 58, we need to flag the odd channels, then tell the imager to image channels 4-58 with a stepping of 1. If we use the default selection (4-58, stepping 2), this tells the imager to add 27 channels with a step of 2, it cheerfully ignores the step parameter and ends up imaging 4-31 only...
- Solving for I and Q of NEWS7, NEWS1000, NEWS1009 (tile size 480) did not produce any improvement.
- Solving for I, Q and positions improved things appreciably. Still something left from NEWS1000, will leave this problem for later.
- Will solve for off-diagonals. This seems to do nothing (probably the perturbations are too small).
- Solved for ifr gains. Residual from central source removed, but no other improvement.
- Solving for ifr biases. This got rid of some crud at the phase centre, but no other improvement.
23/09/08:
- Tree building is relatively fast again: I've eliminated ns.search() which was taking a lot of time, and instead keep track of solvables explicitly. It is now fully practical to build a tree for with all 300 model sources. I have also modified Patch.py to add solvable and non-soklvable sources separately: this really speeds things up as the sum of non-solvables may be cached.
- Repeated G solution with all 300 sources. Memory footprint is wonderfully compact: only about 1.2Gb. Whole job took on the order of 30 minutes (dual-CPU laptop). Here's the residual map:
- Jan encourages me to try solving for source parameters in the image plane. This will require a wholly separate set of trees, will have to give it some more thought.
- Residual histogram is also interesting:
This shows a rather broad distribution out to 5-10 mJy, with a gaussian peak on top (FWHM ~ 1.2 uJy). I conjecture that the gaussian peak is the thermal noise. The rest is calibration artifacts/leftover flux from poorly subtracted sources.
- Here is a Q map. Mostly empty, but I see an interference pattern as well. Will try autoflagging on Q next.
Flagging on Q. See flag report: flagreport.ps.
af:=autoflag('3C147_spw1.MS') af.settimemed(column='CORR',hw=10,rowhw=10,expr='ABS - XX YY') af.setuvbin(plotchan=T,nbins=100,thr=.01,column='CORR',expr='ABS - XX YY') af.setdata() af.run(plotdev=[2,2]) NORMAL: 2 (0.00%) rows have been flagged. NORMAL: 53622 of 38626560 (0.14%) pixels have been flagged. NORMAL: UVBinner[ NORMAL: TimeMedian[]: 51495 pixel flags, 2 row flags
- No appreciable difference to the map though, so we'll forget about it for now.
- Another thing to try is to replace one of the brighter residual sources for which I tried to solve I/Q/pos with an extended source, and see if a better solution may be obtained in this way. Did this for NEWS1000, but it didn't really help (though it did cause a lot of LSM bugs to be fixed).
- Solver really gets into trouble solving for extended source parameters. In fact, I had to make the source symmetric, as it's P.A. solutions were all over the place.
- NEWS1040 is on a grating ring of NEWS1000. Conjecture: sources interfere, so a simultaneous solution is required. Did that (solving for I Q ra dec spi for both, and shape for NEWS1000). Got significantly diffeerrent solution for spi of NEWS1040 and it subtracted cleanly, but NEWS1000 still leaves some crud.
- Things to try: solve for spis of other sources that don't subtract cleanly.
25/09/08:
- Playing with the updated NEWSTAR models that Ger sent me (this time for the correct band). BAND1_BEAMED.MDL produces a rather bad solution (residuals an order of a magnitude higher, see image). Since the far-off-center sources appear to produce deep holes, I conjecture that this model actually contains INTRINSIC fluxes (Ger has confirmed this -- the BEAMED models actually have inverse beam correction applied, so that they contain intrinsic fluxes). In the meantime, will try BAND1.MDL instead.
- Indeed BAND1.MDL produces better solutions, chi-sq on the order of my best prior calibrations (~5e-8). Done solutions for G-B-G-B, no ifr gain solutions yet:
- The grating rings from NEWS1000 look rather odd. Trying to flag residuals again:
af:=autoflag('3C147_spw1.MS') af.setdata() af.setuvbin(plotchan=T,nbins=100,thr=.01,column='CORR') af.settimemed(column='CORR',hw=20,rowhw=20) af.run(plotdev=[2,2])
Saved this to flagset RESIDUALS3. See flagreport1.ps.
26/09/08:
- Sources are still not subtracted as cleanly as in Ger's maps (waiting for a map from Ger for a more quantitative comparison). My favour conjecture right now is that this is due to lack of correction for smearing.
- Tried an alternative calibration with G-Jones only (plus ifr gains). G_Jones was subtiled by 1 in time and frequency, which effectively emulated Ger's channel-by-channel selfcal approach. The resulting map was not improved (here, 50 brightest sources have been subtracted.) So, the problem with source subtraction is probably not due to calibration, lending more weight to the smearing conjecture.
The last solution revealed a serious problem with gdbm -- parmtables would simply crash after a 100 timeslots. This seems odd, as that's only 4*14*28*100 = 156800 funklets. That's more than we ever had before, but still not an unreasonable number. Debugged this problem down to gdbm's internals -- somehow fdh->header->dirsize grows uncontrollably and overflows an int. Tried relinking with qdbm (it has gdbm compatibility mode) and the problem went away. So, will recommend instalaltion of qdbm from now on (system will still build against gdbm if qdbm is not found).
Inserted an optimization into FastParmTable: 0-deg polcs are now stored as just a single double. This ought to make parmtables even faster. Seat-of-the-pants estimate says this is true -- when running solutions for ifr gains, the system would load about 4*14*28*120 = 188K funklets (the G solutions) for each tile within a second or two.
- Things to try later: pointing errors.
- For now, I intend to clean up Calico a bit. Ifr errors should be inserted as a separate optional module (a-la Jones modules); smearing corrections can follow the same contract. Otherwise trees built for source subtraction only will not work quite as well.
30/09/08:
- Tony pointed out the strange residual pattern in the last image. I think it has something to do with the way kvis handles zoom. Earlier images were made at 72'x72' 512x512 pixels, but from 25/09/08 I switched to 120'x120' 1024x1024 pixels. Here's the last image at full resolution, I see no such pattern in this (click on image for full-res version.)
- Fixed some bugs in Calico so that subtract-only mode now works right. Here's the best residual image so far, with all 298 sources subtracted (this has G and B and ifr gain solutions as previously, NOT per-channel selfcal.)
- Implemented a simple time/bandwidth smearing correction from first principles (so far applies to K_Jones only, so no decoherence.) Repeated GBG-ifr gains-BG solutions. First residual image for 20 sources -- note how much better the residuals already are (flux scale is the same as that of previous map, but only 20 sources have been subtracted.)
- This first implementation of smearing is somewhat expensive in terms of node counts: on the order of 2*Nsrc*Nifr extra nodes are needed. I was unable to smear all 298 sources on my laptop (2Gb needed for meqserver). Need to think about reducing this. In the meantime, I added the option of only smearing N brightest sources (really need to be more sophisticated here, and select on basis of apparent flux + distance from phase centre, but this will do for a start). This allowed me to subtract all 298 sources, while smearing the 100 brightest ones (click on image for full resolution): Map is dominated by (as far as I can see):
- grating rings from NEWS1000, NEWS1009. These are either poiting errors, or the sources are slightly extended.
- negative residuals from NEWS1003 (just right of phase centre). This is a strongly polarized source (according to Ger), so maybe UV flux is leaking in? Can't solve for that without decent XY/YX signal. I can try solving for Q though...
- crud from 3C147: radial spokes, and some faint grating rings. The former could probably be eliminated by smoothing the bandwidth/ifr gain solutions in time, but this requires proper parm management tools to be written first.
01/10/08:
- Ger thinks poining errors are the cause of off-centre residuals in his maps. Can possibly try solving for these, but I need a good beam model first. For starters, let's put in a cos6 beam model, and use his BAND1_BEAMED.MDL sky model (where source fluxes are intrinsic), to see if my beam model produces the same apparent fluxes.
Ger quoted a beam model of cos^3(64*freq_GHz*r). This produced more residuals in off-center sources than with BAND1.MDL. According to http://www.astron.nl/wsrt/wsrtGuide/node6.html, the scaling constant should be 68, will try that instead.
- C=68 also produces off-center crud. So looks like NEWSTAR's beam model is slightly different. Will try solving for C next.
- Solving for C yields values of 64.7~65.0 (used tilesize of 120), but the residuals still look crappy.
02/10/08:
- A day of two epiphanies. First of all, why solve for pointing when I can just solve for differential beam gain in the direction of troublesome sources? A pointing solution is only as good as the beam model, and a cos^3 model doesn't include irregularities in the beam caused by feed struts, etc. So if a troublesome source hits one of these, any pointing solution is useless. Differential beam gains introduce more parameters of course, but if we keep them fixed on long enough timescales, we have plenty of data for a good solution.
- With that in mind, I solved for differential gains for NEWS7, NEWS1000 and NEWS1009, with a tile size of 60. Initial run had the solver give up after 1 iteration. I had to increase the convergence criterion to 1e-6 to get it to do some iterations. The change to chi-sq is very tiny (~1e-9), yet the solution is an undeniable improvement (see images below). I presume that the "bulk" of chi-sq is due to the subtraction of 3C147 itself; any improvements to "Cat II" sources show up as only tiny changes to chi-sq. A way around this is to subtract 3C147 only and write this to MODEL_DATA, then work with MODEL_DATA from that point on.
- Second epiphany: the gains seem to jump wildly in the last channel (58). And of course, Ger mentioned that he uses channels up to 56 in his reductions. I removed 58 from my images and solutions, and this immediately improved the residual histograms: they are now dominated by thermal noise (see below).
- Solving for E (differential gain) for NEWS7, NEWS1000 and NEWS1009 causes them to subtract much more cleanly. Two images for comparison: the first has 20 sources subtracted without an E solution, the second has the same, but with an E solution. Note that the histograms in these images are complete (unlike previous histograms, which were zoomed in). The "gentle slope" in the histogram extending out to 10 mJy is gone. All this just because I removed channel 58?? Also note that NEWS7 is more or less at the location of NEWS1003 (this is not obvious from the annotations).
- Now here's a mystery: the difference between the above two maps looks like this: The contributions from the three sources are obvious, but what are these extra sources at the top of the map? Is this some kind of aliasing (with NEWS1000?) And why are other sources not aliased?
- Finally, a residual map with all 298 sources subtracted, and differential beam gains applied to the three troublesome sources: My conclusions from this:
- The histogram is now pretty much just gaussian noise, which is great.
- NEWS1000, NEWS7 (NEWS1003) still leave some grating rings. Perhaps they are slightly extended?
- 3C147 itself also leaves some grating rings.
Next step: use current solutions to subtract 3C147 and sources 21-298, write this to MODEL_DATA. Then, use this to solve for the remaining troublesome sources to see if I can get rid of them completely.
04/10/08:
- Subtracted sources 0-5 (i.e. 3C147 components, "Cat I") and 25-298 ("Cat III" -- too faint to leave any residuals, even with imperfect calibration) from the DATA column, wrote residuals to MODEL_DATA. This leaves sources 6-25 in the MODEL_DATA column ("Cat II" sources.) Trying to calibrate on this showed a number of pitfalls -- lots of options need to be clicked to get things right:
- When generating MODEL_DATA, need to click "subtract model" on, but "correct residuals" off. The residuals need to stay corrupted (since we subsequently apply a full calibration to them)!
- Since we invert phases in input data, the residuals in MODEL_DATA do NOT need to have phases inverted anymore.
- Hanning tapering needs to be switched off when reading from MODEL_DATA (since we already tapered when reading DATA).
- Once all that was done, used MODEL_DATA to solve for E gains in the direction of NEWS7+NEWS1003, NEWS1009, NEWS1000 (freq tiling 1, tile size 60). As I suspected, the solver iterated a bit more than when I was solving for them on the full DATA column.
- The residual image is considerably improved:
- What's still visible:
- crud around 3C147. Need to try smoothing out the bandpass solutions, perhaps this will make it go away.
- NEWS7 (polarized source just left of phase center). Tried to make it extended and solve for parameters, but no go. I suppose the structure of this source is too complicated for our model.
- NEWS1037 (way to left of center). I could try assigning an separate E term to this.
- a few faint traces of other sources.
- Opinions solicited: is there anything else useful that I could squeeze out of this band?
- Ger has suggested that pointing is, to the first order, a freq-independent effect. To check this, I repeated the previous solution (for E in four directions) with no frequency tiling, and a time tile of 60 (i.e. one constant gain per direction, per each 60 timeslots, no freq dependence.) The resulting map looks pretty much the same, which is good, since we've gotten there with 1/27th of the solvable parameters...
- Ger has sent me a residual map produced by NEWSTAR. Here's a side-by-side comparison of his map (first image), and my latest residuals (second image.) I have zoomed in on a central are where the troublesome off-axis sources are. NB: the intensity scale is the same in both images (+- .1 mJy), contrary to what the second histogram indicates. This is a "feature" of kvis.
For reference, here are the FITS files: B1-NEWSTAR.FITS B1-Calico.FITS (these are rather large, 4096x4096, 4x4 degrees.)
- Note that NEWSTAR produces a more compact noise distribution. This is due to more unknowns in the NEWSTAR model:
- NEWSTAR selfcal is one gain per channel, so effectively 14 gains per 80 baselines (Ger used a slightly different set of baselines), for a selfcal bias of 1+(14/80) = 1.2. This is the factor that noise needs to be multiplied by.
- My calibration uses 78 baselines, with:
- 13/27 receiver gains (one G gain, for each 1 timeslot and 27 channels)
- 13*2/120 bandpasses (two B bandpass coefficients, for each 120 timeslots and 1 channel)
- 78/120 interferometer gains (one interferometer gain, for each 120 timeslots)
- 4*13/60 directional gains (four sources per antenna, for each 60 timeslots)
- Here's another solution, this time another directional gain has been added to NEWS1037, which was clearly visible in the previous map (far right and above phase center.)
The upshot here is that it is relatively trivial to remove pointing errors even from relatively faint off-center sources (note that this is different from the 3C343 case, where the off-center source was only ~4 times fainter. The sources removed here are ~1000 times fainter than 3C147), at the cost of a few more unknowns in the model. Very few unknowns, in fact -- only one extra complex gain per antenna per source, for all 27 channels and 60 timeslots...
Gone on to calibrate band 0 and try some other approaches. This will be documented on a separate page: OlegSmirnov/CalibrationEfforts/3C147_Band0













