Once memory use goes above ~2.5 GB on a 32-bit machine, nodes start failing with bad allocations and all sorts of general hell breaks loose.
How many timeslots/channels are you using? With 10 unknowns and large domains (on the order of 1000 timeslots by 50 channels, etc.), memory use can be a huge problem. In the meantime, the only way to conserve memory is to reduce the domain size:
- Try playing with tile_size. If you solve for, say, 50 or 100 timeslots at a time, it ought to work fine. (It would also be interesting to see how stable the flux solutions are from one tile to the next.)
- Use fewer channels.
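To get a rough feel for why a smaller tile helps, here is a back-of-envelope estimate in Python. Everything in it is an assumption made for illustration, not a description of what the kernel actually allocates: the function name, 14 stations (91 baselines, as for WSRT), 4 correlations per baseline, 16 bytes per complex value, and one extra copy of the result per unknown.

```python
def cache_bytes(n_baselines, n_timeslots, n_channels, n_unknowns,
                n_corr=4, bytes_per_value=16):
    """Rough size in bytes of ONE per-baseline cache level (illustrative only)."""
    cells = n_timeslots * n_channels      # points in the time/frequency domain
    copies = 1 + n_unknowns               # assumed: main value + one copy per unknown
    return n_baselines * n_corr * cells * copies * bytes_per_value


N_BASELINES = 14 * 13 // 2                # 91 baselines for 14 stations

full = cache_bytes(N_BASELINES, 1000, 50, 10)   # whole 1000-timeslot domain
tiled = cache_bytes(N_BASELINES, 100, 50, 10)   # one 100-timeslot tile

print(f"full domain: {full / 1e9:.2f} GB")
print(f"tile of 100: {tiled / 1e9:.2f} GB")
```

Under these assumptions a single cache level for the full domain comes to about 3.2 GB, already past the ~2.5 GB mark, while a 100-timeslot tile needs about 0.32 GB. The real numbers depend on how many levels of the tree end up holding a cache.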
Also, I suggest you read the SmartCaching page in the wiki; it will help you understand exactly where your memory is going. Basically, per-baseline caching is the killer. When you solve for fluxes only, smart caching ends up per-baseline unless told otherwise (see the wiki for how to tell it otherwise). However, as soon as you have at least one station-dependent parameter in your solvables, this is no longer an issue, since the caches move up the tree to the station-dependent branches.
(NB: this is no longer true. At the moment the solver effectively has an extra internal per-baseline cache. This is an atavism. As soon as Michiel eliminates it, we'll all breathe a lot easier.)
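To put a number on why per-baseline caches cost so much more than per-station ones, it is enough to count how many of each there are. The sketch below is plain arithmetic, not anything from the system itself; the function name is made up, 14 stations corresponds to WSRT, and the other station counts are arbitrary.

```python
def n_baselines(n_stations):
    """Number of distinct station pairs (cross-correlations only)."""
    return n_stations * (n_stations - 1) // 2


for n in (14, 28, 56):
    nb = n_baselines(n)
    # ratio = how many times more caches a per-baseline scheme holds
    print(f"{n} stations: {nb} baselines, {nb / n:.1f}x more caches per-baseline")
```

The ratio is (N-1)/2, so for 14 stations a cache kept per baseline takes 6.5 times the memory of the same cache kept per station, and the gap grows with array size.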
My contention is that we should be able to handle "WSRT-sized" trees on one machine. Basically, if memory is a problem while CPU time isn't, then your application isn't using resources in a balanced way. In the case of trees, you can trade off memory against CPU time by manipulating caching parameters.
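The memory-versus-CPU trade-off can be illustrated with a toy tree in Python. This is only a sketch of the general idea: the Node class, its cache flag and the node names are hypothetical and have nothing to do with the real node classes or caching parameters.

```python
class Node:
    """Toy tree node with an on/off result cache (hypothetical, for illustration)."""

    def __init__(self, name, children=(), cache=True):
        self.name = name
        self.children = list(children)
        self.cache_enabled = cache
        self._cached = None
        self.evaluations = 0      # how many times the result was actually computed

    def evaluate(self):
        if self.cache_enabled and self._cached is not None:
            return self._cached   # costs memory, saves CPU
        self.evaluations += 1
        value = 1 + sum(child.evaluate() for child in self.children)
        if self.cache_enabled:
            self._cached = value
        return value


def run(cache_shared):
    """Evaluate a tree where 91 'baseline' nodes share one child node."""
    shared = Node("station_term", cache=cache_shared)
    parents = [Node(f"baseline_{i}", [shared], cache=False) for i in range(91)]
    Node("root", parents, cache=False).evaluate()
    return shared.evaluations


print("with cache:   ", run(True), "evaluation(s) of the shared node")
print("without cache:", run(False), "evaluation(s) of the shared node")
```

With the cache on, the shared node is computed once and its result is held in memory; with it off, nothing is stored but the node is recomputed for each of the 91 parents.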
