CountedRef itself is not thread-safe. That is, if two threads access the same ref in certain ways, all sorts of nastiness may result. For example:
- Thread 1 goes to release the ref and delete the object
- Meanwhile, thread 2 tries to make a copy() of the ref, or simply accesses the target.
Dereferencing the same ref from several threads is thread-safe, though, as long as that is the only operation being performed on it.
Making CountedRef itself fully thread-safe would involve:
- Locking a mutex on the ref for EVERY SINGLE ACCESS, including dereferencing.
- Disallowing dereferencing to a pointer, instead always returning a {lock,pointer} pair. This could be implemented to behave exactly like a pointer would, with the lock being released automatically upon destruction.
On the other hand, ref targets ARE thread-safe. If each thread has its own ref to the same target, everything is safe including dereferencing (since deref-for-write would invoke COW). The same goes for hooks and container accesses -- as long as no threads use the same ref to the BOTTOMMOST container, COW ensures that the whole structure can never change under us.
Therefore, as long as threads take care not to access the same ref, everything is safe. OCTOPUSSY is fine, since messages are delivered by placing a new ref in each WP's queue, so a threaded WP always works with its own ref.
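For reference, the {lock,pointer} pair described above could be sketched roughly as follows. This is only an illustration in standard C++: LockedRef, Access and deref() are made-up names, not part of CountedRef, and the target is stored by value to keep the sketch short.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for a ref whose every access is serialized.
template <typename T>
class LockedRef {
public:
    // The {lock,pointer} pair: it holds the mutex for as long as it lives
    // and otherwise behaves like a pointer.
    class Access {
    public:
        Access(std::mutex& m, T* p) : lock_(m), ptr_(p) { assert(p != nullptr); }
        T* operator->() const { return ptr_; }
        T& operator*() const { return *ptr_; }
    private:
        std::unique_lock<std::mutex> lock_;  // released by the destructor
        T* ptr_;
    };

    explicit LockedRef(T value) : value_(std::move(value)) {}

    // Dereferencing never hands out a bare pointer.
    Access deref() { return Access(mutex_, &value_); }

private:
    std::mutex mutex_;
    T value_;
};

// Usage: two threads increment through the SAME LockedRef.
int locked_ref_demo() {
    LockedRef<int> ref(0);
    auto worker = [&ref] {
        for (int i = 0; i < 1000; ++i) {
            ++*ref.deref();  // lock is held only for this expression
        }
    };
    std::thread a(worker), b(worker);
    a.join();
    b.join();
    return *ref.deref();
}
```

The price is a mutex lock on every single dereference, which is exactly what the one-ref-per-thread rule avoids.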
For multithreaded MeqTrees, the following considerations apply:
- The only data shared by nodes are Result::Refs. Since a parent gets its own ref to a child result, this is completely safe.
- During MT polling, worker threads deposit child results in the parent's child_results_ vector. Since no two workers ever deal with the same child, this is safe.
- Upon entering execute(), workers try to obtain an execute lock. So no two workers are ever executing the same node. This is safe.
- The markStateDependency() method and its ilk work in reverse, from child to parent. However, all they do is (1) raise a flag and (2) dereference the parent refs. This is safe since parent refs are fixed during execution. But, CHECK THIS.
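The polling and execute-lock points above can be sketched as follows. This is a hypothetical simplification, not the real Node class: PollingNode is a made-up name, results are plain ints rather than Result::Refs, one std::thread is spawned per child instead of using a worker pool, and the execute lock is taken with a blocking lock.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical, heavily simplified node.
struct PollingNode {
    std::mutex exec_mutex;                // the "execute lock"
    std::vector<PollingNode*> children;   // fixed during execution
    std::vector<int> child_results_;      // one slot per child
    int value = 0;

    int execute() {
        // Only one worker at a time can be inside execute() for this node.
        std::lock_guard<std::mutex> lock(exec_mutex);

        // Size the vector BEFORE any worker starts, so it is never resized
        // while workers are writing into it.
        child_results_.assign(children.size(), 0);
        assert(child_results_.size() == children.size());

        std::vector<std::thread> workers;
        for (std::size_t i = 0; i < children.size(); ++i) {
            // Worker i writes only slot i, so no two workers share an element.
            workers.emplace_back([this, i] {
                child_results_[i] = children[i]->execute();
            });
        }
        for (auto& w : workers) w.join();

        int sum = value;
        for (int r : child_results_) sum += r;
        return sum;
    }
};

// Usage: a parent with two leaf children.
int poll_demo() {
    PollingNode a, b, parent;
    a.value = 1;
    b.value = 2;
    parent.value = 10;
    parent.children = {&a, &b};
    return parent.execute();
}
```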
However, if we allow a separate control thread with access to node state, etc., the following considerations apply:
- It is NOT SAFE to access node state while a node is executing. On the other hand, execute() may take a while depending on the children. Solution: have a separate state_mutex in the Node, lock this mutex in execute(), unlock during child polling, relock when child poll is finished.
- Debugging ops will of course need to honor the state lock.
- Any access to the forest object involving its refs needs to be protected by a forest mutex.
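The state_mutex scheme from the first point could look roughly like this. Everything here is an assumption made for illustration: StatefulNode, getState() and the poll_children callback are made-up names, and the node state is reduced to a single string.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>

// Hypothetical node with a separate mutex guarding its state.
struct StatefulNode {
    std::mutex state_mutex;
    std::string state = "idle";
    std::function<void()> poll_children;  // placeholder for the slow child poll

    void execute() {
        std::unique_lock<std::mutex> lock(state_mutex);
        state = "polling";

        lock.unlock();                     // state is accessible during the poll
        if (poll_children) poll_children();
        lock.lock();                       // relock once the poll is finished

        assert(lock.owns_lock());
        state = "done";
    }                                      // lock released here

    // What a control thread or a debugging op would call.
    std::string getState() {
        std::lock_guard<std::mutex> lock(state_mutex);
        return state;
    }
};

// Usage: the callback stands in for a control thread that inspects the
// state while the children are being polled. It can take state_mutex only
// because execute() released it for the duration of the poll.
std::string state_demo() {
    StatefulNode node;
    std::string seen;
    node.poll_children = [&] { seen = node.getState(); };
    node.execute();
    return seen + "/" + node.getState();
}
```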
All other MeqServer ops (debugging, etc.) need to be reviewed with regard to state safety given a separate control thread.
If I implement an EventQueue for I/O sinks, this needs to make ref copies just like OCTOPUSSY does.
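A possible shape for such an EventQueue is sketched below. This is a guess at the design, not existing code: std::shared_ptr<const T> stands in for a counted ref, and post() and poll() are made-up method names. The point is only that the queue stores its own copy of the ref, so producer and consumer never touch the same ref object.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical event queue; shared_ptr<const T> plays the role of a ref.
template <typename T>
class EventQueue {
public:
    using Ref = std::shared_ptr<const T>;

    // The ref is taken by value, so the queue ends up with its own copy
    // and the caller keeps the original.
    void post(Ref ref) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push_back(std::move(ref));
    }

    // Hands the queued ref over to the consumer; empty ref if nothing is queued.
    Ref poll() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return Ref();
        Ref ref = std::move(queue_.front());
        queue_.pop_front();
        return ref;
    }

private:
    std::mutex mutex_;
    std::deque<Ref> queue_;
};

// Usage: post one event and read it back through a separate ref.
long event_queue_demo() {
    auto producer_ref = std::make_shared<const int>(7);
    EventQueue<int> queue;
    queue.post(producer_ref);                    // queue holds a second ref
    long refs_while_queued = producer_ref.use_count();
    auto consumer_ref = queue.poll();
    assert(consumer_ref);                        // something was queued
    return refs_while_queued * 10 + *consumer_ref;
}
```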
