In 7e48886fd3, `Node::Pruned` was
introduced in order to fix problems with `LocatedTree::max_position`
resulting from reinsertion of subtree information as children of
a previously pruned subtree represented by a `Node::Leaf` at its
root. This fix introduced a large amount of complexity that is
better resolved by fixing the `max_position` function in a different
way and generally minimizing its usage.
In circumstances where insertion into a subtree results in pruning, and
then a subsequent insertion within the same range contains leaves that
must be retained, it is necessary to be able to distinguish the maximum
position among notes that have been observed but later pruned. This
fixes a bug wherein an insertion into an already-pruned tree could cause
the maximum position reported for the subtree to regress.
This new retention type is intended to be used when inserting frontiers
that should not automatically be pruned.
Also, improve documentation for the `Retention` type.
There are no `ShardTree` APIs that allow for checkpointing at arbitrary
positions, which makes sense as the relevant leaf may have been pruned.
For this API to be a viable substitute for `ShardTree::batch_insert`, it
therefore needs to support marking any checkpoints associated with the
inserted tree.
`ShardTree::max_leaf_position` relies on the invariant that the last
shard in the subtrees vector is never created without a leaf then being
added to it. `LocatedTree::decompose_to_level` can return a trailing
empty subtree for some inputs, and given that it is always correct to
not insert an empty subtree into `self`, we maintain the invariant by
skipping empty subtrees.
`LocatedTree::decompose_to_level` will return the tree as-is if it is
smaller than a shard, so we can't assume that the address of `subtree`
is a valid shard address.
The general logic in this function is:
- Build a stack that contains all of the complete subtrees contained
within the iterator.
- Take nodes off the top of the stack, and combine them together with
each other and empty nodes until the resulting right-hand subtree is
larger than all remaining subtrees on the stack.
- Make a single pass over the stack from bottom to top, accumulating
subtrees and adding empty nodes on the left until a single tree is
obtained.
The bug was that the second step was not leaving the right-hand subtree
sufficiently large. The final single-pass step expects every subtree in
the stack to be no smaller than the accumulated tree up to that point,
and in particular the right-hand subtree must be the direct right child
of the final tree. Some specific sequences of leaves were leaving the
right-hand subtree in a state where it was the same size as the largest
individual subtree of the rest of the stack, but smaller than their
accumulated tree.
The solution is to pre-compute what the address of the final tree will
be, and then add empty nodes on the right to the right-hand subtree
until it is the direct right child.
Co-authored-by: Kris Nuttycombe <kris@nutty.land>
The logic in `LocatedPrunableTree::from_iter` is violating an invariant
of the `unite` helper function. The new assertion prevents an invalid
tree state from being returned.
The previous logic had the non-obvious behaviour that if the inserted
subtree was a leaf, then the `.iter().all()` was not rejecting subtrees
with non-matching values, because `Node::annotation` returns `None` for
leaves. This didn't cause a bug, because this behaviour is actually
intentionally triggered by the previous `is_complete` branch of the
conditional (as leaves are by definition complete subtrees).
The simplified logic is more obvious, because `PrunableTree::node_value`
returns the value of leaves and the annotation of nodes (which are the
same type for `PrunableTree`). It does change how leaves are treated by
this branch of the conditional, so a test has been added to ensure that
the intended behaviour is preserved if the `is_complete` branch is
refactored in future.
Co-authored-by: Kris Nuttycombe <kris@nutty.land>