The general logic in this function is:
- Build a stack that contains all of the complete subtrees contained
within the iterator.
- Take nodes off the top of the stack, and combine them together with
each other and empty nodes until the resulting right-hand subtree is
larger than all remaining subtrees on the stack.
- Make a single pass over the stack from bottom to top, accumulating
subtrees and adding empty nodes on the left until a single tree is
obtained.
The bug was that the second step was not leaving the right-hand subtree
sufficiently large. The final single-pass step expects every subtree in
the stack to be no smaller than the accumulated tree up to that point,
and in particular the right-hand subtree must be the direct right child
of the final tree. Some specific sequences of leaves were leaving the
right-hand subtree in a state where it was the same size as the largest
individual subtree of the rest of the stack, but smaller than their
accumulated tree.
The solution is to pre-compute what the address of the final tree will
be, and then add empty nodes on the right to the right-hand subtree
until it is the direct right child.
Co-authored-by: Kris Nuttycombe <kris@nutty.land>
The logic in `LocatedPrunableTree::from_iter` is violating an invariant
of the `unite` helper function. The new assertion prevents an invalid
tree state from being returned.
The previous logic had the non-obvious behaviour that if the inserted
subtree was a leaf, then the `.iter().all()` was not rejecting subtrees
with non-matching values, because `Node::annotation` returns `None` for
leaves. This didn't cause a bug, because this behaviour is actually
intentionally triggered by the previous `is_complete` branch of the
conditional (as leaves are by definition complete subtrees).
The simplified logic is more obvious, because `PrunableTree::node_value`
returns the value of leaves and the annotation of nodes (which are the
same type for `PrunableTree`). It does change how leaves are treated by
this branch of the conditional, so a test has been added to ensure that
the intended behaviour is preserved if the `is_complete` branch is
refactored in future.
Co-authored-by: Kris Nuttycombe <kris@nutty.land>
If we have a retained checkpoint at the same position and with the same
flags as a pruned checkpoint, we need to preserve the flags on the node.
The flags are cleared when the last of the checkpoints at the position
is removed.
Co-authored-by: Kris Nuttycombe <kris@nutty.land>
The intent of SHARD_HEIGHT is that each subtree should contain
2^SHARD_HEIGHT nodes. The `ShardTree::subtree_level` function was
incorrectly computing the level of the root address, and consequently
the resulting subtrees were too small.
This introduces a `ShardTreeError` type that is used to wrap errors
that arise from the undelying `ShardStore` when used in `ShardTree`
methods. The previous approach that relied on `From<QueryError> +
From<InsertionError>` bounds resulted in a situation where the shard
store error was forced to encompass the bounds of these types, even
though the methods on the `ShardStore` itself could never generate them.
This resulted in a number of awkward problems for error conversion.
The cap is a truncated tree that can be used to cache precomputed
subtree roots at tree heights greater than `SHARD_HEIGHT`. Only roots
whose hashes cannot be altered by rewinds or by the addition of data to
the tree may be persisted to the cap.
The choice to use an initial checkpoint as an anchor for removals in
`unmark` operations produced an awkward situation where it was necessary
to choose a checkpoint identifier just to construct an empty tree. This
removes that decision in favor of eagerly removing data if there is no
checkpoint to tie the removal to.
When implementing `ShardStore` atop a persistent data store, we need to
be able to reflect failures of the storage layer that are independent of
the presence or absence of data.
The primary implementations of `ShardStore` will be constructing
owned values from an underlying persistence layer, and it is not
possible to return a reference to an owned value.