* Add db_window module for windowing functions from RocksDb
* Replace window with db_window functions in window_service
* Fix tests
* Make note of change in db_window
* Create RocksDb ledger in bin/fullnode
* Make db_ledger functions generic
* Add db_ledger to bin/replicator
* Cluster Replicated Data Store
Separate the data storage and merge strategy from the network IO boundary.
Implement an eager push overlay for transporting recent messages.
Simulation shows fast convergence with 20k nodes.
That's supposed to be an ASCII format, but we're not making use
of it. We can switch back to that some day, but if we do, it shouldn't
be JSON-encoded.
* Stub out architecture documentation
* Add book HTML generation and book tests to CI
* Add heading
* Better table of contents
* Reference existing documentation
Move ASCII art from code comments into rendered SVG
* Attempt to fix CI
* Add lamport docs
And truncate lines to 80 characters
* Fix links
And reference shorter, newer description of PoH.
* Replace ASCII art with SVG
* Streamline for Pillbox
* Update path before optional install
* Use $CARGO_HOME instead of $HOME
* Delete code
Attempt to describe all data structures without code.
* Boot RPU from docs, add JsonRpcService
Also, use Rust naming conventions in the block diagrams to
minimize the jump from docs to code.
* Latest code uses tick_height
* Rename bob/ folder to art/
A home for any ASCII art
* Import JSON RPC API
* More mdbook docs
* Add Ncp
* Cleanup links
* Move pipelining description into fullnode description
* Move high-level transaction docs into top-level doc
* Delete unused files
add linked-list capability to accounts
change accounts from a linked list to a VecDeque
add checkpoint and rollback for lastids
add subscriber notifications for rollbacks
checkpoint transaction count, too
* Add first leader to genesis entries, consume in genesis.sh
* Set bootstrap leader in the bank on startup, remove instantiation of bootstrap leader from bin/fullnode
* Remove need to initialize bootstrap leader in leader_scheduler, now can be read from genesis entries
* Add separate interface new_with_leader() in mint for creating genesis leader entries
* Add Vote Contract
* Move ownership of LeaderScheduler from Fullnode to the bank
* Modified ReplicateStage to consume leader information from bank
* Restart RPC Services in Leader To Validator Transition
* Make VoteContract Context Free
* Remove voting from ClusterInfo and Tpu
* Remove dependency on ActiveValidators in LeaderScheduler
* Switch VoteContract to have two steps 1) Register 2) Vote. Change thin client to create + register a voting account on fullnode startup
* Remove check in leader_to_validator transition for unique references to bank, b/c jsonrpc service and rpcpubsub hold references through jsonhttpserver
* Move ledger write to its own stage
- Also, rename write_stage to leader_vote_stage, as write functionality
is moved to a different stage
* Address review comments
* Fix leader rotation test failure
* address review comments
* Add PoH height to process_ledger()
* Moved broadcast_stage Leader Scheduling logic to use Poh height instead of entry_height
* Moved LeaderScheduler logic to PoH in ReplicateStage
* Fix Leader scheduling tests to use PoH instead of entry height
* Change is_leader detection in repair() to use PoH instead of entry height
* Add tests to LeaderScheduler for new functionality
* fix Entry::new and genesis block PoH counts
* Moved LeaderScheduler to PoH ticks
* Cleanup to resolve PR comments
* Added LeaderScheduler module and tests
* plumbing for LeaderScheduler in Fullnode + tests. Add vote processing for active set to ReplicateStage and WriteStage
* Add LeaderScheduler plumbing for Tvu, window, and tests
* Fix bank and switch tests to use new LeaderScheduler
* move leader rotation check from window service to replicate stage
* Add replicate_stage leader rotation exit test
* removed leader scheduler from the window service and associated modules/tests
* Corrected is_leader calculation in repair() function in window.rs
* Integrate LeaderScheduler with write_stage for leader to validator transitions
* Integrated LeaderScheduler with BroadcastStage
* Removed gossip leader rotation from crdt
* Add multi validator, leader test
* Comments and cleanup
* Remove unneeded checks from broadcast stage
* Fix case where a validator/leader need to immediately transition on startup after reading ledger and seeing they are not in the correct role
* Set new leader in validator -> validator transitions
* Clean up for PR comments, refactor LeaderScheduler from process_entry/process_ledger_tail
* Cleaned out LeaderScheduler options, implemented LeaderScheduler strategy that only picks the bootstrap leader to support existing tests, drone/airdrops
* Ignore test_full_leader_validator_network test due to bug where the next leader in line fails to get the last entry before rotation (b/c it hasn't started up yet). Added a test test_dropped_handoff_recovery go track this bug
* Add check in window_service to exit in checks for leader rotation, and propagate that service exit up to fullnode
* Added logic to shutdown Tvu once ReplicateStage finishes
* Added test for successfully shutting down validator and starting up leader
* Add test for leader validator interaction
* fix streamer to check for exit signal before checking socket again to prevent busy leaders from never returning
* PR comments - Rewrite make_consecutive_blobs() function, revert genesis function change
* Move recycler instances to the point of allocation
* sinks no longer need to call `recycle`
* Remove the recycler arguments from all the apis that no longer need them
- The write stage will output vector of entries
- Broadcast stage will create blobs out of the entries
- Helps reduce MIPS requirements for write stage
* rename NodeInfo field of Node from "data" to "info"
(touches a lot of files)
* update client to use gossip to find leader, a la drone
* rework multinode scripts
* move more stuff into rust
* added usage to all
* no more rsync unless you're a validator (TODO: whack that, too)
* fullnode doesn't bail if drone isn't up yet, just keeps trying
* drone doesn't bail if network isn't up yet, just keeps trying
* move gossip/NCP off assuming anything about its address
* use a single socket to send and receive gossip
* remove --addr/-a from CLIs
* rearrange networking utility code
* use Arc<UdpSocket> to share the Sync-safe UdpSocket among threads
* rename TestNode to Node
TODO:
* re-enable 127.0.0.1 as a valid address in crdt
* change repair request/response to a similar, single socket
* pick cloned sockets or Arc<UdpSocket> for all these (rpu uses tryclone())
* update contact_info with network truthiness instead of what the node
says?
Don't recover() for copy(), as copy() is already tolerant of things
recover() guards against. Note: recover() is problematic if the ledger is
"live", i.e. is currently being written to.
* add recover_ledger() to deal with expected common ledger corruptions
* add verify_ledger() for future use cases (ledger-tool)
* increase ledger testing
* allow replicate stage to run without a ledger
* ledger-tool to output valid json
* fixup!
* fixups!
* send the vote and count it
* actually vote
* test
* Spelling fixes
* Process the voting transaction in the leader's bank
* Send tokens to the leader
* Give leader tokens in more cases
* Test for write_stage::leader_vote
* Request airdrop inside fullnode and not the script
* Change readme to indicate that drone should be up before leader
And start drone before leader in snap scripts
* Rename _kp => _keypair for keypairs and other review fixups
* Remove empty else
* tweak test_leader_vote numbers to be closer to testing 2/3 boundary
* combine creating blob and transaction for leader/validator
Clippy told us to change function parameters to references, but
wasn't able to then tell us that the clone() before borrowing
was superfluous. This patch removes those by hand.
No expectation of a performance improvement here, since we were
just cloning reference counts. Just removes a bunch of noise.
* log responder error to warn
* log responder error to warn
* fixup!
* fixed assert
* fixed bad ports issue
* comments
* test for dummy address in Crdt::new instaad of NodeInfo::new
* return error if ContactInfo supplied to Crdt::new cannot be used to connect to network
* comments
wip
voting
wip
move voting into the replicate stage
update
fixup!
fixup!
fixup!
fixup!
fixup!
fixup!
fixup!
fixup!
fixup!
fixup!
update
fixup!
fixup!
fixup!
tpu processing votes in entries before write stage
fixup!
fixup!
txs
make sure validators have an account
fixup!
fixup!
fixup!
exit fullnode correctly
exit on exit not err
try 50
add delay for voting
300
300
startup logs
par start
100
no rayon
retry longer
log leader drop
fix distance
50 nodes
100
handle deserialize error
update
fix broadcast
new table every time
tweaks
table
update
try shuffle table
skip kill
skip add
purge test
fixed tests
rebase 2
fixed tests
fixed rebase
cleanup
ok for blobs to be longer then window
fix init window
60 nodes