Commit Graph

163 Commits

Author SHA1 Message Date
carllin 0bd1412562
Switch leader scheduler to use PoH ticks instead of Entry height (#1519)
* Add PoH height to process_ledger()

* Moved broadcast_stage Leader Scheduling logic to use Poh height instead of entry_height

* Moved LeaderScheduler logic to PoH in ReplicateStage

* Fix Leader scheduling tests to use PoH instead of entry height

* Change is_leader detection in repair() to use PoH instead of entry height

* Add tests to LeaderScheduler for new functionality

* fix Entry::new and genesis block PoH counts

* Moved LeaderScheduler to PoH ticks

* Cleanup to resolve PR comments
2018-10-18 22:57:48 -07:00
carllin 47f69f2d24
1) Switch broken tests to generate an empty tick in their ledgers to use as last_id, 2) Fix bug where PoH generator in BankingStage did not reference the last tick instead of the last entry on startup, causing ledger verification to fail on the new tick added by the PoH generator (#1479) 2018-10-12 00:39:10 -07:00
carllin 9931ac9780
Leader scheduler plumbing (#1440)
* Added LeaderScheduler module and tests

* plumbing for LeaderScheduler in Fullnode + tests. Add vote processing for active set to ReplicateStage and WriteStage

* Add LeaderScheduler plumbing for Tvu, window, and tests

* Fix bank and switch tests to use new LeaderScheduler

* move leader rotation check from window service to replicate stage

* Add replicate_stage leader rotation exit test

* removed leader scheduler from the window service and associated modules/tests

* Corrected is_leader calculation in repair() function in window.rs

* Integrate LeaderScheduler with write_stage for leader to validator transitions

* Integrated LeaderScheduler with BroadcastStage

* Removed gossip leader rotation from crdt

* Add multi validator, leader test

* Comments and cleanup

* Remove unneeded checks from broadcast stage

* Fix case where a validator/leader needs to immediately transition on startup after reading ledger and seeing they are not in the correct role

* Set new leader in validator -> validator transitions

* Clean up for PR comments, refactor LeaderScheduler from process_entry/process_ledger_tail

* Cleaned out LeaderScheduler options, implemented LeaderScheduler strategy that only picks the bootstrap leader to support existing tests, drone/airdrops

* Ignore test_full_leader_validator_network test due to bug where the next leader in line fails to get the last entry before rotation (b/c it hasn't started up yet). Added a test test_dropped_handoff_recovery to track this bug
2018-10-10 16:49:41 -07:00
Greg Fitzgerald 95701114e3 Crdt -> ClusterInfo 2018-10-09 03:49:39 -06:00
Pankaj Garg e10574c64d Remove recycler and its usage
- The memory usage due to recycler was high, and increasing with
  time.
2018-09-27 10:42:37 -06:00
Greg Fitzgerald c83dcea87d Move SystemTransaction into its own module 2018-09-26 14:17:15 -06:00
Greg Fitzgerald 694add9919 Move budget-specific and system-specific tx constructors into traits
These functions pull in budget-specific and system-specific
dependencies that aren't needed by the runtime.
2018-09-26 14:17:15 -06:00
carllin e7383a7e66
Validator to leader (#1303)
* Add check in window_service to exit in checks for leader rotation, and propagate that service exit up to fullnode

* Added logic to shutdown Tvu once ReplicateStage finishes

* Added test for successfully shutting down validator and starting up leader

* Add test for leader validator interaction

* fix streamer to check for exit signal before checking socket again to prevent busy leaders from never returning

* PR comments - Rewrite make_consecutive_blobs() function, revert genesis function change
2018-09-25 15:41:29 -07:00
Tyera Eulberg 751dd7eebb Move vote into ReplicateStage after process_entries 2018-09-25 13:43:35 -06:00
Rob Walker be31da3dce
lastidnotfound step 2: (#1300)
lastidnotfound step 2:
  * move "record stage", aka poh_service into banking stage
  * remove Entry.has_more, is incompatible with leader rotation
  * rewrite entry_next_hash in terms of Poh
  * simplify and unify transaction hashing (no embedded nulls)
  * register_last_entry from banking stage, fixes #1171 (w00t!)
  * new PoH doesn't generate empty ledger entries, so some fixes necessary in
      multinode tests that rely on that (e.g. giving validators airdrops)
  * make window repair less patient, if we've been waiting for an answer,
      don't be shy about most recent blobs
  * delete recorder and record stage
  * make thin_client error reporting more verbose
  * more tracing in window (sigh)
2018-09-21 21:01:13 -07:00
Anatoly Yakovenko 431692d9d0 Use a Drop trait to keep track of lifetimes for recycled objects.
* Move recycler instances to the point of allocation
* sinks no longer need to call `recycle`
* Remove the recycler arguments from all the apis that no longer need them
2018-09-19 16:59:42 -06:00
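The Drop-based recycling described in commit 431692d9d0 can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the names `Recycler`, `Recycled`, `allocate`, and `pooled` are hypothetical, and the pool is assumed to be a mutex-guarded `Vec`.

```rust
use std::ops::{Deref, DerefMut};
use std::sync::{Arc, Mutex};

/// Hypothetical pool of reusable objects (names are illustrative only).
pub struct Recycler<T> {
    pool: Arc<Mutex<Vec<T>>>,
}

impl<T: Default> Recycler<T> {
    pub fn new() -> Self {
        Recycler { pool: Arc::new(Mutex::new(Vec::new())) }
    }

    /// Hand out a pooled object if one exists, otherwise a fresh default.
    pub fn allocate(&self) -> Recycled<T> {
        let val = self.pool.lock().unwrap().pop().unwrap_or_default();
        Recycled { val: Some(val), pool: Arc::clone(&self.pool) }
    }

    /// Number of objects currently waiting in the pool.
    pub fn pooled(&self) -> usize {
        self.pool.lock().unwrap().len()
    }
}

/// Wrapper that returns its value to the pool when it goes out of scope,
/// so consumers never have to call a `recycle` function themselves.
pub struct Recycled<T> {
    val: Option<T>,
    pool: Arc<Mutex<Vec<T>>>,
}

impl<T> Deref for Recycled<T> {
    type Target = T;
    fn deref(&self) -> &T {
        self.val.as_ref().unwrap()
    }
}

impl<T> DerefMut for Recycled<T> {
    fn deref_mut(&mut self) -> &mut T {
        self.val.as_mut().unwrap()
    }
}

impl<T> Drop for Recycled<T> {
    fn drop(&mut self) {
        // The value is returned as-is; this sketch does not reset it.
        if let Some(v) = self.val.take() {
            self.pool.lock().unwrap().push(v);
        }
    }
}
```

In this sketch the wrapper carries its own handle to the pool, which is one way the recycler argument could disappear from downstream APIs; a real implementation would likely also reset the object before reuse.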
Carl 6d27751365 give fullnode ownership of state needed to dynamically start up a tpu or tvu for role transition 2018-09-19 10:48:05 -06:00
anatoly yakovenko 6ec0e42220
budget as separate contract and system call contract (#1189)
* budget and system contracts and verification

* contract check_id methods
* system call contract
* verify contract execution rules
* move system into its own file
* allocate before transfer for budget
* store error in budget context
* budget contract and tests without bank
* moved budget out of bank
2018-09-17 13:36:31 -07:00
Pankaj Garg e142aafca9
Use multiple sockets for receiving blobs on validators (#1228)
* Use multiple sockets for receiving blobs on validators

- The blobs that are broadcasted by leader or retransmitted by peer
  validators are received on replicate_port
- Using reuse_addr/reuse_port, multiple sockets can be opened for
  the same port
- This allows the kernel to queue data to user space app on multiple
  socket queues, preventing over-running one queue
- This helps with reducing packets dropped due to queue over-runs

Fixes #1224

* Fixed failing tests
2018-09-14 16:56:06 -07:00
carllin 8706774ea7
Rewrote service trait join() method to allow thread join handles to return values other than () (#1213) 2018-09-13 14:00:17 -07:00
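As a rough illustration of what commit 8706774ea7 describes, a service trait can use an associated type so that `join()` returns whatever the underlying thread produced. The trait shape, the associated type name, and `SumService` below are assumptions made for this sketch, not a copy of the repository's definition.

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical service trait whose join() yields the thread's return value.
pub trait Service {
    type JoinReturnType;

    fn join(self) -> thread::Result<Self::JoinReturnType>;
}

/// Illustrative service that sums a list of numbers on a worker thread.
pub struct SumService {
    handle: JoinHandle<u64>,
}

impl SumService {
    pub fn new(values: Vec<u64>) -> Self {
        let handle = thread::spawn(move || values.iter().sum::<u64>());
        SumService { handle }
    }
}

impl Service for SumService {
    type JoinReturnType = u64;

    // Propagate the worker's result instead of discarding it as ().
    fn join(self) -> thread::Result<u64> {
        self.handle.join()
    }
}
```

A caller could then write `SumService::new(vec![1, 2, 3]).join()` and inspect the value the thread returned, for example to decide on a role transition.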
Rob Walker a8fdb8a5a7 use a single BlobRecycler per fullnode 2018-09-11 16:56:54 -07:00
Greg Fitzgerald fc64e1853c Initialize Window, not SharedWindow
Wrap with Arc<RwLock<_>> when/if needed, no earlier.
2018-09-10 11:40:26 -06:00
Greg Fitzgerald 8cc030ef84 Use Vec instead of VecDeque for SharedBlobs 2018-09-04 07:50:23 -10:00
Greg Fitzgerald c9a1ac9b8c Don't propagate errors we'll never handle 2018-09-04 06:01:32 -10:00
Rob Walker 176e806d94 rework of network rendezvous
* rename NodeInfo field of Node from "data" to "info"
      (touches a lot of files)

  * update client to use gossip to find leader, a la drone

  * rework multinode scripts
      * move more stuff into rust
      * added usage to all
      * no more rsync unless you're a validator (TODO: whack that, too)
  * fullnode doesn't bail if drone isn't up yet, just keeps trying
  * drone doesn't bail if network isn't up yet, just keeps trying
2018-08-31 23:21:07 +09:00
Rob Walker 63e44dcc35 continue rendezvous refactor for gossip and repair
* remove trailing whitespace in ci/audit.sh

  * code review fixups
     * rename GOSSIP_PORT_RANGE => SOLANA_PORT_RANGE
     * remove out-of-date TODO in localnet-sanity.sh

  * remove features=test and code that was using it (localhost prohibitions in
      crdt) added TODO in crdt.rs, maybe we should boot localhost in production
      networks?

  * boot tvu_window from NodeInfo: instead, send repair requests from the repair
      socket (to gossip on peer) and answer repair requests via the sockaddr
      from the repair request

  * remove various unused pub functions

  * banish SocketAddr parse().unwrap() to a macro that can also accept simpler stuff
2018-08-31 23:21:07 +09:00
Rob Walker 1af4cee63b fix #1079
* move gossip/NCP off assuming anything about its address
  * use a single socket to send and receive gossip
  * remove --addr/-a from CLIs
  * rearrange networking utility code
  * use Arc<UdpSocket> to share the Sync-safe UdpSocket among threads
  * rename TestNode to Node

TODO:

  * re-enable 127.0.0.1 as a valid address in crdt
  * change repair request/response to a similar, single socket
  * pick cloned sockets or Arc<UdpSocket> for all these (rpu uses tryclone())
  * update contact_info with network truthiness instead of what the node
      says?
2018-08-31 23:21:07 +09:00
Greg Fitzgerald 2727067b94 Move window into its own module 2018-08-13 20:17:16 -06:00
Greg Fitzgerald 6a8a494f5d Rename WindowStage to RetransmitStage
The window is used for both broadcasting from leader to validator
and retransmitting between validators.
2018-08-13 20:17:16 -06:00
Greg Fitzgerald c2bbe4344e Rename KeyPair to Keypair 2018-08-09 13:41:37 -06:00
Rob Walker fbc754ea25 plug in LedgerWindow
fixes #872
2018-08-07 17:27:53 -07:00
Rob Walker c3db2df7eb tweak random access ledger
* add recover_ledger() to deal with expected common ledger corruptions
  * add verify_ledger() for future use cases (ledger-tool)
  * increase ledger testing
  * allow replicate stage to run without a ledger
  * ledger-tool to output valid json
2018-08-06 08:51:41 -07:00
Rob Walker 715a3d50fe Revert "Revert "clippy fixup""
This reverts commit d173e6ef87.
2018-08-06 08:51:41 -07:00
Rob Walker 692b125391 Revert "Revert "fixups""
This reverts commit e2c68d8775.
2018-08-06 08:51:41 -07:00
Rob Walker 5193819d8e Revert "Revert "plug in new ledger""
This reverts commit 57e928d1d0.
2018-08-06 08:51:41 -07:00
Rob Walker 57e928d1d0 Revert "plug in new ledger"
This reverts commit 46d9ba5ca0.
2018-08-03 10:24:51 -07:00
Rob Walker e2c68d8775 Revert "fixups"
This reverts commit b72e91f681.
2018-08-03 10:24:51 -07:00
Rob Walker d173e6ef87 Revert "clippy fixup"
This reverts commit 384b486b29.
2018-08-03 10:24:51 -07:00
Rob Walker 384b486b29 clippy fixup 2018-08-02 21:50:47 -07:00
Rob Walker b72e91f681 fixups 2018-08-02 21:50:47 -07:00
Rob Walker 46d9ba5ca0 plug in new ledger 2018-08-02 21:50:47 -07:00
Tyera Eulberg 448b8b1c17 Add Hash wrapper and supporting traits 2018-08-01 17:00:51 -07:00
sakridge 2ea6f86199 Submit leader's vote after observing 2/3 validator votes (#780)
* fixup!

* fixups!

* send the vote and count it

* actually vote

* test

* Spelling fixes

* Process the voting transaction in the leader's bank

* Send tokens to the leader

* Give leader tokens in more cases

* Test for write_stage::leader_vote

* Request airdrop inside fullnode and not the script

* Change readme to indicate that drone should be up before leader

And start drone before leader in snap scripts

* Rename _kp => _keypair for keypairs and other review fixups

* Remove empty else
* tweak test_leader_vote numbers to be closer to testing 2/3 boundary
* combine creating blob and transaction for leader/validator
2018-07-31 22:07:53 -07:00
anatoly yakovenko 308b6c3371
Follow Shared prefix convention for Window alias (#798)
Follow Shared prefix convention for Window alias.
2018-07-30 16:56:01 -07:00
Michael Vines 7672506b45 Validators now vote once a second regardless 2018-07-26 17:07:42 -07:00
Greg Fitzgerald 28af9a39b4 Don't clone before borrowing
Clippy told us to change function parameters to references, but
wasn't able to then tell us that the clone() before borrowing
was superfluous. This patch removes those by hand.

No expectation of a performance improvement here, since we were
just cloning reference counts. Just removes a bunch of noise.
2018-07-18 08:04:31 -04:00
anatoly yakovenko d8c9655128
Dynamic test assert (#643)
* log responder error to warn

* log responder error to warn

* fixup!

* fixed assert

* fixed bad ports issue

* comments

* test for dummy address in Crdt::new instead of NodeInfo::new

* return error if ContactInfo supplied to Crdt::new cannot be used to connect to network

* comments
2018-07-16 19:31:52 -07:00
Greg Fitzgerald 30f0c25b65 Fix all remaining clippy warnings
Fixes #586
2018-07-12 09:40:40 -06:00
Greg Fitzgerald 73ae3c3301 Apply most of clippy's feedback 2018-07-12 09:40:40 -06:00
Stephen Akridge bed5438831 Improved streamer debug messages
distinguish between threads
2018-07-11 18:26:16 +02:00
Anatoly Yakovenko d531b9645d review comments 2018-07-10 13:32:31 -06:00
Anatoly Yakovenko be2bf69c93 initial vote stage
wip

voting

wip

move voting into the replicate stage

update

fixup!

fixup!

fixup!

fixup!

fixup!

fixup!

fixup!

fixup!

fixup!

fixup!

update

fixup!

fixup!

fixup!

tpu processing votes in entries before write stage

fixup!

fixup!

txs

make sure validators have an account

fixup!

fixup!

fixup!

exit fullnode correctly

exit on exit not err

try 50

add delay for voting

300

300

startup logs

par start

100

no rayon

retry longer

log leader drop

fix distance

50 nodes

100

handle deserialize error

update

fix broadcast

new table every time

tweaks

table

update

try shuffle table

skip kill

skip add

purge test

fixed tests

rebase 2

fixed tests

fixed rebase

cleanup

ok for blobs to be longer than window

fix init window

60 nodes
2018-07-10 13:32:31 -06:00
Greg Fitzgerald c65c0d9b23 Expose fewer exit variables 2018-07-10 11:11:36 -06:00
Anatoly Yakovenko 63985d4595 renamed to contact_info 2018-07-09 20:40:14 -06:00
Anatoly Yakovenko 2ea030be48 stick all the addrs into one struct 2018-07-09 20:40:14 -06:00
Greg Fitzgerald c4fa841aa9 Remove exit variable from respond [stage]
And drop the sender that feeds input to the responder.
2018-07-05 17:32:41 -06:00
Greg Fitzgerald f284af1c3d Remove exit variable from WindowStage and retransmit [stage] 2018-07-05 17:32:41 -06:00
Greg Fitzgerald 46602ba9c3 Remove exit variable from ReplicateStage 2018-07-05 17:32:41 -06:00
Greg Fitzgerald 77bf17064a Add Service trait
Added a consistent interface to all the microservices.
2018-07-04 16:40:34 -06:00
Greg Fitzgerald 0dabdfd48e Use zero to represent a nonexistent account
This also fixes a bug in the thin client where a nonexistent account
would have triggered a panic because we were using `balances[k]` instead
of `balances.get(key)`.

Fixes #534
2018-07-02 18:48:40 -06:00
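The panic mentioned in commit 0dabdfd48e comes from indexing a map with a missing key. A minimal sketch of the safer lookup, assuming a plain `HashMap` of balances (the function name, key type, and value type are hypothetical):

```rust
use std::collections::HashMap;

/// Hypothetical balance lookup: a nonexistent account reads as zero.
pub fn get_balance(balances: &HashMap<String, i64>, key: &str) -> i64 {
    // `balances[key]` would panic when the key is absent;
    // `get` returns an Option, which is mapped to 0 here.
    balances.get(key).copied().unwrap_or(0)
}
```

With this shape, callers cannot distinguish an empty account from one that was never created, which matches the commit's choice to use zero to represent a nonexistent account.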
Greg Fitzgerald 5d17c2b58f Return output receivers from each stage
Reaching into the stages' structs for their receivers is, in hindsight,
more awkward than returning multiple values from constructors. By
returning the receiver, the caller can name the receiver whatever it
wants (as you would with any return value), and doesn't need to
reach into the struct for the field (which is super awkward in
combination with move semantics).
2018-07-02 16:18:32 -06:00
Stephen Akridge 1c9e7dbc45 Don't recycle in the replicate stage
Windowing stage owns all the blobs now
2018-06-29 07:14:47 -06:00
Rob Walker 2f42658cd4 ... 2018-06-27 14:51:18 -07:00
Greg Fitzgerald 4aedd3f1b6 Cleanup type aliases and imports 2018-06-27 15:06:18 -06:00
Greg Fitzgerald d5c0557891 Fix test_replicate too 2018-06-26 16:51:07 -06:00
Rob Walker 55ec7f9fe9 add entry.has_more
* quick fix for really big genesis
 * longer term fix for possible parallel verification over multiple
      Blobs/Entries
2018-06-26 13:57:10 -07:00
Rob Walker 1919ec247b add a clock to validator windows (part 3 of #309) (#448)
* count entries processed by Bank
 * initialize windows with initial height of Entries
2018-06-25 15:07:48 -07:00
Rob Walker 5e91d31ed3 issue 309 part 1
* limit the number of Entries per Blob to at most one
* limit the number of Transactions per Entry such that an Entry will
    always fit in a Blob

With a one-to-one map of Entries to Blobs, recovery of a validator
  is a simple fast-forward from the end of the initial genesis.log
  and tx-*.logs Entries.

TODO: initialize validators' blob index with initial # of Entries.
2018-06-22 09:58:51 -07:00
Anatoly Yakovenko 586141adb2 Cleanup TVU docs 2018-06-15 22:45:35 -06:00
Greg Fitzgerald 327ee1dae8 Apply feedback from @aeyakovenko 2018-06-15 17:01:38 -06:00
Greg Fitzgerald 22885c3e64 Add TVU ASCII art 2018-06-15 17:01:38 -06:00
anatoly yakovenko c24b0a1a3f
TVU rework (#352)
Refactored TVU, into stages
* blob fetch stage for blobs
* window stage for maintaining the blob window
* pulled out NCP out of the TVU so they can be separate units
TVU is now just the fetch -> window -> request and bank processing
2018-06-13 21:52:23 -07:00
Greg Fitzgerald 7aa05618a3 data_replicator -> ncp
Fixes #327
2018-06-07 17:11:17 -06:00
anatoly yakovenko 216510c573
repair socket and receiver thread (#303)
repair socket and receiver thread
2018-06-02 08:32:51 -07:00
Anatoly Yakovenko cef1c208a5 Crdt pipeline, coalesce window repair requests in the listener by examining all of them at once, and unblock those threads from doing io. 2018-05-30 14:04:48 -06:00
Greg Fitzgerald cf5671d058 tr -> tx
Missed a few.
2018-05-29 10:38:58 -06:00
Greg Fitzgerald 58c1589688 More typos 2018-05-26 00:36:50 -06:00
Greg Fitzgerald fc00594ea4 Move multinode test to integration tests 2018-05-26 00:36:50 -06:00
Greg Fitzgerald 9f5a3d6064 events -> transactions 2018-05-25 16:47:21 -06:00
Greg Fitzgerald 4cdf873f98 Delete event.rs 2018-05-25 16:47:21 -06:00
Greg Fitzgerald 73d3c17507 Migrate from Event to Transaction Timestamp/Signature 2018-05-24 10:10:41 -06:00
Greg Fitzgerald 7f647a93da Add last_id to Event timestamp/signature constructors 2018-05-24 10:10:41 -06:00
Anatoly Yakovenko 87e025fe22 fmt 2018-05-23 12:07:44 -06:00
Anatoly Yakovenko 8049323ca8 @garious review 2018-05-23 12:07:44 -06:00
Anatoly Yakovenko b38c7ea2ff fmt 2018-05-23 12:07:44 -06:00
Anatoly Yakovenko 239b925fb3 woop 2018-05-23 12:07:44 -06:00
Anatoly Yakovenko 60da7f7aaf wip 2018-05-23 12:07:44 -06:00
Anatoly Yakovenko 8646ff4927 refactor wip 2018-05-23 12:07:44 -06:00
Anatoly Yakovenko 437c485e5c cleanup 2018-05-23 12:07:44 -06:00
Greg Fitzgerald abfd7d6951
Merge pull request #234 from sakridge/fix_events_addr
Send events to the right address
2018-05-22 16:59:28 -06:00
Anatoly Yakovenko 021953d59a cleanup 2018-05-22 15:30:46 -07:00
Anatoly Yakovenko bbe89df2ff fmt 2018-05-22 15:18:07 -07:00
Anatoly Yakovenko a638ec5911 builds 2018-05-22 15:17:59 -07:00
Stephen Akridge 8454eb79d0 Send events to the right address and set recv socket timeout 2018-05-22 13:52:50 -07:00
Greg Fitzgerald 6c1f1c2a7a Promote create_entry() to Entry::new() 2018-05-16 23:18:58 -07:00
Greg Fitzgerald 9c62f8d81f Add Event::Transaction constructor 2018-05-16 23:18:58 -07:00
Greg Fitzgerald f7083e0923 Remove transaction processing from RPU and request processing from TVU 2018-05-15 12:15:29 -06:00
Greg Fitzgerald 7e44005a0f Don't do error-prone things in functions that spawn threads 2018-05-15 09:53:51 -06:00
Greg Fitzgerald ee3fb985ea Hoist set_timeout 2018-05-15 09:42:28 -06:00
Greg Fitzgerald 0a46bbe4f9
Merge pull request #219 from garious/add-write-stage
Move write_service and drain_service into new write_stage module
2018-05-14 17:18:04 -06:00
Greg Fitzgerald 81706f2d75 Move write_service and drain_service into new write_stage module 2018-05-14 16:31:31 -06:00
Anatoly Yakovenko 2d635386af rebased 2018-05-14 15:20:41 -07:00
Greg Fitzgerald 7736b9cac6 Boot Alice and Bob from the unit tests 2018-05-14 15:39:34 -06:00
Greg Fitzgerald d2dd005a59 accountant -> bank 2018-05-14 15:33:11 -06:00
Greg Fitzgerald 6e8f99d9b2 Purge EventProcessor 2018-05-14 14:45:29 -06:00