Commit Graph

36 Commits

Author SHA1 Message Date
Greg Fitzgerald 5e703dc70a Free up the term 'replicate' for exclusive use in replicator
Also, align Sockets field names with ContactInfo.
2018-12-10 15:26:43 -07:00
Greg Fitzgerald a8d6c75a24 cargo +nightly fix --features=bpf_c,cuda,erasure,chacha --edition-idioms 2018-12-08 23:19:55 -07:00
Greg Fitzgerald ec5a8141eb cargo fix --edition 2018-12-08 23:19:55 -07:00
Greg Fitzgerald 0a83b17cdd
Upgrade to Rust 1.31.0 (#2052)
* Upgrade to Rust 1.31.0
* Upgrade nightly
* Fix all clippy warnings
* Revert relaxed version check and update
2018-12-07 20:01:28 -07:00
Greg Fitzgerald 97b1156a7a Rename Ncp to GossipService
And BroadcastStage to BroadcastService since it's not included in the
TPU pipeline.
2018-12-06 15:48:19 -07:00
carllin 57a384d6a0
Rocks db window service (#1888)
* Add db_window module for windowing functions from RocksDb

* Replace window with db_window functions in window_service

* Fix tests

* Make note of change in db_window

* Create RocksDb ledger in bin/fullnode

* Make db_ledger functions generic

* Add db_ledger to bin/replicator
2018-11-24 19:32:33 -08:00
Rob Walker 3d113611cc
remove Result<> return from ClusterInfo::new() (#1869)
strip Result<> for ClusterInfo::new()
2018-11-19 11:25:14 -08:00
Michael Vines d96a6b42a5 Move drone into its own crate 2018-11-16 20:42:21 -08:00
Michael Vines 6ac5700f2e Move metrics into its own crate 2018-11-16 15:10:07 -08:00
anatoly yakovenko a41254e18c
Add scalable gossip library (#1546)
* Cluster Replicated Data Store

Separate the data storage and merge strategy from the network IO boundary.
Implement an eager push overlay for transporting recent messages.

Simulation shows fast convergence with 20k nodes.
2018-11-15 13:23:26 -08:00
Rob Walker 6c10458b5b
leader slots in Blobs (#1732)
* add leader slot to Blobs
* remove get_X() methods in favor of X() methods for Blob
* add slot to get_scheduled_leader()
2018-11-07 13:18:14 -08:00
Rob Walker 13bfdde228
remove ledger tail code, WINDOW_SIZE begone (#1617)
* remove WINDOW_SIZE, use window.window_size()
* move ledger tail, redundant with ledger-based repair
2018-10-30 10:05:18 -07:00
Michael Vines e47fcb196b s/solana_program_interface/solana[_-]sdk/g 2018-10-25 12:31:45 -07:00
Pankaj Garg df9ccce5b2
Remove hostname() from calls to metrics as it's expensive operation (#1557) 2018-10-20 06:38:20 -07:00
carllin 0bd1412562
Switch leader scheduler to use PoH ticks instead of Entry height (#1519)
* Add PoH height to process_ledger()

* Moved broadcast_stage Leader Scheduling logic to use Poh height instead of entry_height

* Moved LeaderScheduler logic to PoH in ReplicateStage

* Fix Leader scheduling tests to use PoH instead of entry height

* Change is_leader detection in repair() to use PoH instead of entry height

* Add tests to LeaderScheduler for new functionality

* fix Entry::new and genesis block PoH counts

* Moved LeaderScheduler to PoH ticks

* Cleanup to resolve PR comments
2018-10-18 22:57:48 -07:00
Pankaj Garg 31e779d3f2
Added counters to track more metrics on dashboard (#1535)
- Total number of IP packets TX/RX from all nodes in the testnet
- Last consumed index on validator
- Last transmitted index on leader
2018-10-17 17:32:50 -07:00
Pankaj Garg f6c10d8a2e
Add channel pressure for validator TVU stages (#1509) 2018-10-16 12:54:23 -07:00
carllin 9931ac9780
Leader scheduler plumbing (#1440)
* Added LeaderScheduler module and tests

* plumbing for LeaderScheduler in Fullnode + tests. Add vote processing for active set to ReplicateStage and WriteStage

* Add LeaderScheduler plumbing for Tvu, window, and tests

* Fix bank and switch tests to use new LeaderScheduler

* move leader rotation check from window service to replicate stage

* Add replicate_stage leader rotation exit test

* removed leader scheduler from the window service and associated modules/tests

* Corrected is_leader calculation in repair() function in window.rs

* Integrate LeaderScheduler with write_stage for leader to validator transitions

* Integrated LeaderScheduler with BroadcastStage

* Removed gossip leader rotation from crdt

* Add multi validator, leader test

* Comments and cleanup

* Remove unneeded checks from broadcast stage

* Fix case where a validator/leader need to immediately transition on startup after reading ledger and seeing they are not in the correct role

* Set new leader in validator -> validator transitions

* Clean up for PR comments, refactor LeaderScheduler from process_entry/process_ledger_tail

* Cleaned out LeaderScheduler options, implemented LeaderScheduler strategy that only picks the bootstrap leader to support existing tests, drone/airdrops

* Ignore test_full_leader_validator_network test due to bug where the next leader in line fails to get the last entry before rotation (b/c it hasn't started up yet). Added a test test_dropped_handoff_recovery go track this bug
2018-10-10 16:49:41 -07:00
Greg Fitzgerald 95701114e3 Crdt -> ClusterInfo 2018-10-09 03:49:39 -06:00
Greg Fitzgerald 423e7ebc3f Pacify clippy 2018-09-27 16:21:12 -06:00
Pankaj Garg e10574c64d Remove recycler and it's usage
- The memory usage due to recycler was high, and incrementing with
  time.
2018-09-27 10:42:37 -06:00
jackcmay 9c47e022dc
break dependency of programs on solana core (#1371)
* break dependency of programs on Solana core
2018-09-27 07:49:26 -07:00
Greg Fitzgerald b7ae5b712a Move Pubkey into its own module 2018-09-26 20:40:40 -06:00
Stephen Akridge a5b28349ed Add max entry height to download for replicator 2018-09-26 09:57:22 -07:00
carllin e7383a7e66
Validator to leader (#1303)
* Add check in window_service to exit in checks for leader rotation, and propagate that service exit up to fullnode

* Added logic to shutdown Tvu once ReplicateStage finishes

* Added test for successfully shutting down validator and starting up leader

* Add test for leader validator interaction

* fix streamer to check for exit signal before checking socket again to prevent busy leaders from never returning

* PR comments - Rewrite make_consecutive_blobs() function, revert genesis function change
2018-09-25 15:41:29 -07:00
Rob Walker 304d63623f give replication some time to happen
fixes #1307
2018-09-24 23:57:09 -07:00
Rob Walker a51c2f193e fix Rob and Carl crossing wires 2018-09-21 21:37:25 -07:00
Rob Walker be31da3dce
lastidnotfound step 2: (#1300)
lastidnotfound step 2:
  * move "record stage", aka poh_service into banking stage
  * remove Entry.has_more, is incompatible with leader rotation
  * rewrite entry_next_hash in terms of Poh
  * simplify and unify transaction hashing (no embedded nulls)
  * register_last_entry from banking stage, fixes #1171 (w00t!)
  * new PoH doesn't generate empty ledger entries, so some fixes necessary in 
         multinode tests that rely on that (e.g. giving validators airdrops)
  * make window repair less patient, if we've been waiting for an answer, 
          don't be shy about most recent blobs
   * delete recorder and record stage
   * make more verbost  thin_client error reporting
   * more tracing in window (sigh)
2018-09-21 21:01:13 -07:00
carllin c50ac96f75
Moved deserialization of blobs to entries from replicate_stage to window_service (#1287) 2018-09-21 16:01:24 -07:00
Greg Fitzgerald 6073cd57fa Boot Recycler::recycle() 2018-09-20 17:08:51 -06:00
Anatoly Yakovenko 431692d9d0 Use a Drop trait to keep track of lifetimes for recycled objects.
* Move recycler instances to the point of allocation
* sinks no longer need to call `recycle`
* Remove the recycler arguments from all the apis that no longer need them
2018-09-19 16:59:42 -06:00
Pankaj Garg e142aafca9
Use multiple sockets for receiving blobs on validators (#1228)
* Use multiple sockets for receiving blobs on validators

- The blobs that are broadcasted by leader or retransmitted by peer
  validators are received on replicate_port
- Using reuse_addr/reuse_port, multiple sockets can be opened for
  the same port
- This allows the kernel to queue data to user space app on multiple
  socket queues, preventing over-running one queue
- This helps with reducing packets dropped due to queue over-runs

Fixes #1224

* Fixed failing tests
2018-09-14 16:56:06 -07:00
Michael Vines 4196cf43e8 cargo fmt 2018-09-14 16:37:49 -07:00
Greg Fitzgerald a093d5c809 Fix erasure build 2018-09-10 11:40:26 -06:00
Greg Fitzgerald fc64e1853c Initialize Window, not SharedWindow
Wrap with Arc<RwLock>> when/if needed, no earlier.
2018-09-10 11:40:26 -06:00
Greg Fitzgerald 7f669094de Split window into two modules 2018-09-10 11:40:26 -06:00