Reduce Avalanche redundancy and implement traditional fanout (#4174)

* Reduce Avalanche redundancy and implement traditional fanout

* Revert tiny fanout

* Update diagrams and docs based on review comments
Sagar Dhawan 2019-05-07 13:24:58 -07:00 committed by GitHub
parent 4f3b22d04e
commit 2107e15bd3
8 changed files with 218 additions and 214 deletions


@ -0,0 +1,19 @@
+------------------------------------------------------------------+
| |
| +-----------------+ Neighborhood 0 +-----------------+ |
| | +--------------------->+ | |
| | Validator 1 | | Validator 2 | |
| | +<---------------------+ | |
| +--------+-+------+ +------+-+--------+ |
| | | | | |
| | +-----------------------------+ | | |
| | +------------------------+------+ | |
| | | | | |
+------------------------------------------------------------------+
| | | |
v v v v
+---------+------+---+ +-+--------+---------+
| | | |
| Neighborhood 1 | | Neighborhood 2 |
| | | |
+--------------------+ +--------------------+


@ -0,0 +1,15 @@
+--------------+
| |
+------------+ Leader +------------+
| | | |
| +--------------+ |
v v
+------------+----------------------------------------+------------+
| |
| +-----------------+ Neighborhood 0 +-----------------+ |
| | +--------------------->+ | |
| | Validator 1 | | Validator 2 | |
| | +<---------------------+ | |
| +-----------------+ +-----------------+ |
| |
+------------------------------------------------------------------+


@ -1,28 +1,18 @@
+--------------+
| |
+------------+ Leader +------------+
| | | |
| +--------------+ |
v v
+--------+--------+ +--------+--------+
| +--------------------->+ |
+-----------------+ Validator 1 | | Validator 2 +-------------+
| | +<---------------------+ | |
| +------+-+-+------+ +---+-+-+---------+ |
| | | | | | | |
| | | | | | | |
| +---------------------------------------------+ | | |
| | | | | | | |
| | | | | +----------------------+ | |
| | | | | | | |
| | | | +--------------------------------------------+ |
| | | | | | | |
| | | +----------------------+ | | |
| | | | | | | |
v v v v v v v v
+--------------------+ +--------------------+ +--------------------+ +--------------------+
| | | | | | | |
| Neighborhood 1 | | Neighborhood 2 | | Neighborhood 3 | | Neighborhood 4 |
| | | | | | | |
+--------------------+ +--------------------+ +--------------------+ +--------------------+
+--------------------+
| |
+--------+ Neighborhood 0 +----------+
| | | |
| +--------------------+ |
v v
+---------+----------+ +----------+---------+
| | | |
| Neighborhood 1 | | Neighborhood 2 |
| | | |
+---+-----+----------+ +----------+-----+---+
| | | |
v v v v
+------------------+-+ +-+------------------+ +------------------+-+ +-+------------------+
| | | | | | | |
| Neighborhood 3 | | Neighborhood 4 | | Neighborhood 5 | | Neighborhood 6 |
| | | | | | | |
+--------------------+ +--------------------+ +--------------------+ +--------------------+


@ -5,16 +5,15 @@ broadcast transaction blobs to all nodes in a very quick and efficient manner.
In order to establish the fanout, the cluster divides itself into small
collections of nodes, called *neighborhoods*. Each node is responsible for
sharing any data it receives with the other nodes in its neighborhood, as well
as propagating the data on to a small set of nodes in other neighborhoods.
This way each node only has to communicate with a small number of nodes.
During its slot, the leader node distributes blobs between the validator nodes
in one neighborhood (layer 1). Each validator shares its data within its
neighborhood, but also retransmits the blobs to one node in each of multiple
neighborhoods in the next layer (layer 2). The layer-2 nodes each share their
data with their neighborhood peers, and retransmit to nodes in the next layer,
etc, until all nodes in the cluster have received all the blobs.
<img alt="Two layer cluster" src="img/data-plane.svg" class="center"/>
in the first neighborhood (layer 0). Each validator shares its data within its
neighborhood, but also retransmits the blobs to one node in each of a small set
of neighborhoods in the next layer (layer 1). The layer-1 nodes each share their
data with their neighborhood peers and retransmit to nodes in the next layer,
and so on, until all nodes in the cluster have received all the blobs.
## Neighborhood Assignment - Weighted Selection
@ -23,48 +22,50 @@ cluster is divided into neighborhoods. To achieve this, all the recognized
validator nodes (the TVU peers) are sorted by stake and stored in a list. This
list is then indexed in different ways to figure out neighborhood boundaries and
retransmit peers. For example, the leader will simply select the first nodes to
make up layer 1. These will automatically be the highest stake holders, allowing
the heaviest votes to come back to the leader first. Layer-1 and lower-layer
nodes use the same logic to find their neighbors and lower layer peers.
make up layer 0. These will automatically be the highest stake holders, allowing
the heaviest votes to come back to the leader first. Layer-0 and lower-layer
nodes use the same logic to find their neighbors and next layer peers.
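A minimal sketch of that stake-weighted ordering, assuming a hypothetical `Peer` struct and stake map in place of the real `ContactInfo` list and bank stakes (illustrative only, not the actual `sorted_peers_and_index` implementation):

```rust
use std::collections::HashMap;

// Hypothetical stand-in for ContactInfo; only the id matters here.
#[derive(Clone, Debug, PartialEq)]
struct Peer {
    id: u64,
}

// Sort peers by stake, highest first, so the heaviest stake holders land in
// layer 0 and their votes reach the leader with the fewest hops.
fn sorted_by_stake(peers: &[Peer], stakes: &HashMap<u64, u64>) -> Vec<Peer> {
    let mut sorted = peers.to_vec();
    sorted.sort_by_key(|p| std::cmp::Reverse(*stakes.get(&p.id).unwrap_or(&0)));
    sorted
}
```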
## Layer and Neighborhood Structure
The current leader makes its initial broadcasts to at most `DATA_PLANE_FANOUT`
nodes. If this layer 1 is smaller than the number of nodes in the cluster, then
nodes. If this layer 0 is smaller than the number of nodes in the cluster, then
the data plane fanout mechanism adds layers below. Subsequent layers follow
these constraints to determine layer-capacity: Each neighborhood contains
`NEIGHBORHOOD_SIZE` nodes and each layer may have up to `DATA_PLANE_FANOUT/2`
neighborhoods.
`DATA_PLANE_FANOUT` nodes. Layer-0 starts with one neighborhood of fanout nodes.
The number of nodes in each additional layer grows by a factor of the fanout.
As mentioned above, each node in a layer only has to broadcast its blobs to its
neighbors and to exactly 1 node in each next-layer neighborhood, instead of to
every TVU peer in the cluster. In the default mode, each layer contains
`DATA_PLANE_FANOUT/2` neighborhoods. The retransmit mechanism also supports a
second, `grow`, mode of operation that squares the number of neighborhoods
allowed each layer. This dramatically reduces the number of layers needed to
support a large cluster, but can also have a negative impact on the network
pressure on each node in the lower layers. A good way to think of the default
mode (when `grow` is disabled) is to imagine it as a chain of layers, where the
leader sends blobs to layer-1 and then layer-1 to layer-2 and so on, the `layer
capacities` remain constant, so all layers past layer-2 will have the same
number of nodes until the whole cluster is covered. When `grow` is enabled, this
becomes a traditional fanout where layer-3 will have the square of the number of
nodes in layer-2 and so on.
neighbors and to exactly one node in each of `fanout` next-layer neighborhoods,
instead of to every TVU peer in the cluster. A good way to think about this is:
layer-0 starts with one neighborhood of fanout nodes, layer-1 adds `fanout`
neighborhoods of fanout nodes each, and layer-2 holds
`fanout * number of nodes in layer-1`, and so on.
This way each node only has to communicate with a maximum of `2 * DATA_PLANE_FANOUT - 1` nodes.
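To make the growth concrete, here is a small illustrative sketch (not part of the retransmit code) of how many nodes successive layers can cover at the default fanout of 200:

```rust
// Cumulative number of nodes covered after each layer for a given fanout.
// Layer 0 holds `fanout` nodes; every subsequent layer holds `fanout` times
// as many nodes as the layer above it.
fn cumulative_coverage(fanout: usize, layers: usize) -> Vec<usize> {
    let mut covered = 0;
    let mut layer_size = fanout;
    (0..layers)
        .map(|_| {
            covered += layer_size;
            layer_size *= fanout;
            covered
        })
        .collect()
}

fn main() {
    // With a fanout of 200: layer 0 covers 200 nodes, layers 0-1 cover
    // 40,200 nodes, and layers 0-2 cover 8,040,200 nodes.
    assert_eq!(cumulative_coverage(200, 3), vec![200, 40_200, 8_040_200]);
}
```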
The following diagram shows how the Leader sends blobs with a Fanout of 2 to
Neighborhood 0 in Layer 0 and how the nodes in Neighborhood 0 share their data
with each other.
<img alt="Leader sends blobs to Neighborhood 0 in Layer 0" src="img/data-plane-seeding.svg" class="center"/>
The following diagram shows how Neighborhood 0 fans out to Neighborhoods 1 and 2.
<img alt="Neighborhood 0 Fanout to Neighborhood 1 and 2" src="img/data-plane-fanout.svg" class="center"/>
Finally, the following diagram shows a two layer cluster with a Fanout of 2.
<img alt="Two layer cluster with a Fanout of 2" src="img/data-plane.svg" class="center"/>
#### Configuration Values
`DATA_PLANE_FANOUT` - Determines the size of layer 1. Subsequent
layers have `DATA_PLANE_FANOUT/2` neighborhoods when `grow` is inactive.
`NEIGHBORHOOD_SIZE` - The number of nodes allowed in a neighborhood.
`DATA_PLANE_FANOUT` - Determines the size of layer 0. Subsequent
layers grow by a factor of `DATA_PLANE_FANOUT`.
The number of nodes in a neighborhood is equal to the fanout value.
Neighborhoods will fill to capacity before new ones are added, i.e., if a
neighborhood isn't full, it _must_ be the last one.
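For example (illustrative numbers only): with a fanout of 2 and 5 nodes, layer 0
holds nodes 0 and 1, and layer 1 holds two neighborhoods, a full one with nodes
2 and 3 followed by a trailing, partially filled one containing only node 4.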
`GROW_LAYER_CAPACITY` - Whether or not retransmit should behave like a
_traditional fanout_, i.e if each additional layer should have growing
capacities. When this mode is disabled (default), all layers after layer 1 have
the same capacity, keeping the network pressure on all nodes equal.
Currently, configuration is set when the cluster is launched. In the future,
these parameters may be hosted on-chain, allowing modification on the fly as the
cluster sizes change.
@ -72,13 +73,10 @@ cluster sizes change.
## Neighborhoods
The following diagram shows how two neighborhoods in different layers interact.
What this diagram doesn't capture is that each neighbor actually receives
blobs from one validator per neighborhood above it. This means that, to
cripple a neighborhood, enough nodes (erasure codes +1 per neighborhood) from
the layer above need to fail. Since multiple neighborhoods exist in the upper
layer and a node will receive blobs from a node in each of those neighborhoods,
we'd need a big network failure in the upper layers to end up with incomplete
data.
To cripple a neighborhood, enough nodes (erasure codes +1) from the neighborhood
above need to fail. Since each neighborhood receives blobs from multiple nodes
in a neighborhood in the upper layer, we'd need a big network failure in the upper
layers to end up with incomplete data.
<img alt="Inner workings of a neighborhood"
src="img/data-plane-neighborhood.svg" class="center"/>


@ -1,7 +1,7 @@
//! A stage to broadcast data from a leader node to validators
//!
use crate::blocktree::Blocktree;
use crate::cluster_info::{ClusterInfo, ClusterInfoError, NEIGHBORHOOD_SIZE};
use crate::cluster_info::{ClusterInfo, ClusterInfoError, DATA_PLANE_FANOUT};
use crate::entry::EntrySlice;
use crate::erasure::CodingGenerator;
use crate::packet::index_blobs_with_genesis;
@ -78,7 +78,7 @@ impl Broadcast {
);
inc_new_counter_info!("broadcast_service-num_peers", broadcast_table.len() + 1);
// Layer 1, leader nodes are limited to the fanout size.
broadcast_table.truncate(NEIGHBORHOOD_SIZE);
broadcast_table.truncate(DATA_PLANE_FANOUT);
inc_new_counter_info!("broadcast_service-entries_received", num_entries);


@ -51,11 +51,8 @@ use std::time::{Duration, Instant};
pub const FULLNODE_PORT_RANGE: PortRange = (8000, 10_000);
/// The Data plane "neighborhood" size
pub const NEIGHBORHOOD_SIZE: usize = 200;
/// Set whether node capacity should grow as layers are added
pub const GROW_LAYER_CAPACITY: bool = false;
/// The Data plane fanout size, also used as the neighborhood size
pub const DATA_PLANE_FANOUT: usize = 200;
/// milliseconds we sleep for between gossip requests
pub const GOSSIP_SLEEP_MILLIS: u64 = 100;
@ -91,17 +88,17 @@ pub struct Locality {
/// The bounds of the current layer
pub layer_bounds: (usize, usize),
/// The bounds of the next layer
pub child_layer_bounds: Option<(usize, usize)>,
pub next_layer_bounds: Option<(usize, usize)>,
/// The indices of the nodes that should be contacted in the next layer
pub child_layer_peers: Vec<usize>,
pub next_layer_peers: Vec<usize>,
}
impl fmt::Debug for Locality {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(
f,
"Packet {{ neighborhood_bounds: {:?}, current_layer: {:?}, child_layer_bounds: {:?} child_layer_peers: {:?} }}",
self.neighbor_bounds, self.layer_ix, self.child_layer_bounds, self.child_layer_peers
"Locality {{ neighborhood_bounds: {:?}, current_layer: {:?}, child_layer_bounds: {:?} child_layer_peers: {:?} }}",
self.neighbor_bounds, self.layer_ix, self.next_layer_bounds, self.next_layer_peers
)
}
}
@ -484,16 +481,8 @@ impl ClusterInfo {
.collect()
}
/// Given a node count, neighborhood size, and an initial fanout (leader -> layer 1), it
/// calculates how many layers are needed and at what index each layer begins.
/// The `grow` parameter is used to determine if the network should 'fanout' or keep
/// layer capacities constant.
pub fn describe_data_plane(
nodes: usize,
fanout: usize,
hood_size: usize,
grow: bool,
) -> (usize, Vec<usize>) {
/// Given a node count and fanout, it calculates how many layers are needed and at what index each layer begins.
pub fn describe_data_plane(nodes: usize, fanout: usize) -> (usize, Vec<usize>) {
let mut layer_indices: Vec<usize> = vec![0];
if nodes == 0 {
(0, vec![])
@ -505,8 +494,8 @@ impl ClusterInfo {
let mut remaining_nodes = nodes - fanout;
layer_indices.push(fanout);
let mut num_layers = 2;
let mut num_neighborhoods = fanout / 2;
let mut layer_capacity = hood_size * num_neighborhoods;
// fanout * num_nodes in a neighborhood, which is also fanout.
let mut layer_capacity = fanout * fanout;
while remaining_nodes > 0 {
if remaining_nodes > layer_capacity {
// Needs more layers.
@ -515,11 +504,8 @@ impl ClusterInfo {
let end = *layer_indices.last().unwrap();
layer_indices.push(layer_capacity + end);
if grow {
// Next layer's capacity
num_neighborhoods *= num_neighborhoods;
layer_capacity = hood_size * num_neighborhoods;
}
// Next layer's capacity
layer_capacity *= fanout;
} else {
//everything will now fit in the layers we have
let end = *layer_indices.last().unwrap();
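As a quick feel for the new layer math, the following recap mirrors the expectations in `test_describe_data_plane` further down (illustrative only, not an additional test):

```rust
// 10 nodes with a fanout of 2 need 3 layers; 10,000 nodes with a fanout of 10
// need 4 layers, and the gap between successive layer indices grows by the
// fanout each time (10, 100, 1_000, ...).
let (layers, indices) = ClusterInfo::describe_data_plane(10_000, 10);
assert_eq!(layers, 4);
assert_eq!(indices[1], 10);
assert_eq!(indices[2] - indices[1], 100);
assert_eq!(indices[3] - indices[2], 1_000);
```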
@ -534,61 +520,64 @@ impl ClusterInfo {
fn localize_item(
layer_indices: &[usize],
hood_size: usize,
fanout: usize,
select_index: usize,
curr_index: usize,
) -> Option<(Locality)> {
let end = layer_indices.len() - 1;
let next = min(end, curr_index + 1);
let value = layer_indices[curr_index];
let localized = select_index >= value && select_index < layer_indices[next];
let mut locality = Locality::default();
let layer_start = layer_indices[curr_index];
// localized if selected index lies within the current layer's bounds
let localized = select_index >= layer_start && select_index < layer_indices[next];
if localized {
let mut locality = Locality::default();
let hood_ix = (select_index - layer_start) / fanout;
match curr_index {
_ if curr_index == 0 => {
locality.layer_ix = 0;
locality.layer_bounds = (0, hood_size);
locality.layer_bounds = (0, fanout);
locality.neighbor_bounds = locality.layer_bounds;
if next == end {
locality.child_layer_bounds = None;
locality.child_layer_peers = vec![];
locality.next_layer_bounds = None;
locality.next_layer_peers = vec![];
} else {
locality.child_layer_bounds =
locality.next_layer_bounds =
Some((layer_indices[next], layer_indices[next + 1]));
locality.child_layer_peers = ClusterInfo::lower_layer_peers(
locality.next_layer_peers = ClusterInfo::next_layer_peers(
select_index,
hood_ix,
layer_indices[next],
layer_indices[next + 1],
hood_size,
fanout,
);
}
}
_ if curr_index == end => {
locality.layer_ix = end;
locality.layer_bounds = (end - hood_size, end);
locality.layer_bounds = (end - fanout, end);
locality.neighbor_bounds = locality.layer_bounds;
locality.child_layer_bounds = None;
locality.child_layer_peers = vec![];
locality.next_layer_bounds = None;
locality.next_layer_peers = vec![];
}
ix => {
let hood_ix = (select_index - value) / hood_size;
locality.layer_ix = ix;
locality.layer_bounds = (value, layer_indices[next]);
locality.layer_bounds = (layer_start, layer_indices[next]);
locality.neighbor_bounds = (
((hood_ix * hood_size) + value),
((hood_ix + 1) * hood_size + value),
((hood_ix * fanout) + layer_start),
((hood_ix + 1) * fanout + layer_start),
);
if next == end {
locality.child_layer_bounds = None;
locality.child_layer_peers = vec![];
locality.next_layer_bounds = None;
locality.next_layer_peers = vec![];
} else {
locality.child_layer_bounds =
locality.next_layer_bounds =
Some((layer_indices[next], layer_indices[next + 1]));
locality.child_layer_peers = ClusterInfo::lower_layer_peers(
locality.next_layer_peers = ClusterInfo::next_layer_peers(
select_index,
hood_ix,
layer_indices[next],
layer_indices[next + 1],
hood_size,
fanout,
);
}
}
@ -599,19 +588,25 @@ impl ClusterInfo {
}
}
/// Given an array of layer indices and another index, returns (as a `Locality`) the layer,
/// layer-bounds and neighborhood-bounds in which the index resides
fn localize(layer_indices: &[usize], hood_size: usize, select_index: usize) -> Locality {
/// Given an array of layer indices and an index of interest, returns (as a `Locality`) the layer,
/// layer-bounds, and neighborhood-bounds in which the index resides
fn localize(layer_indices: &[usize], fanout: usize, select_index: usize) -> Locality {
(0..layer_indices.len())
.find_map(|i| ClusterInfo::localize_item(layer_indices, hood_size, select_index, i))
.find_map(|i| ClusterInfo::localize_item(layer_indices, fanout, select_index, i))
.or_else(|| Some(Locality::default()))
.unwrap()
}
fn lower_layer_peers(index: usize, start: usize, end: usize, hood_size: usize) -> Vec<usize> {
/// Selects a range in the next layer and chooses nodes from that range as peers for the given index
fn next_layer_peers(index: usize, hood_ix: usize, start: usize, fanout: usize) -> Vec<usize> {
// Each neighborhood is only tasked with pushing to `fanout` neighborhoods where each neighborhood contains `fanout` nodes.
let fanout_nodes = fanout * fanout;
// Skip first N nodes, where N is hood_ix * (fanout_nodes)
let start = start + (hood_ix * fanout_nodes);
let end = start + fanout_nodes;
(start..end)
.step_by(hood_size)
.map(|x| x + index % hood_size)
.step_by(fanout)
.map(|x| x + index % fanout)
.collect()
}
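To illustrate the index arithmetic above, here is a hypothetical free-standing copy of the logic (for illustration only, not part of the change) showing which next-layer indices a node picks with a toy fanout of 2:

```rust
// Mirrors next_layer_peers: each neighborhood owns a block of fanout * fanout
// next-layer nodes; one target is picked per child neighborhood, staggered by
// `index % fanout` so nodes in the same hood pick different targets.
fn next_layer_peers_sketch(index: usize, hood_ix: usize, start: usize, fanout: usize) -> Vec<usize> {
    let fanout_nodes = fanout * fanout;
    let start = start + hood_ix * fanout_nodes;
    let end = start + fanout_nodes;
    (start..end).step_by(fanout).map(|x| x + index % fanout).collect()
}

fn main() {
    // Fanout 2, layer 0's neighborhood is hood 0, and the next layer starts at 2.
    // Node 1 targets one node in each of its two child hoods: indices 3 and 5.
    assert_eq!(next_layer_peers_sketch(1, 0, 2, 2), vec![3, 5]);
    // Its neighbor, node 0, covers the other node in each child hood.
    assert_eq!(next_layer_peers_sketch(0, 0, 2, 2), vec![2, 4]);
}
```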
@ -1427,31 +1422,28 @@ impl ClusterInfo {
/// 1.1 - If yes, then broadcast to all layer 1 nodes
/// 1 - using the layer 1 index, broadcast to all layer 2 nodes assuming you know neighborhood size
/// 1.2 - If no, then figure out what layer the node is in and who the neighbors are and only broadcast to them
/// 1 - also check if there are nodes in lower layers and repeat the layer 1 to layer 2 logic
/// 1 - also check if there are nodes in the next layer and repeat the layer 1 to layer 2 logic
/// Returns Neighbor Nodes and Children Nodes `(neighbors, children)` for a given node based on its stake (Bank Balance)
pub fn compute_retransmit_peers<S: std::hash::BuildHasher>(
stakes: &HashMap<Pubkey, u64, S>,
cluster_info: &Arc<RwLock<ClusterInfo>>,
fanout: usize,
hood_size: usize,
grow: bool,
) -> (Vec<ContactInfo>, Vec<ContactInfo>) {
let (my_index, peers) = cluster_info.read().unwrap().sorted_peers_and_index(stakes);
//calc num_layers and num_neighborhoods using the total number of nodes
let (num_layers, layer_indices) =
ClusterInfo::describe_data_plane(peers.len(), fanout, hood_size, grow);
let (num_layers, layer_indices) = ClusterInfo::describe_data_plane(peers.len(), fanout);
if num_layers <= 1 {
/* single layer data plane */
(peers, vec![])
} else {
//find my layer
let locality = ClusterInfo::localize(&layer_indices, hood_size, my_index);
let locality = ClusterInfo::localize(&layer_indices, fanout, my_index);
let upper_bound = cmp::min(locality.neighbor_bounds.1, peers.len());
let neighbors = peers[locality.neighbor_bounds.0..upper_bound].to_vec();
let mut children = Vec::new();
for ix in locality.child_layer_peers {
for ix in locality.next_layer_peers {
if let Some(peer) = peers.get(ix) {
children.push(peer.clone());
continue;
@ -2043,78 +2035,72 @@ mod tests {
assert!(val.verify());
}
fn num_layers(nodes: usize, fanout: usize, hood_size: usize, grow: bool) -> usize {
ClusterInfo::describe_data_plane(nodes, fanout, hood_size, grow).0
fn num_layers(nodes: usize, fanout: usize) -> usize {
ClusterInfo::describe_data_plane(nodes, fanout).0
}
#[test]
fn test_describe_data_plane() {
// no nodes
assert_eq!(num_layers(0, 200, 200, false), 0);
assert_eq!(num_layers(0, 200), 0);
// 1 node
assert_eq!(num_layers(1, 200, 200, false), 1);
assert_eq!(num_layers(1, 200), 1);
// 10 nodes with fanout of 2 and hood size of 2
assert_eq!(num_layers(10, 2, 2, false), 5);
// 10 nodes with fanout of 2
assert_eq!(num_layers(10, 2), 3);
// fanout + 1 nodes with fanout of 2 and hood size of 2
assert_eq!(num_layers(3, 2, 2, false), 2);
// 10 nodes with fanout of 4 and hood size of 2 while growing
assert_eq!(num_layers(10, 4, 2, true), 3);
// fanout + 1 nodes with fanout of 2
assert_eq!(num_layers(3, 2), 2);
// A little more realistic
assert_eq!(num_layers(100, 10, 10, false), 3);
assert_eq!(num_layers(100, 10), 2);
// A little more realistic with odd numbers
assert_eq!(num_layers(103, 13, 13, false), 3);
assert_eq!(num_layers(103, 13), 2);
// A little more realistic with just enough for 3 layers
assert_eq!(num_layers(111, 10), 3);
// larger
let (layer_cnt, layer_indices) = ClusterInfo::describe_data_plane(10_000, 10, 10, false);
assert_eq!(layer_cnt, 201);
// distances between index values should be the same since we aren't growing.
let capacity = 10 / 2 * 10;
let (layer_cnt, layer_indices) = ClusterInfo::describe_data_plane(10_000, 10);
assert_eq!(layer_cnt, 4);
// distances between index values should increase by `fanout` for every layer.
let mut capacity = 10 * 10;
assert_eq!(layer_indices[1], 10);
layer_indices[1..layer_indices.len()]
.chunks(2)
.for_each(|x| {
if x.len() == 2 {
assert_eq!(x[1] - x[0], capacity);
}
});
layer_indices[1..].windows(2).for_each(|x| {
if x.len() == 2 {
assert_eq!(x[1] - x[0], capacity);
capacity *= 10;
}
});
// massive
let (layer_cnt, layer_indices) = ClusterInfo::describe_data_plane(500_000, 200, 200, false);
let capacity = 200 / 2 * 200;
let cnt = 500_000 / capacity + 1;
assert_eq!(layer_cnt, cnt);
// distances between index values should be the same since we aren't growing.
let (layer_cnt, layer_indices) = ClusterInfo::describe_data_plane(500_000, 200);
let mut capacity = 200 * 200;
assert_eq!(layer_cnt, 3);
// distances between index values should increase by `fanout` for every layer.
assert_eq!(layer_indices[1], 200);
layer_indices[1..layer_indices.len()]
.chunks(2)
.for_each(|x| {
if x.len() == 2 {
assert_eq!(x[1] - x[0], capacity);
}
});
layer_indices[1..].windows(2).for_each(|x| {
if x.len() == 2 {
assert_eq!(x[1] - x[0], capacity);
capacity *= 200;
}
});
let total_capacity: usize = *layer_indices.last().unwrap();
assert!(total_capacity >= 500_000);
// massive with growth
assert_eq!(num_layers(500_000, 200, 200, true), 3);
}
#[test]
fn test_localize() {
// go for gold
let (_, layer_indices) = ClusterInfo::describe_data_plane(500_000, 200, 200, false);
let (_, layer_indices) = ClusterInfo::describe_data_plane(500_000, 200);
let mut me = 0;
let mut layer_ix = 0;
let locality = ClusterInfo::localize(&layer_indices, 200, me);
assert_eq!(locality.layer_ix, layer_ix);
assert_eq!(
locality.child_layer_bounds,
locality.next_layer_bounds,
Some((layer_indices[layer_ix + 1], layer_indices[layer_ix + 2]))
);
me = 201;
@ -2126,11 +2112,11 @@ mod tests {
layer_indices[layer_ix]
);
assert_eq!(
locality.child_layer_bounds,
locality.next_layer_bounds,
Some((layer_indices[layer_ix + 1], layer_indices[layer_ix + 2]))
);
me = 20_201;
layer_ix = 2;
me = 20_000;
layer_ix = 1;
let locality = ClusterInfo::localize(&layer_indices, 200, me);
assert_eq!(
locality.layer_ix, layer_ix,
@ -2138,13 +2124,13 @@ mod tests {
layer_indices[layer_ix]
);
assert_eq!(
locality.child_layer_bounds,
locality.next_layer_bounds,
Some((layer_indices[layer_ix + 1], layer_indices[layer_ix + 2]))
);
// test no child layer since last layer should have massive capacity
let (_, layer_indices) = ClusterInfo::describe_data_plane(500_000, 200, 200, true);
me = 20_201;
let (_, layer_indices) = ClusterInfo::describe_data_plane(500_000, 200);
me = 40_201;
layer_ix = 2;
let locality = ClusterInfo::localize(&layer_indices, 200, me);
assert_eq!(
@ -2152,23 +2138,23 @@ mod tests {
"layer_indices[layer_ix] is actually {}",
layer_indices[layer_ix]
);
assert_eq!(locality.child_layer_bounds, None);
assert_eq!(locality.next_layer_bounds, None);
}
#[test]
fn test_localize_child_peer_overlap() {
let (_, layer_indices) = ClusterInfo::describe_data_plane(500_000, 200, 200, false);
let (_, layer_indices) = ClusterInfo::describe_data_plane(500_000, 200);
let last_ix = layer_indices.len() - 1;
// sample every 33 pairs to reduce test time
for x in (0..*layer_indices.get(last_ix - 2).unwrap()).step_by(33) {
let me_locality = ClusterInfo::localize(&layer_indices, 200, x);
let buddy_locality = ClusterInfo::localize(&layer_indices, 200, x + 1);
assert!(!me_locality.child_layer_peers.is_empty());
assert!(!buddy_locality.child_layer_peers.is_empty());
assert!(!me_locality.next_layer_peers.is_empty());
assert!(!buddy_locality.next_layer_peers.is_empty());
me_locality
.child_layer_peers
.next_layer_peers
.iter()
.zip(buddy_locality.child_layer_peers.iter())
.zip(buddy_locality.next_layer_peers.iter())
.for_each(|(x, y)| assert_ne!(x, y));
}
}
@ -2177,12 +2163,12 @@ mod tests {
fn test_network_coverage() {
// pretend to be each node in a scaled down network and make sure the set of all the broadcast peers
// includes every node in the network.
let (_, layer_indices) = ClusterInfo::describe_data_plane(25_000, 10, 10, false);
let (_, layer_indices) = ClusterInfo::describe_data_plane(25_000, 10);
let mut broadcast_set = HashSet::new();
for my_index in 0..25_000 {
let my_locality = ClusterInfo::localize(&layer_indices, 10, my_index);
broadcast_set.extend(my_locality.neighbor_bounds.0..my_locality.neighbor_bounds.1);
broadcast_set.extend(my_locality.child_layer_peers);
broadcast_set.extend(my_locality.next_layer_peers);
}
for i in 0..25_000 {


@ -2,9 +2,7 @@
use crate::bank_forks::BankForks;
use crate::blocktree::Blocktree;
use crate::cluster_info::{
compute_retransmit_peers, ClusterInfo, GROW_LAYER_CAPACITY, NEIGHBORHOOD_SIZE,
};
use crate::cluster_info::{compute_retransmit_peers, ClusterInfo, DATA_PLANE_FANOUT};
use crate::leader_schedule_cache::LeaderScheduleCache;
use crate::result::{Error, Result};
use crate::service::Service;
@ -45,9 +43,7 @@ fn retransmit(
let (neighbors, children) = compute_retransmit_peers(
&staking_utils::delegated_stakes_at_epoch(&r_bank, bank_epoch).unwrap(),
cluster_info,
NEIGHBORHOOD_SIZE,
NEIGHBORHOOD_SIZE,
GROW_LAYER_CAPACITY,
DATA_PLANE_FANOUT,
);
for blob in &blobs {
let leader = leader_schedule_cache


@ -1,9 +1,7 @@
use hashbrown::{HashMap, HashSet};
use rayon::iter::{IntoParallelIterator, ParallelIterator};
use rayon::prelude::*;
use solana::cluster_info::{
compute_retransmit_peers, ClusterInfo, GROW_LAYER_CAPACITY, NEIGHBORHOOD_SIZE,
};
use solana::cluster_info::{compute_retransmit_peers, ClusterInfo};
use solana::contact_info::ContactInfo;
use solana_sdk::pubkey::Pubkey;
use std::sync::mpsc::channel;
@ -28,7 +26,7 @@ fn find_insert_blob(id: &Pubkey, blob: i32, batches: &mut [Nodes]) {
});
}
fn run_simulation(stakes: &[u64], fanout: usize, hood_size: usize) {
fn run_simulation(stakes: &[u64], fanout: usize) {
let num_threads = num_threads();
// set timeout to 5 minutes
let timeout = 60 * 5;
@ -100,15 +98,17 @@ fn run_simulation(stakes: &[u64], fanout: usize, hood_size: usize) {
> = HashMap::new();
while remaining > 0 {
for (id, (recv, r)) in batch.iter_mut() {
assert!(now.elapsed().as_secs() < timeout, "Timed out");
assert!(
now.elapsed().as_secs() < timeout,
"Timed out with {:?} remaining nodes",
remaining
);
cluster.gossip.set_self(&*id);
if !mapped_peers.contains_key(id) {
let (neighbors, children) = compute_retransmit_peers(
&staked_nodes,
&Arc::new(RwLock::new(cluster.clone())),
fanout,
hood_size,
GROW_LAYER_CAPACITY,
);
let vec_children: Vec<_> = children
.iter()
@ -172,30 +172,30 @@ fn run_simulation(stakes: &[u64], fanout: usize, hood_size: usize) {
// Run with a single layer
#[test]
fn test_retransmit_small() {
let stakes: Vec<_> = (0..NEIGHBORHOOD_SIZE as u64).map(|i| i).collect();
run_simulation(&stakes, NEIGHBORHOOD_SIZE, NEIGHBORHOOD_SIZE);
let stakes: Vec<_> = (0..200).map(|i| i).collect();
run_simulation(&stakes, 200);
}
// Make sure at least 2 layers are used
#[test]
fn test_retransmit_medium() {
let num_nodes = NEIGHBORHOOD_SIZE as u64 * 10;
let num_nodes = 2000;
let stakes: Vec<_> = (0..num_nodes).map(|i| i).collect();
run_simulation(&stakes, NEIGHBORHOOD_SIZE, NEIGHBORHOOD_SIZE);
run_simulation(&stakes, 200);
}
// Make sure at least 2 layers are used but with equal stakes
#[test]
fn test_retransmit_medium_equal_stakes() {
let num_nodes = NEIGHBORHOOD_SIZE as u64 * 10;
let num_nodes = 2000;
let stakes: Vec<_> = (0..num_nodes).map(|_| 10).collect();
run_simulation(&stakes, NEIGHBORHOOD_SIZE, NEIGHBORHOOD_SIZE);
run_simulation(&stakes, 200);
}
// Scale down the network and make sure at least 3 layers are used
// Scale down the network and make sure many layers are used
#[test]
fn test_retransmit_large() {
let num_nodes = NEIGHBORHOOD_SIZE as u64 * 20;
let num_nodes = 4000;
let stakes: Vec<_> = (0..num_nodes).map(|i| i).collect();
run_simulation(&stakes, NEIGHBORHOOD_SIZE / 10, NEIGHBORHOOD_SIZE / 10);
run_simulation(&stakes, 2);
}