p2p docs
parent
008de93bbe
commit
1acb12edf5
119
p2p/README.md
|
@ -4,119 +4,10 @@
|
|||
|
||||
`tendermint/tendermint/p2p` provides an abstraction around peer-to-peer communication.
|
||||
|
||||
See:

- [docs/connection] for details on how connections and multiplexing work
- [docs/peer] for details on peer ID, handshakes, and peer exchange
- [docs/node] for details about different types of nodes and how they should work
- [docs/reputation] for details on how peer reputation is managed

## MConnection

`MConnection` is a multiplex connection:
|
||||
|
||||
__multiplex__ *noun* a system or signal involving simultaneous transmission of
|
||||
several messages along a single channel of communication.
|
||||
|
||||
Each `MConnection` handles message transmission on multiple abstract communication
|
||||
`Channel`s. Each channel has a globally unique byte id.
|
||||
The byte id and the relative priorities of each `Channel` are configured upon
|
||||
initialization of the connection.
|
||||
|
||||
The `MConnection` supports three packet types: Ping, Pong, and Msg.
|
||||
|
||||
### Ping and Pong
|
||||
|
||||
The ping and pong messages each consist of a single byte written to the connection: 0x1 and 0x2, respectively.

When we haven't received any messages on an `MConnection` for a duration of `pingTimeout`, we send a ping message.
|
||||
When a ping is received on the `MConnection`, a pong is sent in response.
|
||||
|
||||
If a pong is not received in sufficient time, the peer's score should be decremented (TODO).
|
||||
|
||||
### Msg
|
||||
|
||||
Messages in channels are chopped into smaller msgPackets for multiplexing.
|
||||
|
||||
```
|
||||
type msgPacket struct {
|
||||
ChannelID byte
|
||||
EOF byte // 1 means message ends here.
|
||||
Bytes []byte
|
||||
}
|
||||
```
|
||||
|
||||
The msgPacket is serialized using go-wire, and prefixed with a 0x3.
|
||||
The received `Bytes` of a sequential set of packets are appended together
|
||||
until a packet with `EOF=1` is received, at which point the complete serialized message
|
||||
is returned for processing by the corresponding channel's `onReceive` function.
|
||||
|
||||
### Multiplexing
|
||||
|
||||
Messages are sent from a single `sendRoutine`, which loops over a select statement that results in the sending
|
||||
of a ping, a pong, or a batch of data messages. The batch of data messages may include messages from multiple channels.
|
||||
Message bytes are queued for sending in their respective channel, with each channel holding one unsent message at a time.
|
||||
Messages are chosen for a batch one at a time from the channel with the lowest ratio of recently sent bytes to channel priority.
|
||||
|
||||
## Sending Messages
|
||||
|
||||
There are two methods for sending messages:
|
||||
```go
|
||||
func (m MConnection) Send(chID byte, msg interface{}) bool {}
|
||||
func (m MConnection) TrySend(chID byte, msg interface{}) bool {}
|
||||
```
|
||||
|
||||
`Send(chID, msg)` is a blocking call that waits until `msg` is successfully queued
|
||||
for the channel with the given id byte `chID`. The message `msg` is serialized
|
||||
using the `tendermint/wire` submodule's `WriteBinary()` reflection routine.
|
||||
|
||||
`TrySend(chID, msg)` is a nonblocking call that returns false if the channel's
|
||||
queue is full.
|
||||
|
||||
`Send()` and `TrySend()` are also exposed for each `Peer`.
|
||||
|
||||
## Peer
|
||||
|
||||
Each peer has one `MConnection` instance, and includes other information such as whether the connection
|
||||
was outbound, whether the connection should be recreated if it closes, various identity information about the node,
|
||||
and other higher level thread-safe data used by the reactors.
|
||||
|
||||
## Switch/Reactor
|
||||
|
||||
The `Switch` handles peer connections and exposes an API to receive incoming messages
|
||||
on `Reactors`. Each `Reactor` is responsible for handling incoming messages of one
|
||||
or more `Channels`. So while sending outgoing messages is typically performed on the peer,
|
||||
incoming messages are received on the reactor.
|
||||
|
||||
```go
|
||||
// Declare a MyReactor reactor that handles messages on MyChannelID.
|
||||
type MyReactor struct{}
|
||||
|
||||
func (reactor MyReactor) GetChannels() []*ChannelDescriptor {
|
||||
return []*ChannelDescriptor{&ChannelDescriptor{ID: MyChannelID, Priority: 1}}
|
||||
}
|
||||
|
||||
func (reactor MyReactor) Receive(chID byte, peer *Peer, msgBytes []byte) {
|
||||
r, n, err := bytes.NewBuffer(msgBytes), new(int64), new(error)
|
||||
msgString := ReadString(r, n, err)
|
||||
fmt.Println(msgString)
|
||||
}
|
||||
|
||||
// Other Reactor methods omitted for brevity
|
||||
...
|
||||
|
||||
// `switch` is a reserved word in Go, so the variable is named `sw`.
sw := NewSwitch([]Reactor{MyReactor{}})

...

// Send a message to all outbound connections
for _, peer := range sw.Peers().List() {
|
||||
if peer.IsOutbound() {
|
||||
peer.Send(MyChannelID, "Here's a random message")
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### PexReactor/AddrBook
|
||||
|
||||
A `PEXReactor` reactor implementation is provided to automate peer discovery.
|
||||
|
||||
```go
|
||||
book := p2p.NewAddrBook(addrBookFilePath)
|
||||
pexReactor := p2p.NewPEXReactor(book)
|
||||
...
|
||||
sw := NewSwitch([]Reactor{pexReactor, myReactor, ...})
|
||||
```
|
||||
|
|
|
@ -0,0 +1,116 @@
|
|||
## MConnection
|
||||
|
||||
`MConnection` is a multiplex connection:
|
||||
|
||||
__multiplex__ *noun* a system or signal involving simultaneous transmission of
|
||||
several messages along a single channel of communication.
|
||||
|
||||
Each `MConnection` handles message transmission on multiple abstract communication
|
||||
`Channel`s. Each channel has a globally unique byte id.
|
||||
The byte id and the relative priorities of each `Channel` are configured upon
|
||||
initialization of the connection.
|
||||
|
||||
The `MConnection` supports three packet types: Ping, Pong, and Msg.
|
||||
|
||||
### Ping and Pong
|
||||
|
||||
The ping and pong messages each consist of a single byte written to the connection: 0x1 and 0x2, respectively.

When we haven't received any messages on an `MConnection` for a duration of `pingTimeout`, we send a ping message.
|
||||
When a ping is received on the `MConnection`, a pong is sent in response.
|
||||
|
||||
If a pong is not received in sufficient time, the peer's score should be decremented (TODO).
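The packet types can be sketched as a one-byte dispatch. This is a minimal illustration, not the package's code: the constant names and the `handlePacketType` helper are hypothetical, and only the byte values (0x1 for ping, 0x2 for pong, 0x3 as the msgPacket prefix) come from this document.

```go
package main

import "fmt"

// Byte values as described in this document; the names are assumptions.
const (
	packetTypePing byte = 0x01
	packetTypePong byte = 0x02
	packetTypeMsg  byte = 0x03
)

// handlePacketType is a hypothetical helper. It returns the bytes to write
// back (a pong in response to a ping), whether a msgPacket follows and must
// be decoded, and an error for unknown packet types.
func handlePacketType(b byte) (reply []byte, isMsg bool, err error) {
	switch b {
	case packetTypePing:
		return []byte{packetTypePong}, false, nil
	case packetTypePong:
		// Nothing to send; the caller would reset its pong timer here.
		return nil, false, nil
	case packetTypeMsg:
		return nil, true, nil
	default:
		return nil, false, fmt.Errorf("unknown packet type %#x", b)
	}
}

func main() {
	for _, b := range []byte{packetTypePing, packetTypePong, packetTypeMsg, 0x7f} {
		reply, isMsg, err := handlePacketType(b)
		fmt.Printf("type=%#x reply=%v isMsg=%v err=%v\n", b, reply, isMsg, err)
	}
}
```
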
|
||||
|
||||
### Msg
|
||||
|
||||
Messages in channels are chopped into smaller msgPackets for multiplexing.
|
||||
|
||||
```
|
||||
type msgPacket struct {
|
||||
ChannelID byte
|
||||
EOF byte // 1 means message ends here.
|
||||
Bytes []byte
|
||||
}
|
||||
```
|
||||
|
||||
The msgPacket is serialized using go-wire, and prefixed with a 0x3.
|
||||
The received `Bytes` of a sequential set of packets are appended together
|
||||
until a packet with `EOF=1` is received, at which point the complete serialized message
|
||||
is returned for processing by the corresponding channel's `onReceive` function.
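The reassembly rule above can be sketched as follows. The `reassembler` type and its `recvPacket` method are hypothetical names used for illustration; go-wire decoding and the 0x3 prefix are left out, and a per-channel buffer is assumed.

```go
package main

import "fmt"

// msgPacket mirrors the struct shown above.
type msgPacket struct {
	ChannelID byte
	EOF       byte // 1 means message ends here.
	Bytes     []byte
}

// reassembler buffers partial messages per channel. It is a hypothetical
// helper for illustration, not part of the p2p package.
type reassembler struct {
	recving map[byte][]byte
}

func newReassembler() *reassembler {
	return &reassembler{recving: make(map[byte][]byte)}
}

// recvPacket appends the packet's bytes to the buffer of its channel. When
// the packet carries EOF=1 it returns the complete message and clears the
// buffer; otherwise it returns (nil, false).
func (r *reassembler) recvPacket(p msgPacket) ([]byte, bool) {
	r.recving[p.ChannelID] = append(r.recving[p.ChannelID], p.Bytes...)
	if p.EOF != 1 {
		return nil, false
	}
	msg := r.recving[p.ChannelID]
	delete(r.recving, p.ChannelID)
	return msg, true
}

func main() {
	r := newReassembler()
	// Packets from two channels interleaved on one connection.
	packets := []msgPacket{
		{ChannelID: 0x20, EOF: 0, Bytes: []byte("hello, ")},
		{ChannelID: 0x21, EOF: 1, Bytes: []byte("other channel")},
		{ChannelID: 0x20, EOF: 1, Bytes: []byte("world")},
	}
	for _, p := range packets {
		if msg, done := r.recvPacket(p); done {
			fmt.Printf("channel %#x: %s\n", p.ChannelID, msg)
		}
	}
}
```

In the real connection the completed bytes would be handed to the channel's `onReceive` function rather than printed.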
|
||||
|
||||
### Multiplexing
|
||||
|
||||
Messages are sent from a single `sendRoutine`, which loops over a select statement that results in the sending
|
||||
of a ping, a pong, or a batch of data messages. The batch of data messages may include messages from multiple channels.
|
||||
Message bytes are queued for sending in their respective channel, with each channel holding one unsent message at a time.
|
||||
Messages are chosen for a batch one at a time from the channel with the lowest ratio of recently sent bytes to channel priority.
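A minimal sketch of that selection rule, assuming each channel tracks a `recentlySent` byte counter and a positive `priority`. The `channel` struct and `pickChannel` function are illustrative names, not the package's types, and the decay of `recentlySent` over time is not modelled.

```go
package main

import "fmt"

// channel holds only the fields needed to illustrate the selection rule.
type channel struct {
	id           byte
	priority     int   // configured when the connection is initialized
	recentlySent int64 // bytes sent recently
	hasPending   bool  // whether a message is queued for sending
}

// pickChannel returns the channel with pending data that has the lowest
// ratio of recently sent bytes to priority, or nil if nothing is pending.
func pickChannel(chs []*channel) *channel {
	var best *channel
	var bestRatio float64
	for _, ch := range chs {
		if !ch.hasPending || ch.priority <= 0 {
			continue
		}
		ratio := float64(ch.recentlySent) / float64(ch.priority)
		if best == nil || ratio < bestRatio {
			best, bestRatio = ch, ratio
		}
	}
	return best
}

func main() {
	chs := []*channel{
		{id: 0x20, priority: 1, recentlySent: 100, hasPending: true}, // ratio 100
		{id: 0x21, priority: 5, recentlySent: 300, hasPending: true}, // ratio 60
		{id: 0x22, priority: 10, recentlySent: 0, hasPending: false}, // nothing queued
	}
	if ch := pickChannel(chs); ch != nil {
		fmt.Printf("next packet comes from channel %#x\n", ch.id)
	}
}
```

A higher priority lets a channel send more bytes before it loses its turn, which is why channel 0x21 wins here despite having sent more.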
|
||||
|
||||
## Sending Messages
|
||||
|
||||
There are two methods for sending messages:
|
||||
```go
|
||||
func (m MConnection) Send(chID byte, msg interface{}) bool {}
|
||||
func (m MConnection) TrySend(chID byte, msg interface{}) bool {}
|
||||
```
|
||||
|
||||
`Send(chID, msg)` is a blocking call that waits until `msg` is successfully queued
|
||||
for the channel with the given id byte `chID`. The message `msg` is serialized
|
||||
using the `tendermint/wire` submodule's `WriteBinary()` reflection routine.
|
||||
|
||||
`TrySend(chID, msg)` is a nonblocking call that returns false if the channel's
|
||||
queue is full.
|
||||
|
||||
`Send()` and `TrySend()` are also exposed for each `Peer`.
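The difference between the two calls can be illustrated with a buffered Go channel. This is a sketch under assumptions: the `sendQueue` type is hypothetical, serialization is skipped, and the queue capacity of one follows the "one unsent message at a time" description above.

```go
package main

import "fmt"

// sendQueue is a hypothetical stand-in for a channel's outgoing queue.
type sendQueue struct {
	queue chan []byte
}

// send blocks until the message has been queued.
func (q *sendQueue) send(msg []byte) bool {
	q.queue <- msg
	return true
}

// trySend queues the message if there is room and returns false otherwise,
// without blocking.
func (q *sendQueue) trySend(msg []byte) bool {
	select {
	case q.queue <- msg:
		return true
	default:
		return false
	}
}

func main() {
	q := &sendQueue{queue: make(chan []byte, 1)}
	fmt.Println(q.trySend([]byte("first")))  // queue has room
	fmt.Println(q.trySend([]byte("second"))) // queue is full
	<-q.queue                                // the send routine drains one message
	fmt.Println(q.send([]byte("third")))     // room again, returns immediately
}
```
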
|
||||
|
||||
## Peer
|
||||
|
||||
Each peer has one `MConnection` instance, and includes other information such as whether the connection
|
||||
was outbound, whether the connection should be recreated if it closes, various identity information about the node,
|
||||
and other higher level thread-safe data used by the reactors.
|
||||
|
||||
## Switch/Reactor
|
||||
|
||||
The `Switch` handles peer connections and exposes an API to receive incoming messages
|
||||
on `Reactors`. Each `Reactor` is responsible for handling incoming messages of one
|
||||
or more `Channels`. So while sending outgoing messages is typically performed on the peer,
|
||||
incoming messages are received on the reactor.
|
||||
|
||||
```go
|
||||
// Declare a MyReactor reactor that handles messages on MyChannelID.
|
||||
type MyReactor struct{}
|
||||
|
||||
func (reactor MyReactor) GetChannels() []*ChannelDescriptor {
|
||||
return []*ChannelDescriptor{&ChannelDescriptor{ID: MyChannelID, Priority: 1}}
|
||||
}
|
||||
|
||||
func (reactor MyReactor) Receive(chID byte, peer *Peer, msgBytes []byte) {
|
||||
r, n, err := bytes.NewBuffer(msgBytes), new(int64), new(error)
|
||||
msgString := ReadString(r, n, err)
|
||||
fmt.Println(msgString)
|
||||
}
|
||||
|
||||
// Other Reactor methods omitted for brevity
|
||||
...
|
||||
|
||||
// `switch` is a reserved word in Go, so the variable is named `sw`.
sw := NewSwitch([]Reactor{MyReactor{}})

...

// Send a message to all outbound connections
for _, peer := range sw.Peers().List() {
|
||||
if peer.IsOutbound() {
|
||||
peer.Send(MyChannelID, "Here's a random message")
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### PexReactor/AddrBook
|
||||
|
||||
A `PEXReactor` reactor implementation is provided to automate peer discovery.
|
||||
|
||||
```go
|
||||
book := p2p.NewAddrBook(addrBookFilePath)
|
||||
pexReactor := p2p.NewPEXReactor(book)
|
||||
...
|
||||
sw := NewSwitch([]Reactor{pexReactor, myReactor, ...})
|
||||
```
|
|
@ -0,0 +1,53 @@
|
|||
# Tendermint Peer Discovery
|
||||
|
||||
A Tendermint P2P network has different kinds of nodes with different requirements for connectivity to others.
|
||||
This document describes what kind of nodes Tendermint should enable and how they should work.
|
||||
|
||||
## Node startup options
|
||||
- `--p2p.seed_mode`: if present, this node operates in seed mode. It will kick incoming peers after sharing some peers.
- `--p2p.seeds "1.2.3.4:46656,2.3.4.5:4444"`: dials these seeds to get peers, then disconnects.
- `--p2p.persistent_peers "1.2.3.4:46656,2.3.4.5:46656"`: these connections will be auto-redialed. If the seeds and the persistent peers intersect, the user will be warned that seeds may auto-close connections and that the node may not be able to keep the connection persistent.
|
||||
|
||||
## Seeds
|
||||
|
||||
Seeds are the first point of contact for a new node.
|
||||
They return a list of known active peers and disconnect.
|
||||
|
||||
Seeds should operate as full nodes, with the PEX reactor in a "crawler" mode
|
||||
that continuously explores to validate the availability of peers.
|
||||
|
||||
A seed should only respond with some top percentile of the best peers it knows about.
|
||||
|
||||
## New Full Node
|
||||
|
||||
A new node has seeds hardcoded into the software, but they can also be set manually (config file or flags).
|
||||
The new node must also have access to a recent block height, H, and hash, HASH.
|
||||
|
||||
The node then queries some seeds for peers for its chain,
|
||||
dials those peers, and runs the Tendermint protocols with those it successfully connects to.
|
||||
|
||||
When the node catches up to height H, it ensures the block hash matches HASH.
|
||||
|
||||
## Restarted Full Node
|
||||
|
||||
A node checks its address book on startup and attempts to connect to peers from there.
|
||||
If it can't connect to any peers after some time, it falls back to the seeds to find more.
|
||||
|
||||
## Validator Node
|
||||
|
||||
A validator node is a node that interfaces with a validator signing key.
|
||||
These nodes require the highest security, and should not accept incoming connections.
|
||||
They should maintain outgoing connections to a controlled set of "Sentry Nodes" that serve
|
||||
as their proxy shield to the rest of the network.
|
||||
|
||||
Validators that know and trust each other can accept incoming connections from one another and maintain direct private connectivity via VPN.
|
||||
|
||||
## Sentry Node
|
||||
|
||||
Sentry nodes are guardians of a validator node and provide it access to the rest of the network.
|
||||
Sentry nodes may be dynamic, but should maintain persistent connections to some evolving random subset of each other.
|
||||
They should always expect to have direct incoming connections from the validator node and its backup(s).
|
||||
They do not report the validator node's address in the PEX.
|
||||
They may be more strict about the quality of peers they keep.
|
||||
|
||||
Sentry nodes belonging to validators that trust each other may wish to maintain persistent connections via VPN with one another, but only report each other sparingly in the PEX.
|
|
@ -0,0 +1,105 @@
|
|||
# Tendermint Peers
|
||||
|
||||
This document explains how Tendermint Peers are identified, how they connect to one another,
|
||||
and how other peers are found.
|
||||
|
||||
## Peer Identity
|
||||
|
||||
Tendermint peers are expected to maintain long-term persistent identities in the form of a private key.
|
||||
Each peer has an ID defined as `peer.ID == peer.PrivKey.Address()`, where `Address` uses the scheme defined in go-crypto.
|
||||
|
||||
Peer IDs must come with some Proof-of-Work; that is,
|
||||
they must satisfy `peer.PrivKey.Address() < target` for some difficulty target.
|
||||
This ensures they are not too easy to generate.
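The target check can be sketched as a byte-wise comparison. This assumes the address and the target are byte strings of equal length, compared as big-endian numbers; the `meetsTarget` helper and the example target are hypothetical, since the document does not fix a difficulty.

```go
package main

import (
	"bytes"
	"fmt"
)

// meetsTarget reports whether addr, read as a big-endian number, is strictly
// below target. For equal-length slices, lexicographic byte order matches
// big-endian numeric order.
func meetsTarget(addr, target []byte) bool {
	return len(addr) == len(target) && bytes.Compare(addr, target) < 0
}

func main() {
	// Hypothetical 20-byte target: the first byte of the address must be zero.
	target := make([]byte, 20)
	target[0] = 0x01

	ok := make([]byte, 20)
	ok[19] = 0xff // 0x00...ff is below the target

	tooHigh := make([]byte, 20)
	tooHigh[0] = 0x7f // first byte is non-zero

	fmt.Println(meetsTarget(ok, target))
	fmt.Println(meetsTarget(tooHigh, target))
}
```

A node would generate keys until the resulting address satisfies the check; a lower target means more attempts on average.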
|
||||
|
||||
A single peer ID can have multiple IP addresses associated with it; for simplicity, we only keep track
of the latest one.
|
||||
|
||||
When attempting to connect to a peer, we use the PeerURL: `<ID>@<IP>:<PORT>`.
|
||||
We will attempt to connect to the peer at IP:PORT, and verify,
|
||||
via authenticated encryption, that it is in possession of the private key
|
||||
corresponding to `<ID>`. This prevents man-in-the-middle attacks on the peer layer.
|
||||
|
||||
Peers can also be connected to without specifying an ID, i.e. `<IP>:<PORT>`.
|
||||
In this case, the peer cannot be authenticated and other means, such as a VPN,
|
||||
must be used.
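Both address forms can be parsed with the standard library. The sketch below is illustrative only: `peerAddr` and `parsePeerURL` are hypothetical names, and the ID is kept as an opaque string because its encoding is not specified here.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// peerAddr is a hypothetical parsed form of `<ID>@<IP>:<PORT>`.
type peerAddr struct {
	ID   string // empty when the peer cannot be authenticated
	Host string
	Port string
}

// parsePeerURL accepts `<ID>@<IP>:<PORT>` or the unauthenticated `<IP>:<PORT>`.
func parsePeerURL(s string) (peerAddr, error) {
	var pa peerAddr
	hostPort := s
	if i := strings.Index(s, "@"); i >= 0 {
		pa.ID, hostPort = s[:i], s[i+1:]
		if pa.ID == "" {
			return peerAddr{}, fmt.Errorf("empty peer ID in %q", s)
		}
	}
	host, port, err := net.SplitHostPort(hostPort)
	if err != nil {
		return peerAddr{}, err
	}
	pa.Host, pa.Port = host, port
	return pa, nil
}

func main() {
	for _, s := range []string{"deadbeef@1.2.3.4:46656", "1.2.3.4:46656", "no-port"} {
		pa, err := parsePeerURL(s)
		fmt.Printf("%q -> %+v, err=%v\n", s, pa, err)
	}
}
```

A caller would treat an empty `ID` as the signal that the connection needs to be protected by other means, such as a VPN.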
|
||||
|
||||
## Connections
|
||||
|
||||
All p2p connections use TCP.
|
||||
Upon establishing a successful TCP connection with a peer,
|
||||
two handshakes are performed: one for authenticated encryption, and one for Tendermint versioning.
|
||||
Both handshakes have configurable timeouts (they should complete quickly).
|
||||
|
||||
### Authenticated Encryption Handshake
|
||||
|
||||
Tendermint implements the Station-to-Station protocol
|
||||
using ED25519 keys for Diffie-Hellman key-exchange and NaCl SecretBox for encryption.
|
||||
It goes as follows:
|
||||
- generate an ephemeral ED25519 keypair
|
||||
- send the ephemeral public key to the peer
|
||||
- wait to receive the peer's ephemeral public key
|
||||
- compute the Diffie-Hellman shared secret using the peer's ephemeral public key and our ephemeral private key
|
||||
- generate nonces to use for encryption
|
||||
- TODO
|
||||
- all communications from now on are encrypted using the shared secret
|
||||
- generate a common challenge to sign
|
||||
- sign the common challenge with our persistent private key
|
||||
- send the signed challenge and persistent public key to the peer
|
||||
- wait to receive the signed challenge and persistent public key from the peer
|
||||
- verify the signature in the signed challenge using the peer's persistent public key
|
||||
|
||||
|
||||
If this is an outgoing connection (we dialed the peer) and we used a peer ID,
|
||||
then finally verify that the `peer.PubKey` corresponds to the peer ID we dialed,
|
||||
i.e. `peer.PubKey.Address() == <ID>`.
|
||||
|
||||
The connection has now been authenticated. All traffic is encrypted.
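The key-exchange and challenge-signing steps can be sketched with the Go standard library. This is not Tendermint's implementation. It makes several assumptions: X25519 from `crypto/ecdh` stands in for the Diffie-Hellman step, the common challenge is taken to be a SHA-256 hash of the two ephemeral public keys in sorted order, and nonce generation and SecretBox encryption are omitted. Both sides run in one process for brevity.

```go
package main

import (
	"bytes"
	"crypto/ecdh"
	"crypto/ed25519"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// party holds one side's persistent identity key and its ephemeral key.
type party struct {
	idPub   ed25519.PublicKey
	idPriv  ed25519.PrivateKey
	ephPriv *ecdh.PrivateKey
}

func newParty() (*party, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	eph, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	return &party{idPub: pub, idPriv: priv, ephPriv: eph}, nil
}

// challenge derives a value both sides can compute independently. Hashing
// the sorted ephemeral public keys is an assumption made for this sketch.
func challenge(ephA, ephB []byte) []byte {
	lo, hi := ephA, ephB
	if bytes.Compare(lo, hi) > 0 {
		lo, hi = hi, lo
	}
	h := sha256.Sum256(append(append([]byte{}, lo...), hi...))
	return h[:]
}

func main() {
	alice, err := newParty()
	if err != nil {
		panic(err)
	}
	bob, err := newParty()
	if err != nil {
		panic(err)
	}

	// Each side combines its ephemeral private key with the peer's
	// ephemeral public key and arrives at the same shared secret.
	secretA, err := alice.ephPriv.ECDH(bob.ephPriv.PublicKey())
	if err != nil {
		panic(err)
	}
	secretB, err := bob.ephPriv.ECDH(alice.ephPriv.PublicKey())
	if err != nil {
		panic(err)
	}
	fmt.Println("shared secrets match:", bytes.Equal(secretA, secretB))

	// Bob signs the common challenge with his persistent key; Alice verifies
	// the signature against the persistent public key Bob sent.
	c := challenge(alice.ephPriv.PublicKey().Bytes(), bob.ephPriv.PublicKey().Bytes())
	sig := ed25519.Sign(bob.idPriv, c)
	fmt.Println("signature verifies:", ed25519.Verify(bob.idPub, c, sig))
}
```

On an outgoing connection dialed with a peer ID, the dialer would additionally compare the address derived from the verified public key with that ID.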
|
||||
|
||||
Note that only the dialer can authenticate the identity of the peer,
|
||||
but this is what we care about since when we join the network we wish to
|
||||
ensure we have reached the intended peer (and are not being MITMd).
|
||||
|
||||
|
||||
### Peer Filter
|
||||
|
||||
Before continuing, we check if the new peer has the same ID as ourselves or
|
||||
an existing peer. If so, we disconnect.
|
||||
|
||||
We also check the peer's address and public key against
|
||||
an optional whitelist which can be managed through the ABCI app -
|
||||
if the whitelist is enabled and the peer is not on it, the connection is
|
||||
terminated.
|
||||
|
||||
|
||||
### Tendermint Version Handshake
|
||||
|
||||
The Tendermint Version Handshake allows the peers to exchange their NodeInfo, which contains:
|
||||
|
||||
```
|
||||
type NodeInfo struct {
|
||||
PubKey crypto.PubKey `json:"pub_key"`
|
||||
Moniker string `json:"moniker"`
|
||||
Network string `json:"network"`
|
||||
RemoteAddr string `json:"remote_addr"`
|
||||
ListenAddr string `json:"listen_addr"` // accepting in
|
||||
Version string `json:"version"` // major.minor.revision
|
||||
Channels []int8 `json:"channels"` // active reactor channels
|
||||
Other []string `json:"other"` // other application specific data
|
||||
}
|
||||
```
|
||||
|
||||
The connection is disconnected if:
|
||||
- `peer.NodeInfo.PubKey != peer.PubKey`
|
||||
- `peer.NodeInfo.Version` is not formatted as `X.X.X` where X are integers known as Major, Minor, and Revision
|
||||
- `peer.NodeInfo.Version` Major is not the same as ours
|
||||
- `peer.NodeInfo.Version` Minor is not the same as ours
|
||||
- `peer.NodeInfo.Network` is not the same as ours
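The version and network rules above can be sketched as a single check. The function names and signatures below are hypothetical, and the public key comparison is left out because it depends on the `crypto.PubKey` type.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits a version formatted as X.X.X into Major, Minor, and
// Revision, rejecting anything that is not three non-negative integers.
func parseVersion(v string) (int, int, int, error) {
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return 0, 0, 0, fmt.Errorf("version %q is not formatted as X.X.X", v)
	}
	nums := make([]int, 3)
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return 0, 0, 0, fmt.Errorf("version %q has a non-integer part %q", v, p)
		}
		nums[i] = n
	}
	return nums[0], nums[1], nums[2], nil
}

// checkCompatible returns an error for any condition that would cause the
// connection to be dropped; the revision number is allowed to differ.
func checkCompatible(ourVersion, ourNetwork, theirVersion, theirNetwork string) error {
	ourMajor, ourMinor, _, err := parseVersion(ourVersion)
	if err != nil {
		return err
	}
	theirMajor, theirMinor, _, err := parseVersion(theirVersion)
	if err != nil {
		return err
	}
	if theirMajor != ourMajor {
		return fmt.Errorf("major version mismatch: %d vs %d", theirMajor, ourMajor)
	}
	if theirMinor != ourMinor {
		return fmt.Errorf("minor version mismatch: %d vs %d", theirMinor, ourMinor)
	}
	if theirNetwork != ourNetwork {
		return fmt.Errorf("network mismatch: %q vs %q", theirNetwork, ourNetwork)
	}
	return nil
}

func main() {
	fmt.Println(checkCompatible("0.13.0", "chain-A", "0.13.2", "chain-A"))
	fmt.Println(checkCompatible("0.13.0", "chain-A", "0.14.0", "chain-A"))
	fmt.Println(checkCompatible("0.13.0", "chain-A", "0.13.0", "chain-B"))
}
```
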
|
||||
|
||||
|
||||
At this point, if we have not disconnected, the peer is valid and added to the switch,
|
||||
so it is added to all reactors.
|
||||
|
||||
|
||||
### Connection Activity
|
||||
|
|
@ -0,0 +1,23 @@
|
|||
|
||||
# Peer Strategy
|
||||
|
||||
Peers are managed using an address book and a trust metric.
|
||||
The book keeps a record of vetted peers and unvetted peers.
|
||||
When we need more peers, we pick them randomly from the addrbook with some
|
||||
configurable bias for unvetted peers. When we’re asked for peers, we provide a random selection with no bias.
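Biased selection could look like the sketch below. It assumes the bias is a percentage between 0 and 100 giving the chance of drawing from the unvetted set. The `addrBook` struct, `pickAddress`, and `newRNG` are hypothetical names that do not reflect the real address book's layout.

```go
package main

import (
	"fmt"
	"math/rand"
)

// addrBook is a simplified stand-in holding two sets of peer addresses.
type addrBook struct {
	vetted   []string
	unvetted []string
}

// newRNG returns a seeded random source so that runs are reproducible.
func newRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// pickAddress chooses a random address. With probability
// biasTowardsUnvetted percent it draws from the unvetted set, otherwise from
// the vetted set, falling back to the other set when the chosen one is empty.
func (b *addrBook) pickAddress(rng *rand.Rand, biasTowardsUnvetted int) (string, bool) {
	if len(b.vetted) == 0 && len(b.unvetted) == 0 {
		return "", false
	}
	bucket := b.vetted
	useUnvetted := rng.Intn(100) < biasTowardsUnvetted
	if (useUnvetted && len(b.unvetted) > 0) || len(b.vetted) == 0 {
		bucket = b.unvetted
	}
	return bucket[rng.Intn(len(bucket))], true
}

func main() {
	book := &addrBook{
		vetted:   []string{"1.1.1.1:46656", "2.2.2.2:46656"},
		unvetted: []string{"3.3.3.3:46656"},
	}
	rng := newRNG(1)
	for i := 0; i < 5; i++ {
		addr, _ := book.pickAddress(rng, 30)
		fmt.Println(addr)
	}
}
```

Serving a peer request with no bias would instead sample uniformly from the union of both sets.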
|
||||
|
||||
The trust metric tracks the quality of the peers.
|
||||
When a peer exceeds a certain quality for a certain amount of time,
|
||||
it is marked as vetted in the addrbook.
|
||||
If a vetted peer's quality degrades sufficiently, it is booted, and must prove itself from scratch.
|
||||
If we need to make room for a new vetted peer, we move the lowest scoring vetted peer back to unvetted.
|
||||
If we need to make room for a new unvetted peer, we remove the lowest scoring unvetted peer,
possibly only if it is below some absolute minimum?
|
||||
|
||||
Peer quality is tracked in the connection and across the reactors.
|
||||
Behaviours are defined as one of:
|
||||
- fatal - something outright malicious. We should disconnect and remember them.
- bad - any kind of timeout, msgs that don't unmarshal or fail other validity checks, or msgs we didn't ask for or aren't expecting
- neutral - normal correct behaviour. Unknown channels/msg types (version upgrades).
- good - some random majority of peers per reactor sending us useful messages
|
||||
|