tendermint/consensus/state_test.go

package consensus
import (
"bytes"
"context"
"fmt"
"testing"
"time"
"github.com/stretchr/testify/require"
cstypes "github.com/tendermint/tendermint/consensus/types"
cmn "github.com/tendermint/tendermint/libs/common"
"github.com/tendermint/tendermint/libs/log"
tmpubsub "github.com/tendermint/tendermint/libs/pubsub"
p2pdummy "github.com/tendermint/tendermint/p2p/dummy"
"github.com/tendermint/tendermint/types"
)
func init() {
config = ResetConfig("consensus_state_test")
}
func ensureProposeTimeout(timeoutPropose int) time.Duration {
return time.Duration(timeoutPropose*2) * time.Millisecond
}
/*
ProposeSuite
x * TestProposerSelection0 - round robin ordering, round 0
x * TestProposerSelection2 - round robin ordering, round 2++
x * TestEnterProposeNoValidator - timeout into prevote round
x * TestEnterPropose - finish propose without timing out (we have the proposal)
x * TestBadProposal - 2 vals, bad proposal (bad block state hash), should prevote and precommit nil
FullRoundSuite
x * TestFullRound1 - 1 val, full successful round
x * TestFullRoundNil - 1 val, full round of nil
x * TestFullRound2 - 2 vals, both required for full round
LockSuite
x * TestLockNoPOL - 2 vals, 4 rounds. one val locked, precommits nil every round except first.
x * TestLockPOLRelock - 4 vals, one precommits, other 3 polka at next round, so we unlock and precommit the polka
x * TestLockPOLUnlock - 4 vals, one precommits, other 3 polka nil at next round, so we unlock and precommit nil
x * TestLockPOLSafety1 - 4 vals. We shouldn't change lock based on polka at earlier round
x * TestLockPOLSafety2 - 4 vals. After unlocking, we shouldn't relock based on polka at earlier round
* TestNetworkLock - once +1/3 precommits, network should be locked
* TestNetworkLockPOL - once +1/3 precommits, the block with more recent polka is committed
SlashingSuite
x * TestSlashingPrevotes - a validator prevoting twice in a round gets slashed
x * TestSlashingPrecommits - a validator precommitting twice in a round gets slashed
CatchupSuite
* TestCatchup - if we might be behind and we've seen any 2/3 prevotes, round skip to new round, precommit, or prevote
HaltSuite
x * TestHalt1 - if we see +2/3 precommits after timing out into new round, we should still commit
*/
//----------------------------------------------------------------------------------------------------
// ProposeSuite
func TestStateProposerSelection0(t *testing.T) {
cs1, vss := randConsensusState(4)
height, round := cs1.Height, cs1.Round
newRoundCh := subscribe(cs1.eventBus, types.EventQueryNewRound)
proposalCh := subscribe(cs1.eventBus, types.EventQueryCompleteProposal)
startTestRound(cs1, height, round)
// wait for new round so proposer is set
<-newRoundCh
// let's commit a block and ensure proposer for the next height is correct
prop := cs1.GetRoundState().Validators.GetProposer()
if !bytes.Equal(prop.Address, cs1.privValidator.GetAddress()) {
t.Fatalf("expected proposer to be validator %d. Got %X", 0, prop.Address)
}
// wait for complete proposal
<-proposalCh
rs := cs1.GetRoundState()
signAddVotes(cs1, types.VoteTypePrecommit, rs.ProposalBlock.Hash(), rs.ProposalBlockParts.Header(), vss[1:]...)
// wait for new round so next validator is set
<-newRoundCh
prop = cs1.GetRoundState().Validators.GetProposer()
if !bytes.Equal(prop.Address, vss[1].GetAddress()) {
panic(fmt.Sprintf("expected proposer to be validator %d. Got %X", 1, prop.Address))
}
}
// Now let's do it all again, but starting from round 2 instead of 0
func TestStateProposerSelection2(t *testing.T) {
cs1, vss := randConsensusState(4) // test needs more work for more than 3 validators
newRoundCh := subscribe(cs1.eventBus, types.EventQueryNewRound)
// this time we jump in at round 2
incrementRound(vss[1:]...)
incrementRound(vss[1:]...)
startTestRound(cs1, cs1.Height, 2)
<-newRoundCh // wait for the new round
// everyone just votes nil. we get a new proposer each round
for i := 0; i < len(vss); i++ {
prop := cs1.GetRoundState().Validators.GetProposer()
correctProposer := vss[(i+2)%len(vss)].GetAddress()
if !bytes.Equal(prop.Address, correctProposer) {
panic(fmt.Sprintf("expected RoundState.Validators.GetProposer() to be validator %d. Got %X", (i+2)%len(vss), prop.Address))
}
rs := cs1.GetRoundState()
signAddVotes(cs1, types.VoteTypePrecommit, nil, rs.ProposalBlockParts.Header(), vss[1:]...)
<-newRoundCh // wait for the new round event each round
incrementRound(vss[1:]...)
}
}
// a non-validator should timeout into the prevote round
func TestStateEnterProposeNoPrivValidator(t *testing.T) {
cs, _ := randConsensusState(1)
cs.SetPrivValidator(nil)
height, round := cs.Height, cs.Round
// Listen for propose timeout event
timeoutCh := subscribe(cs.eventBus, types.EventQueryTimeoutPropose)
startTestRound(cs, height, round)
// if we're not a validator, EnterPropose should timeout
ticker := time.NewTicker(ensureProposeTimeout(cs.config.TimeoutPropose))
select {
case <-timeoutCh:
case <-ticker.C:
panic("Expected EnterPropose to timeout")
}
if cs.GetRoundState().Proposal != nil {
t.Error("Expected to make no proposal, since no privValidator")
}
}
// a validator should not time out into the prevote round (TODO: unless the block is really big!)
func TestStateEnterProposeYesPrivValidator(t *testing.T) {
cs, _ := randConsensusState(1)
height, round := cs.Height, cs.Round
// Listen for propose timeout event
timeoutCh := subscribe(cs.eventBus, types.EventQueryTimeoutPropose)
proposalCh := subscribe(cs.eventBus, types.EventQueryCompleteProposal)
cs.enterNewRound(height, round)
cs.startRoutines(3)
<-proposalCh
// Check that Proposal, ProposalBlock, ProposalBlockParts are set.
rs := cs.GetRoundState()
if rs.Proposal == nil {
t.Error("rs.Proposal should be set")
}
if rs.ProposalBlock == nil {
t.Error("rs.ProposalBlock should be set")
}
if rs.ProposalBlockParts.Total() == 0 {
t.Error("rs.ProposalBlockParts should be set")
}
// if we're a validator, enterPropose should not timeout
ticker := time.NewTicker(ensureProposeTimeout(cs.config.TimeoutPropose))
select {
case <-timeoutCh:
panic("Expected EnterPropose not to timeout")
case <-ticker.C:
}
}
func TestStateBadProposal(t *testing.T) {
cs1, vss := randConsensusState(2)
height, round := cs1.Height, cs1.Round
vs2 := vss[1]
partSize := types.BlockPartSizeBytes
proposalCh := subscribe(cs1.eventBus, types.EventQueryCompleteProposal)
voteCh := subscribe(cs1.eventBus, types.EventQueryVote)
propBlock, _ := cs1.createProposalBlock() //changeProposer(t, cs1, vs2)
// make the second validator the proposer by incrementing round
round = round + 1
incrementRound(vss[1:]...)
// make the block bad by tampering with statehash
stateHash := propBlock.AppHash
if len(stateHash) == 0 {
stateHash = make([]byte, 32)
}
stateHash[0] = byte((stateHash[0] + 1) % 255)
propBlock.AppHash = stateHash
propBlockParts := propBlock.MakePartSet(partSize)
proposal := types.NewProposal(vs2.Height, round, propBlockParts.Header(), -1, types.BlockID{})
if err := vs2.SignProposal(config.ChainID(), proposal); err != nil {
t.Fatal("failed to sign bad proposal", err)
}
// set the proposal block
if err := cs1.SetProposalAndBlock(proposal, propBlock, propBlockParts, "some peer"); err != nil {
t.Fatal(err)
}
// start the machine
startTestRound(cs1, height, round)
// wait for proposal
<-proposalCh
// wait for prevote
<-voteCh
validatePrevote(t, cs1, round, vss[0], nil)
// add bad prevote from vs2 and wait for it
signAddVotes(cs1, types.VoteTypePrevote, propBlock.Hash(), propBlock.MakePartSet(partSize).Header(), vs2)
<-voteCh
// wait for precommit
<-voteCh
validatePrecommit(t, cs1, round, 0, vss[0], nil, nil)
signAddVotes(cs1, types.VoteTypePrecommit, propBlock.Hash(), propBlock.MakePartSet(partSize).Header(), vs2)
}
//----------------------------------------------------------------------------------------------------
// FullRoundSuite
// propose, prevote, and precommit a block
func TestStateFullRound1(t *testing.T) {
cs, vss := randConsensusState(1)
height, round := cs.Height, cs.Round
	// NOTE: buffer capacity of 0 ensures we can validate prevote and last commit
	// before consensus can move to the next height (and cause a race condition)
	cs.eventBus.Stop()
	eventBus := types.NewEventBusWithBufferCapacity(0)
	eventBus.SetLogger(log.TestingLogger().With("module", "events"))
	cs.SetEventBus(eventBus)
	eventBus.Start()

	voteCh := subscribe(cs.eventBus, types.EventQueryVote)
	propCh := subscribe(cs.eventBus, types.EventQueryCompleteProposal)
	newRoundCh := subscribe(cs.eventBus, types.EventQueryNewRound)

	startTestRound(cs, height, round)

	<-newRoundCh

	// grab proposal
	re := <-propCh
	propBlockHash := re.(types.EventDataRoundState).RoundState.(*cstypes.RoundState).ProposalBlock.Hash()

	<-voteCh // wait for prevote
	validatePrevote(t, cs, round, vss[0], propBlockHash)

	<-voteCh // wait for precommit

	// we're going to roll right into new height
	<-newRoundCh

	validateLastPrecommit(t, cs, vss[0], propBlockHash)
}

// nil is proposed, so prevote and precommit nil
func TestStateFullRoundNil(t *testing.T) {
	cs, vss := randConsensusState(1)
	height, round := cs.Height, cs.Round

	voteCh := subscribe(cs.eventBus, types.EventQueryVote)

	cs.enterPrevote(height, round)
	cs.startRoutines(4)

	<-voteCh // prevote
	<-voteCh // precommit

	// should prevote and precommit nil
	validatePrevoteAndPrecommit(t, cs, round, 0, vss[0], nil, nil)
}

// run through propose, prevote, precommit, commit with two validators
// where the first validator has to wait for votes from the second
func TestStateFullRound2(t *testing.T) {
	cs1, vss := randConsensusState(2)
	vs2 := vss[1]
	height, round := cs1.Height, cs1.Round

	voteCh := subscribe(cs1.eventBus, types.EventQueryVote)
	newBlockCh := subscribe(cs1.eventBus, types.EventQueryNewBlock)

	// start round and wait for propose and prevote
	startTestRound(cs1, height, round)

	<-voteCh // prevote

	// we should be stuck in limbo waiting for more prevotes
	rs := cs1.GetRoundState()
	propBlockHash, propPartsHeader := rs.ProposalBlock.Hash(), rs.ProposalBlockParts.Header()

	// prevote arrives from vs2:
	signAddVotes(cs1, types.VoteTypePrevote, propBlockHash, propPartsHeader, vs2)
	<-voteCh

	<-voteCh // precommit

	// the proposed block should now be locked and our precommit added
	validatePrecommit(t, cs1, 0, 0, vss[0], propBlockHash, propBlockHash)

	// we should be stuck in limbo waiting for more precommits

	// precommit arrives from vs2:
	signAddVotes(cs1, types.VoteTypePrecommit, propBlockHash, propPartsHeader, vs2)
	<-voteCh

	// wait to finish commit, propose in next height
	<-newBlockCh
}

//------------------------------------------------------------------------------------------
// LockSuite

// two validators, 4 rounds.
// two vals take turns proposing. val1 locks on first one, precommits nil on everything else
func TestStateLockNoPOL(t *testing.T) {
	cs1, vss := randConsensusState(2)
	vs2 := vss[1]
	height := cs1.Height

	partSize := types.BlockPartSizeBytes

	timeoutProposeCh := subscribe(cs1.eventBus, types.EventQueryTimeoutPropose)
	timeoutWaitCh := subscribe(cs1.eventBus, types.EventQueryTimeoutWait)
	voteCh := subscribe(cs1.eventBus, types.EventQueryVote)
	proposalCh := subscribe(cs1.eventBus, types.EventQueryCompleteProposal)
	newRoundCh := subscribe(cs1.eventBus, types.EventQueryNewRound)

	/*
		Round1 (cs1, B) // B B // B B2
	*/

	// start round and wait for prevote
	cs1.enterNewRound(height, 0)
	cs1.startRoutines(0)

	re := <-proposalCh
	rs := re.(types.EventDataRoundState).RoundState.(*cstypes.RoundState)
	theBlockHash := rs.ProposalBlock.Hash()

	<-voteCh // prevote

	// we should now be stuck in limbo forever, waiting for more prevotes
	// prevote arrives from vs2:
	signAddVotes(cs1, types.VoteTypePrevote, cs1.ProposalBlock.Hash(), cs1.ProposalBlockParts.Header(), vs2)
	<-voteCh // prevote

	<-voteCh // precommit

	// the proposed block should now be locked and our precommit added
	validatePrecommit(t, cs1, 0, 0, vss[0], theBlockHash, theBlockHash)

	// we should now be stuck in limbo forever, waiting for more precommits
	// let's add one for a different block
	// NOTE: in practice we should never get to a point where there are precommits for different blocks at the same round
	hash := make([]byte, len(theBlockHash))
	copy(hash, theBlockHash)
	hash[0] = byte((hash[0] + 1) % 255)
	signAddVotes(cs1, types.VoteTypePrecommit, hash, rs.ProposalBlock.MakePartSet(partSize).Header(), vs2)
	<-voteCh // precommit

	// (note we're entering precommit for a second time this round)
	// but with invalid args. then we enterPrecommitWait, and the timeout to new round
	<-timeoutWaitCh

	///

	<-newRoundCh
	t.Log("#### ONTO ROUND 1")

	/*
		Round2 (cs1, B) // B B2
	*/

	incrementRound(vs2)

	// now we're on a new round and not the proposer, so wait for timeout
	re = <-timeoutProposeCh
	rs = re.(types.EventDataRoundState).RoundState.(*cstypes.RoundState)

	if rs.ProposalBlock != nil {
		panic("Expected proposal block to be nil")
	}

	// wait to finish prevote
	<-voteCh

	// we should have prevoted our locked block
	validatePrevote(t, cs1, 1, vss[0], rs.LockedBlock.Hash())

	// add a conflicting prevote from the other validator
	signAddVotes(cs1, types.VoteTypePrevote, hash, rs.LockedBlock.MakePartSet(partSize).Header(), vs2)
	<-voteCh

	// now we're going to enter prevote again, but with invalid args
	// and then prevote wait, which should time out. then wait for precommit
	<-timeoutWaitCh

	<-voteCh // precommit

	// the proposed block should still be locked and our precommit added
	// we should precommit nil and be locked on the proposal
	validatePrecommit(t, cs1, 1, 0, vss[0], nil, theBlockHash)

	// add conflicting precommit from vs2
	// NOTE: in practice we should never get to a point where there are precommits for different blocks at the same round
	signAddVotes(cs1, types.VoteTypePrecommit, hash, rs.LockedBlock.MakePartSet(partSize).Header(), vs2)
	<-voteCh

	// (note we're entering precommit for a second time this round, but with invalid args
	// then we enterPrecommitWait and timeout into NewRound)
	<-timeoutWaitCh

	<-newRoundCh
	t.Log("#### ONTO ROUND 2")

	/*
		Round3 (vs2, _) // B, B2
	*/

	incrementRound(vs2)

	re = <-proposalCh
	rs = re.(types.EventDataRoundState).RoundState.(*cstypes.RoundState)

	// now we're on a new round and are the proposer
	if !bytes.Equal(rs.ProposalBlock.Hash(), rs.LockedBlock.Hash()) {
		panic(fmt.Sprintf("Expected proposal block to be locked block. Got %v, Expected %v", rs.ProposalBlock, rs.LockedBlock))
	}

	<-voteCh // prevote

	validatePrevote(t, cs1, 2, vss[0], rs.LockedBlock.Hash())

	signAddVotes(cs1, types.VoteTypePrevote, hash, rs.ProposalBlock.MakePartSet(partSize).Header(), vs2)
	<-voteCh

	<-timeoutWaitCh // prevote wait
	<-voteCh        // precommit

	validatePrecommit(t, cs1, 2, 0, vss[0], nil, theBlockHash) // precommit nil but be locked on proposal

	signAddVotes(cs1, types.VoteTypePrecommit, hash, rs.ProposalBlock.MakePartSet(partSize).Header(), vs2) // NOTE: conflicting precommits at same height
	<-voteCh

	<-timeoutWaitCh

	// before we time out into new round, set next proposal block
	prop, propBlock := decideProposal(cs1, vs2, vs2.Height, vs2.Round+1)
	if prop == nil || propBlock == nil {
		t.Fatal("Failed to create proposal block with vs2")
	}

	incrementRound(vs2)

	<-newRoundCh
	t.Log("#### ONTO ROUND 3")

	/*
		Round4 (vs2, C) // B C // B C
	*/

	// now we're on a new round and not the proposer
	// so set the proposal block
	if err := cs1.SetProposalAndBlock(prop, propBlock, propBlock.MakePartSet(partSize), ""); err != nil {
		t.Fatal(err)
	}

	<-proposalCh
	<-voteCh // prevote

	// prevote for locked block (not proposal)
	validatePrevote(t, cs1, 0, vss[0], cs1.LockedBlock.Hash())

	signAddVotes(cs1, types.VoteTypePrevote, propBlock.Hash(), propBlock.MakePartSet(partSize).Header(), vs2)
	<-voteCh

	<-timeoutWaitCh
	<-voteCh

	validatePrecommit(t, cs1, 2, 0, vss[0], nil, theBlockHash) // precommit nil but locked on proposal

	signAddVotes(cs1, types.VoteTypePrecommit, propBlock.Hash(), propBlock.MakePartSet(partSize).Header(), vs2) // NOTE: conflicting precommits at same height
	<-voteCh
}

// 4 vals, one precommits, other 3 polka at next round, so we unlock and precommit the polka
func TestStateLockPOLRelock(t *testing.T) {
	cs1, vss := randConsensusState(4)
	vs2, vs3, vs4 := vss[1], vss[2], vss[3]

	partSize := types.BlockPartSizeBytes

	timeoutProposeCh := subscribe(cs1.eventBus, types.EventQueryTimeoutPropose)
	timeoutWaitCh := subscribe(cs1.eventBus, types.EventQueryTimeoutWait)
	proposalCh := subscribe(cs1.eventBus, types.EventQueryCompleteProposal)
	voteCh := subscribe(cs1.eventBus, types.EventQueryVote)
	newRoundCh := subscribe(cs1.eventBus, types.EventQueryNewRound)
	newBlockCh := subscribe(cs1.eventBus, types.EventQueryNewBlockHeader)

	// everything done from perspective of cs1

	/*
		Round1 (cs1, B) // B B B B // B nil B nil

		e.g. vs2 and vs4 didn't see the 2/3 prevotes
	*/

	// start round and wait for propose and prevote
	startTestRound(cs1, cs1.Height, 0)

	<-newRoundCh
	re := <-proposalCh
	rs := re.(types.EventDataRoundState).RoundState.(*cstypes.RoundState)
	theBlockHash := rs.ProposalBlock.Hash()

	<-voteCh // prevote

	signAddVotes(cs1, types.VoteTypePrevote, cs1.ProposalBlock.Hash(), cs1.ProposalBlockParts.Header(), vs2, vs3, vs4)
	// prevotes
	discardFromChan(voteCh, 3)

	<-voteCh // our precommit
	// the proposed block should now be locked and our precommit added
	validatePrecommit(t, cs1, 0, 0, vss[0], theBlockHash, theBlockHash)

	// add precommits from the rest
	signAddVotes(cs1, types.VoteTypePrecommit, nil, types.PartSetHeader{}, vs2, vs4)
	signAddVotes(cs1, types.VoteTypePrecommit, cs1.ProposalBlock.Hash(), cs1.ProposalBlockParts.Header(), vs3)
	// precommits
	discardFromChan(voteCh, 3)

	// before we time out to the new round, set the new proposal
	prop, propBlock := decideProposal(cs1, vs2, vs2.Height, vs2.Round+1)
	propBlockParts := propBlock.MakePartSet(partSize)
	propBlockHash := propBlock.Hash()

	incrementRound(vs2, vs3, vs4)

	// timeout to new round
	<-timeoutWaitCh

	// XXX: this isn't guaranteed to get there before the timeoutPropose ...
	if err := cs1.SetProposalAndBlock(prop, propBlock, propBlockParts, "some peer"); err != nil {
		t.Fatal(err)
	}

	<-newRoundCh
	t.Log("### ONTO ROUND 1")

	/*
		Round2 (vs2, C) // B C C C // C C C _)

		cs1 changes lock!
	*/

	// now we're on a new round and not the proposer
	// but we should receive the proposal
	select {
	case <-proposalCh:
	case <-timeoutProposeCh:
		<-proposalCh
	}

	// go to prevote, prevote for locked block (not proposal), move on
	<-voteCh
	validatePrevote(t, cs1, 0, vss[0], theBlockHash)

	// now let's add prevotes from everyone else for the new block
	signAddVotes(cs1, types.VoteTypePrevote, propBlockHash, propBlockParts.Header(), vs2, vs3, vs4)
	// prevotes
	discardFromChan(voteCh, 3)

	// now either we go to PrevoteWait or Precommit
	select {
	case <-timeoutWaitCh: // we're in PrevoteWait, go to Precommit
		// XXX: there's no guarantee we see the polka, this might be a precommit for nil,
		// in which case the test fails!
		<-voteCh
	case <-voteCh: // we went straight to Precommit
	}

	// we should have unlocked and locked on the new block
	validatePrecommit(t, cs1, 1, 1, vss[0], propBlockHash, propBlockHash)

	signAddVotes(cs1, types.VoteTypePrecommit, propBlockHash, propBlockParts.Header(), vs2, vs3)
	discardFromChan(voteCh, 2)

	be := <-newBlockCh
	b := be.(types.EventDataNewBlockHeader)
	re = <-newRoundCh
	rs = re.(types.EventDataRoundState).RoundState.(*cstypes.RoundState)
	if rs.Height != 2 {
		panic("Expected height to increment")
	}

	if !bytes.Equal(b.Header.Hash(), propBlockHash) {
		panic("Expected new block to be proposal block")
	}
}


// 4 vals, one precommits, other 3 polka at next round, so we unlock and precommit the polka
func TestStateLockPOLUnlock(t *testing.T) {
	cs1, vss := randConsensusState(4)
	vs2, vs3, vs4 := vss[1], vss[2], vss[3]

	partSize := types.BlockPartSizeBytes

	proposalCh := subscribe(cs1.eventBus, types.EventQueryCompleteProposal)
	timeoutProposeCh := subscribe(cs1.eventBus, types.EventQueryTimeoutPropose)
	timeoutWaitCh := subscribe(cs1.eventBus, types.EventQueryTimeoutWait)
	newRoundCh := subscribe(cs1.eventBus, types.EventQueryNewRound)
	unlockCh := subscribe(cs1.eventBus, types.EventQueryUnlock)
	voteCh := subscribeToVoter(cs1, cs1.privValidator.GetAddress())

	// everything done from perspective of cs1

	/*
		Round1 (cs1, B) // B B B B // B nil B nil

		e.g. didn't see the 2/3 prevotes
	*/

	// start round and wait for propose and prevote
	startTestRound(cs1, cs1.Height, 0)
	<-newRoundCh
	re := <-proposalCh
	rs := re.(types.EventDataRoundState).RoundState.(*cstypes.RoundState)
	theBlockHash := rs.ProposalBlock.Hash()

	<-voteCh // prevote

	signAddVotes(cs1, types.VoteTypePrevote, cs1.ProposalBlock.Hash(), cs1.ProposalBlockParts.Header(), vs2, vs3, vs4)

	<-voteCh // precommit

	// the proposed block should now be locked and our precommit added
	validatePrecommit(t, cs1, 0, 0, vss[0], theBlockHash, theBlockHash)

	rs = cs1.GetRoundState()

	// add precommits from the rest
	signAddVotes(cs1, types.VoteTypePrecommit, nil, types.PartSetHeader{}, vs2, vs4)
	signAddVotes(cs1, types.VoteTypePrecommit, cs1.ProposalBlock.Hash(), cs1.ProposalBlockParts.Header(), vs3)

	// before we time out into new round, set next proposal block
	prop, propBlock := decideProposal(cs1, vs2, vs2.Height, vs2.Round+1)
	propBlockParts := propBlock.MakePartSet(partSize)

	incrementRound(vs2, vs3, vs4)

	// timeout to new round
	re = <-timeoutWaitCh
	rs = re.(types.EventDataRoundState).RoundState.(*cstypes.RoundState)
	lockedBlockHash := rs.LockedBlock.Hash()

	// XXX: this isn't guaranteed to get there before the timeoutPropose ...
	if err := cs1.SetProposalAndBlock(prop, propBlock, propBlockParts, "some peer"); err != nil {
		t.Fatal(err)
	}

	<-newRoundCh
	t.Log("#### ONTO ROUND 1")
	/*
		Round2 (vs2, C) // B nil nil nil // nil nil nil _

		cs1 unlocks!
	*/

	// now we're on a new round and not the proposer,
	// but we should receive the proposal
	select {
	case <-proposalCh:
	case <-timeoutProposeCh:
		<-proposalCh
	}

	// go to prevote, prevote for locked block (not proposal)
	<-voteCh
	validatePrevote(t, cs1, 0, vss[0], lockedBlockHash)
	// now let's add prevotes from everyone else for nil (a polka!)
	signAddVotes(cs1, types.VoteTypePrevote, nil, types.PartSetHeader{}, vs2, vs3, vs4)

	// the polka makes us unlock and precommit nil
	<-unlockCh
	<-voteCh // precommit

	// we should have unlocked and precommitted nil
	// NOTE: since we don't relock on nil, the lock round is 0
	validatePrecommit(t, cs1, 1, 0, vss[0], nil, nil)

	signAddVotes(cs1, types.VoteTypePrecommit, nil, types.PartSetHeader{}, vs2, vs3)
	<-newRoundCh
}

// 4 vals
// a polka at round 1 but we miss it
// then a polka at round 2 that we lock on
// then we see the polka from round 1 but shouldn't unlock
func TestStateLockPOLSafety1(t *testing.T) {
	cs1, vss := randConsensusState(4)
	vs2, vs3, vs4 := vss[1], vss[2], vss[3]

	partSize := types.BlockPartSizeBytes

	proposalCh := subscribe(cs1.eventBus, types.EventQueryCompleteProposal)
	timeoutProposeCh := subscribe(cs1.eventBus, types.EventQueryTimeoutPropose)
	timeoutWaitCh := subscribe(cs1.eventBus, types.EventQueryTimeoutWait)
	newRoundCh := subscribe(cs1.eventBus, types.EventQueryNewRound)
	voteCh := subscribeToVoter(cs1, cs1.privValidator.GetAddress())

	// start round and wait for propose and prevote
	startTestRound(cs1, cs1.Height, 0)
	<-newRoundCh
	re := <-proposalCh
	rs := re.(types.EventDataRoundState).RoundState.(*cstypes.RoundState)
	propBlock := rs.ProposalBlock

	<-voteCh // prevote

	validatePrevote(t, cs1, 0, vss[0], propBlock.Hash())

	// the others sign a polka but we don't see it
	prevotes := signVotes(types.VoteTypePrevote, propBlock.Hash(), propBlock.MakePartSet(partSize).Header(), vs2, vs3, vs4)

	// before we time out into new round, set next proposer
	// and next proposal block
	/*
		_, v1 := cs1.Validators.GetByAddress(vss[0].Address)
		v1.VotingPower = 1
		if updated := cs1.Validators.Update(v1); !updated {
			panic("failed to update validator")
		}*/

	t.Logf("old prop hash %v", fmt.Sprintf("%X", propBlock.Hash()))

	// we do see them precommit nil
	signAddVotes(cs1, types.VoteTypePrecommit, nil, types.PartSetHeader{}, vs2, vs3, vs4)

	prop, propBlock := decideProposal(cs1, vs2, vs2.Height, vs2.Round+1)
	propBlockHash := propBlock.Hash()
	propBlockParts := propBlock.MakePartSet(partSize)

	incrementRound(vs2, vs3, vs4)

	// XXX: this isn't guaranteed to get there before the timeoutPropose ...
	if err := cs1.SetProposalAndBlock(prop, propBlock, propBlockParts, "some peer"); err != nil {
		t.Fatal(err)
	}

	<-newRoundCh
	t.Log("### ONTO ROUND 1")
	/*Round2
	// we timeout and prevote our lock
	// a polka happened but we didn't see it!
	*/

	// now we're on a new round and not the proposer,
	// but we should receive the proposal
	select {
	case re = <-proposalCh:
	case <-timeoutProposeCh:
		re = <-proposalCh
	}
	rs = re.(types.EventDataRoundState).RoundState.(*cstypes.RoundState)

	if rs.LockedBlock != nil {
		panic("we should not be locked!")
	}
	t.Logf("new prop hash %v", fmt.Sprintf("%X", propBlockHash))
	// go to prevote, prevote for proposal block
	<-voteCh
	validatePrevote(t, cs1, 1, vss[0], propBlockHash)

	// now we see the others prevote for it, so we should lock on it
	signAddVotes(cs1, types.VoteTypePrevote, propBlockHash, propBlockParts.Header(), vs2, vs3, vs4)

	<-voteCh // precommit

	// we should have precommitted
	validatePrecommit(t, cs1, 1, 1, vss[0], propBlockHash, propBlockHash)

	signAddVotes(cs1, types.VoteTypePrecommit, nil, types.PartSetHeader{}, vs2, vs3)

	<-timeoutWaitCh

	incrementRound(vs2, vs3, vs4)

	<-newRoundCh

	t.Log("### ONTO ROUND 2")
	/*Round3
	we see the polka from round 1 but we shouldn't unlock!
	*/

	// timeout of propose
	<-timeoutProposeCh

	// finish prevote
	<-voteCh

	// we should prevote what we're locked on
	validatePrevote(t, cs1, 2, vss[0], propBlockHash)

	newStepCh := subscribe(cs1.eventBus, types.EventQueryNewRoundStep)

	// add prevotes from the earlier round
	addVotes(cs1, prevotes...)

	t.Log("Done adding prevotes!")

	ensureNoNewStep(newStepCh)
}

// 4 vals.
// polka P0 at R0, P1 at R1, and P2 at R2,
// we lock on P0 at R0, don't see P1, and unlock using P2 at R2
// then we should make sure we don't lock using P1

// What we want:
// don't see P0, lock on P1 at R1, don't unlock using P0 at R2
func TestStateLockPOLSafety2(t *testing.T) {
	cs1, vss := randConsensusState(4)
	vs2, vs3, vs4 := vss[1], vss[2], vss[3]

	partSize := types.BlockPartSizeBytes

	proposalCh := subscribe(cs1.eventBus, types.EventQueryCompleteProposal)
	timeoutProposeCh := subscribe(cs1.eventBus, types.EventQueryTimeoutPropose)
	timeoutWaitCh := subscribe(cs1.eventBus, types.EventQueryTimeoutWait)
	newRoundCh := subscribe(cs1.eventBus, types.EventQueryNewRound)
	unlockCh := subscribe(cs1.eventBus, types.EventQueryUnlock)
	voteCh := subscribeToVoter(cs1, cs1.privValidator.GetAddress())

	// the block for R0: gets polkad but we miss it
	// (even though we signed it, shhh)
	_, propBlock0 := decideProposal(cs1, vss[0], cs1.Height, cs1.Round)
	propBlockHash0 := propBlock0.Hash()
	propBlockParts0 := propBlock0.MakePartSet(partSize)

	// the others sign a polka but we don't see it
	prevotes := signVotes(types.VoteTypePrevote, propBlockHash0, propBlockParts0.Header(), vs2, vs3, vs4)

	// the block for round 1
	prop1, propBlock1 := decideProposal(cs1, vs2, vs2.Height, vs2.Round+1)
	propBlockHash1 := propBlock1.Hash()
	propBlockParts1 := propBlock1.MakePartSet(partSize)
	propBlockID1 := types.BlockID{propBlockHash1, propBlockParts1.Header()}

	incrementRound(vs2, vs3, vs4)

	cs1.updateRoundStep(0, cstypes.RoundStepPrecommitWait)

	t.Log("### ONTO Round 1")
	// jump in at round 1
	height := cs1.Height
	startTestRound(cs1, height, 1)
	<-newRoundCh

	if err := cs1.SetProposalAndBlock(prop1, propBlock1, propBlockParts1, "some peer"); err != nil {
		t.Fatal(err)
	}
	<-proposalCh

	<-voteCh // prevote

	signAddVotes(cs1, types.VoteTypePrevote, propBlockHash1, propBlockParts1.Header(), vs2, vs3, vs4)

	<-voteCh // precommit
	// the proposed block should now be locked and our precommit added
	validatePrecommit(t, cs1, 1, 1, vss[0], propBlockHash1, propBlockHash1)

	// add precommits from the rest
	signAddVotes(cs1, types.VoteTypePrecommit, nil, types.PartSetHeader{}, vs2, vs4)
	signAddVotes(cs1, types.VoteTypePrecommit, propBlockHash1, propBlockParts1.Header(), vs3)

	incrementRound(vs2, vs3, vs4)

	// timeout of precommit wait to new round
	<-timeoutWaitCh

	// in round 2 we see the polkad block from round 0
	newProp := types.NewProposal(height, 2, propBlockParts0.Header(), 0, propBlockID1)
	if err := vs3.SignProposal(config.ChainID(), newProp); err != nil {
		t.Fatal(err)
	}
	if err := cs1.SetProposalAndBlock(newProp, propBlock0, propBlockParts0, "some peer"); err != nil {
		t.Fatal(err)
	}

	// Add the pol votes
	addVotes(cs1, prevotes...)

	<-newRoundCh
	t.Log("### ONTO Round 2")
	/*Round2
	// now we see the polka from round 0, but we shouldn't unlock
	*/

	select {
	case <-timeoutProposeCh:
		<-proposalCh
	case <-proposalCh:
	}

	select {
	case <-unlockCh:
		panic("validator unlocked using an old polka")
	case <-voteCh:
		// prevote our locked block
	}
	validatePrevote(t, cs1, 2, vss[0], propBlockHash1)

}

//------------------------------------------------------------------------------------------
// SlashingSuite
// TODO: Slashing

/*
func TestStateSlashingPrevotes(t *testing.T) {
	cs1, vss := randConsensusState(2)
	vs2 := vss[1]

	proposalCh := subscribe(cs1.eventBus, types.EventQueryCompleteProposal)
	timeoutWaitCh := subscribe(cs1.eventBus, types.EventQueryTimeoutWait)
	newRoundCh := subscribe(cs1.eventBus, types.EventQueryNewRound)
	voteCh := subscribeToVoter(cs1, cs1.privValidator.GetAddress())

	// start round and wait for propose and prevote
	startTestRound(cs1, cs1.Height, 0)
	<-newRoundCh
	re := <-proposalCh
	<-voteCh // prevote

	rs := re.(types.EventDataRoundState).RoundState.(*cstypes.RoundState)

	// we should now be stuck in limbo forever, waiting for more prevotes
	// adding one for a different block should cause us to go into prevote wait
	hash := rs.ProposalBlock.Hash()
	hash[0] = byte(hash[0]+1) % 255
	signAddVotes(cs1, types.VoteTypePrevote, hash, rs.ProposalBlockParts.Header(), vs2)

	<-timeoutWaitCh

	// NOTE: we have to send the vote for a different block first so we don't just go into precommit round right
	// away and ignore more prevotes (and thus fail to slash!)

	// add the conflicting vote
	signAddVotes(cs1, types.VoteTypePrevote, rs.ProposalBlock.Hash(), rs.ProposalBlockParts.Header(), vs2)

	// XXX: Check for existence of Dupeout info
}

func TestStateSlashingPrecommits(t *testing.T) {
	cs1, vss := randConsensusState(2)
	vs2 := vss[1]

	proposalCh := subscribe(cs1.eventBus, types.EventQueryCompleteProposal)
	timeoutWaitCh := subscribe(cs1.eventBus, types.EventQueryTimeoutWait)
	newRoundCh := subscribe(cs1.eventBus, types.EventQueryNewRound)
	voteCh := subscribeToVoter(cs1, cs1.privValidator.GetAddress())

	// start round and wait for propose and prevote
	startTestRound(cs1, cs1.Height, 0)
	<-newRoundCh
	re := <-proposalCh
	<-voteCh // prevote

	// add prevote from vs2
	signAddVotes(cs1, types.VoteTypePrevote, rs.ProposalBlock.Hash(), rs.ProposalBlockParts.Header(), vs2)
2015-09-22 18:12:34 -07:00
2015-12-13 11:56:05 -08:00
<-voteCh // precommit
2015-09-22 18:12:34 -07:00
// we should now be stuck in limbo forever, waiting for more prevotes
// add one for a different block should cause us to go into prevote wait
2015-12-13 11:56:05 -08:00
hash := rs.ProposalBlock.Hash()
2015-09-22 18:12:34 -07:00
hash[0] = byte(hash[0]+1) % 255
signAddVotes(cs1, types.VoteTypePrecommit, hash, rs.ProposalBlockParts.Header(), vs2)
2015-09-22 18:12:34 -07:00
// NOTE: we have to send the vote for different block first so we don't just go into precommit round right
// away and ignore more prevotes (and thus fail to slash!)
// add precommit from vs2
signAddVotes(cs1, types.VoteTypePrecommit, rs.ProposalBlock.Hash(), rs.ProposalBlockParts.Header(), vs2)
2015-09-22 18:12:34 -07:00
2015-11-01 11:34:08 -08:00
// XXX: Check for existence of Dupeout info
2015-09-22 18:12:34 -07:00
}
2015-12-13 11:56:05 -08:00
*/
//------------------------------------------------------------------------------------------
// CatchupSuite
//------------------------------------------------------------------------------------------
// HaltSuite
// 4 vals.
// we receive a final precommit after going into next round, but others might have gone to commit already!
func TestStateHalt1(t *testing.T) {
	cs1, vss := randConsensusState(4)
	vs2, vs3, vs4 := vss[1], vss[2], vss[3]
	partSize := types.BlockPartSizeBytes

	proposalCh := subscribe(cs1.eventBus, types.EventQueryCompleteProposal)
	timeoutWaitCh := subscribe(cs1.eventBus, types.EventQueryTimeoutWait)
	newRoundCh := subscribe(cs1.eventBus, types.EventQueryNewRound)
	newBlockCh := subscribe(cs1.eventBus, types.EventQueryNewBlock)
	voteCh := subscribeToVoter(cs1, cs1.privValidator.GetAddress())

	// start round and wait for propose and prevote
	startTestRound(cs1, cs1.Height, 0)
	<-newRoundCh
	re := <-proposalCh
	rs := re.(types.EventDataRoundState).RoundState.(*cstypes.RoundState)
	propBlock := rs.ProposalBlock
	propBlockParts := propBlock.MakePartSet(partSize)

	<-voteCh // prevote

	signAddVotes(cs1, types.VoteTypePrevote, propBlock.Hash(), propBlockParts.Header(), vs3, vs4)
	<-voteCh // precommit

	// the proposed block should now be locked and our precommit added
	validatePrecommit(t, cs1, 0, 0, vss[0], propBlock.Hash(), propBlock.Hash())

	// add precommits from the rest
	signAddVotes(cs1, types.VoteTypePrecommit, nil, types.PartSetHeader{}, vs2) // didn't receive proposal
	signAddVotes(cs1, types.VoteTypePrecommit, propBlock.Hash(), propBlockParts.Header(), vs3)
	// we receive this later, but vs3 might receive it earlier and with ours will go to commit!
	precommit4 := signVote(vs4, types.VoteTypePrecommit, propBlock.Hash(), propBlockParts.Header())

	incrementRound(vs2, vs3, vs4)

	// timeout to new round
	<-timeoutWaitCh
	re = <-newRoundCh
	rs = re.(types.EventDataRoundState).RoundState.(*cstypes.RoundState)

	t.Log("### ONTO ROUND 1")

	/*Round2
	// we timeout and prevote our lock
	// a polka happened but we didn't see it!
	*/

	// go to prevote, prevote for locked block
	<-voteCh // prevote
	validatePrevote(t, cs1, 0, vss[0], rs.LockedBlock.Hash())

	// now we receive the precommit from the previous round
	addVotes(cs1, precommit4)

	// receiving that precommit should take us straight to commit
	<-newBlockCh
	re = <-newRoundCh
	rs = re.(types.EventDataRoundState).RoundState.(*cstypes.RoundState)

	if rs.Height != 2 {
		panic("expected height to increment")
	}
}
func TestStateOutputsBlockPartsStats(t *testing.T) {
	// create dummy peer
	cs, _ := randConsensusState(1)
	peer := p2pdummy.NewPeer()

	// 1) new block part
	parts := types.NewPartSetFromData(cmn.RandBytes(100), 10)
	msg := &BlockPartMessage{
		Height: 1,
		Round:  0,
		Part:   parts.GetPart(0),
	}

	cs.ProposalBlockParts = types.NewPartSetFromHeader(parts.Header())
	cs.handleMsg(msgInfo{msg, peer.ID()})

	statsMessage := <-cs.statsMsgQueue
	require.Equal(t, msg, statsMessage.Msg, "")
	require.Equal(t, peer.ID(), statsMessage.PeerID, "")

	// sending the same part from different peer
	cs.handleMsg(msgInfo{msg, "peer2"})

	// sending the part with the same height, but different round
	msg.Round = 1
	cs.handleMsg(msgInfo{msg, peer.ID()})

	// sending the part from the smaller height
	msg.Height = 0
	cs.handleMsg(msgInfo{msg, peer.ID()})

	// sending the part from the bigger height
	msg.Height = 3
	cs.handleMsg(msgInfo{msg, peer.ID()})

	select {
	case <-cs.statsMsgQueue:
		t.Errorf("Should not output stats message after receiving the known block part!")
	case <-time.After(50 * time.Millisecond):
	}
}

func TestStateOutputVoteStats(t *testing.T) {
	cs, vss := randConsensusState(2)
	// create dummy peer
	peer := p2pdummy.NewPeer()

	vote := signVote(vss[1], types.VoteTypePrecommit, []byte("test"), types.PartSetHeader{})

	voteMessage := &VoteMessage{vote}
	cs.handleMsg(msgInfo{voteMessage, peer.ID()})

	statsMessage := <-cs.statsMsgQueue
	require.Equal(t, voteMessage, statsMessage.Msg, "")
	require.Equal(t, peer.ID(), statsMessage.PeerID, "")

	// sending the same vote from a different peer
	cs.handleMsg(msgInfo{&VoteMessage{vote}, "peer2"})

	// sending the vote for the bigger height
	incrementHeight(vss[1])
	vote = signVote(vss[1], types.VoteTypePrecommit, []byte("test"), types.PartSetHeader{})

	cs.handleMsg(msgInfo{&VoteMessage{vote}, peer.ID()})

	select {
	case <-cs.statsMsgQueue:
		t.Errorf("Should not output stats message after receiving the known vote or vote from bigger height")
	case <-time.After(50 * time.Millisecond):
	}
}
// subscribe subscribes the test client to the given query and returns a channel with cap = 1.
func subscribe(eventBus *types.EventBus, q tmpubsub.Query) <-chan interface{} {
	out := make(chan interface{}, 1)
	err := eventBus.Subscribe(context.Background(), testSubscriber, q, out)
	if err != nil {
		// include the underlying error so a failed subscription is easier to diagnose
		panic(fmt.Sprintf("failed to subscribe %s to %v: %v", testSubscriber, q, err))
	}
	return out
}

// discardFromChan reads and discards n values from the channel.
func discardFromChan(ch <-chan interface{}, n int) {
	for i := 0; i < n; i++ {
		<-ch
	}
}