[libs/pubsub] fix memory leak

Refs #1755

I started by writing a test for wsConnection (WebsocketManager) in
which I:

- create a WS connection
- do a simple echo call
- close it

Neither leaking goroutines nor leaking memory were detected.
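
One way to check for leftover goroutines is to compare
runtime.NumGoroutine before and after the workload. The following is
only a minimal sketch, not the actual wsConnection test: echoOnce,
leakOnce, leakedGoroutines and settleTime are hypothetical names, and
echoOnce merely stands in for "create a connection, echo, close".

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// settleTime is an arbitrary grace period for goroutines that are still exiting.
const settleTime = 500 * time.Millisecond

// leakedGoroutines runs fn and reports how many more goroutines exist
// afterwards than before. It polls until the count drops back or wait elapses.
func leakedGoroutines(fn func(), wait time.Duration) int {
	before := runtime.NumGoroutine()
	fn()
	deadline := time.Now().Add(wait)
	for {
		leaked := runtime.NumGoroutine() - before
		if leaked <= 0 || time.Now().After(deadline) {
			return leaked
		}
		time.Sleep(10 * time.Millisecond)
	}
}

// echoOnce is a hypothetical stand-in for the connection test: it starts a
// worker goroutine, receives one reply, and lets the worker exit.
func echoOnce() {
	reply := make(chan string)
	go func() { reply <- "echo" }()
	<-reply
}

// leakOnce starts a goroutine that blocks forever, for contrast.
func leakOnce() {
	go func() { select {} }()
}

func main() {
	fmt.Println("clean workload leaked:", leakedGoroutines(echoOnce, settleTime))
	fmt.Println("leaky workload leaked:", leakedGoroutines(leakOnce, settleTime))
}
```

A count-based check like this is coarse: it cannot tell which goroutine
leaked, so the goroutine profile dump is still needed to find the
culprit.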

For useful shortcuts see my blog post
https://blog.cosmos.network/debugging-the-memory-leak-in-tendermint-210186711420

Then I went to the rpc tests to see if calling Subscribe results in
memory growth. It did.

I used a slightly modified version of the TestHeaderEvents function:

```
func TestHeaderEvents(t *testing.T) {
	// memory heap before
	f, err := os.Create("/tmp/mem1.mprof")
	if err != nil {
		t.Fatal(err)
	}
	pprof.WriteHeapProfile(f)
	f.Close()

	for i := 0; i < 100; i++ {
		c := getHTTPClient()
		err = c.Start()
		require.Nil(t, err)

		evtTyp := types.EventNewBlockHeader
		evt, err := client.WaitForOneEvent(c, evtTyp, waitForEventTimeout)
		require.Nil(t, err)
		_, ok := evt.(types.EventDataNewBlockHeader)
		require.True(t, ok)

		c.Stop()
		c = nil
	}

	runtime.GC()

	// memory heap after
	f, err = os.Create("/tmp/mem2.mprof")
	if err != nil {
		t.Fatal(err)
	}
	pprof.WriteHeapProfile(f)
	f.Close()

	// dump all running goroutines
	time.Sleep(10 * time.Second)
	pprof.Lookup("goroutine").WriteTo(os.Stdout, 1)
}
```

```
Showing nodes accounting for 35159.16kB, 100% of 35159.16kB total
Showing top 10 nodes out of 48
      flat  flat%   sum%        cum   cum%
32022.23kB 91.08% 91.08% 32022.23kB 91.08%  github.com/tendermint/tendermint/libs/pubsub/query.(*QueryParser).Init
 1056.33kB  3.00% 94.08%  1056.33kB  3.00%  bufio.NewReaderSize
  528.17kB  1.50% 95.58%   528.17kB  1.50%  bufio.NewWriterSize
  528.17kB  1.50% 97.09%   528.17kB  1.50%  github.com/tendermint/tendermint/consensus.NewConsensusState
  512.19kB  1.46% 98.54%   512.19kB  1.46%  runtime.malg
  512.08kB  1.46%   100%   512.08kB  1.46%  syscall.ByteSliceFromString
         0     0%   100%   512.08kB  1.46%  github.com/tendermint/tendermint/consensus.(*ConsensusState).(github.com/tendermint/tendermint/consensus.defaultDecideProposal)-fm
         0     0%   100%   512.08kB  1.46%  github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote
         0     0%   100%   512.08kB  1.46%  github.com/tendermint/tendermint/consensus.(*ConsensusState).defaultDecideProposal
         0     0%   100%   512.08kB  1.46%  github.com/tendermint/tendermint/consensus.(*ConsensusState).enterNewRound
```

100 subscriptions account for about 32MB in (*QueryParser).Init, i.e.
roughly 320kB per subscription.

Again, no additional goroutines are running after the end of the test
(wsConnection readRoutine and writeRoutine both finish). **It means
that some existing goroutine or object is holding a reference to the
*Query objects, which are leaking.**

One of them is pubsub#loop. It's using state.queries to map queries to
clients and state.clients to map clients to queries.

Before this commit, we were not thoroughly cleaning state.queries,
which was the cause of the memory leak.
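
The effect can be reproduced with a stripped-down model of the two
maps. This is a sketch under assumptions, not the pubsub code itself:
the query and state types below are hypothetical, and the pruneEmpty
flag exists only to contrast the behaviour with and without deleting
empty entries from the query index.

```go
package main

import "fmt"

// query is a hypothetical stand-in for a parsed query. Every subscription
// allocates its own instance, so every pointer is a distinct map key.
type query struct{ expr string }

// state models the two indexes described above.
type state struct {
	queries map[*query]map[string]chan struct{} // query -> clientID -> channel
	clients map[string]map[*query]struct{}      // clientID -> set of queries
}

func newState() *state {
	return &state{
		queries: make(map[*query]map[string]chan struct{}),
		clients: make(map[string]map[*query]struct{}),
	}
}

func (s *state) add(clientID string, q *query) {
	if s.queries[q] == nil {
		s.queries[q] = make(map[string]chan struct{})
	}
	s.queries[q][clientID] = make(chan struct{})
	if s.clients[clientID] == nil {
		s.clients[clientID] = make(map[*query]struct{})
	}
	s.clients[clientID][q] = struct{}{}
}

// removeAll drops every subscription of clientID. Without pruneEmpty the
// outer queries map keeps an empty inner map per query, which pins the key.
func (s *state) removeAll(clientID string, pruneEmpty bool) {
	for q := range s.clients[clientID] {
		close(s.queries[q][clientID])
		delete(s.queries[q], clientID)
		if pruneEmpty && len(s.queries[q]) == 0 {
			delete(s.queries, q)
		}
	}
	delete(s.clients, clientID)
}

// retainedQueries subscribes and unsubscribes n clients, each with a fresh
// query, and returns how many query keys the state still references.
func retainedQueries(n int, pruneEmpty bool) int {
	s := newState()
	for i := 0; i < n; i++ {
		id := fmt.Sprintf("client-%d", i)
		s.add(id, &query{expr: "tm.event = 'NewBlockHeader'"}) // illustrative query string
		s.removeAll(id, pruneEmpty)
	}
	return len(s.queries)
}

func main() {
	fmt.Println("without pruning:", retainedQueries(100, false)) // prints 100
	fmt.Println("with pruning:   ", retainedQueries(100, true))  // prints 0
}
```

Because the map key itself references the query object, an empty inner
map is enough to keep it reachable, which matches the absence of any
leaked goroutine in the dump.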
Anton Kaliaev 2018-06-19 19:59:21 +04:00
parent aaddf5d32f
commit 4fc06e9d2a
GPG Key ID: 7B6881D965918214
1 changed file with 8 additions and 1 deletion

@@ -156,6 +156,8 @@ func (s *Server) Subscribe(ctx context.Context, clientID string, query Query, out
		if _, ok = s.subscriptions[clientID]; !ok {
			s.subscriptions[clientID] = make(map[string]Query)
		}
		// preserve original query
		// see Unsubscribe
		s.subscriptions[clientID][query.String()] = query
		s.mtx.Unlock()
		return nil
@@ -314,6 +316,9 @@ func (state *state) remove(clientID string, q Query) {
		}
		delete(state.queries[q], clientID)
		if len(state.queries[q]) == 0 {
			delete(state.queries, q)
		}
	}
}
@@ -328,8 +333,10 @@ func (state *state) removeAll(clientID string) {
		close(ch)
		delete(state.queries[q], clientID)
		if len(state.queries[q]) == 0 {
			delete(state.queries, q)
		}
	}
	delete(state.clients, clientID)
}