Skip to content

mempool: Data race when rechecking with socket connection #1827

@hvanz

Description

@hvanz

Bug Report

What happened?

There is a data race to access the variable recheckCursor in CListMempool on a particular recheck scenario when connected to the app via sockets.

This is the scenario:

  • At the end of a mempool Update, all txs in the mempool are sent to the app for rechecking their validity. So, for each tx one CheckTx request is sent to the app. Each request is sent in the order in which the txs appear in the mempool list.
  • Because we are using a socket connection, CheckTx responses may arrive in a different order in which the requests were sent.
  • If a tx that is being rechecked is received again for validation, it may occur a data race when accessing recheckCursor, which is used to traverse clist in order.

Related to tendermint/tendermint#5519.

How to reproduce it

Running the following test should produce a data race. This test was originally written for #1116, which includes a potential fix.

func TestMempoolRecheckRace(t *testing.T) {
	sockPath := fmt.Sprintf("unix:///tmp/echo_%v.sock", cmtrand.Str(6))
	app := kvstore.NewInMemoryApplication()
	_, server := newRemoteApp(t, sockPath, app)
	t.Cleanup(func() {
		if err := server.Stop(); err != nil {
			t.Error(err)
		}
	})
	cfg := test.ResetTestRoot("mempool_test")
	cfg.Mempool.Size = 1000
	cfg.Mempool.CacheSize = 1000
	mp, cleanup := newMempoolWithAppAndConfig(proxy.NewRemoteClientCreator(sockPath, "socket", true), cfg)
	defer cleanup()

	// Add a bunch of transactions to the mempool.
	var err error
	txs := newUniqueTxs(100)
	for _, tx := range txs {
		err = mp.CheckTx(tx, nil, TxInfo{})
		require.NoError(t, err)
	}
	err = mp.FlushAppConn()
	require.NoError(t, err)

	// Update one transaction to force rechecking the rest.
	mp.Lock()
	err = mp.Update(1, txs[:1], abciResponses(1, abci.CodeTypeOK), nil, nil)
	mp.Unlock()
	require.NoError(t, err)

	// Add again the same transaction that was updated. Checking this
	// transaction and rechecking the others simultaneously should not result
	// in a data race for the variable recheckCursor.
	err = mp.CheckTx(txs[:1][0], nil, TxInfo{})
	require.Equal(t, err, ErrTxInCache)
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingmempool

    Type

    No type

    Projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions