raft crashes with error="key too large" (DoS) #13281

@slm0n87

Describe the bug
Internal raft storage crashes when a user writes secrets to deeply nested paths.

To Reproduce
0. Have a running cluster (or single node) using the raft storage backend, with a KV secrets engine mounted at "secret"

  1. Run vault login ...
  2. Run for count in {550..600}; do echo COUNT:$count; vault kv put secret/$(for i in {1..$count}; do echo -n "1/"; done) foo=bar ; done
  3. In my case, the raft cluster crashed at a count of 564 nested directories.
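
Note that the loop in step 2 puts a variable inside a brace range (`{1..$count}`). zsh expands that, but bash performs brace expansion before variable expansion, so in bash the inner loop runs only once and the path never gets deep. A minimal portable sketch of the same steps follows; the function names `nested_path` and `repro` are hypothetical, and it assumes the `vault` CLI is on PATH and already logged in:

```shell
# Build a path of N nested "1/" segments; `nested_path 3` prints 1/1/1/
nested_path() {
  local depth="$1" path="" i=0
  while [ "$i" -lt "$depth" ]; do
    path="${path}1/"
    i=$((i + 1))
  done
  printf '%s' "$path"
}

# Write one secret per depth in [first, last], stopping at the first failure.
# Defaults mirror the 550..600 range from step 2.
repro() {
  local first="${1:-550}" last="${2:-600}" count
  count="$first"
  while [ "$count" -le "$last" ]; do
    echo "COUNT:$count"
    vault kv put "secret/$(nested_path "$count")" foo=bar || return 1
    count=$((count + 1))
  done
}
```

Calling `repro` with no arguments repeats step 2. Only run it against a disposable test cluster, since the crash described here was not recoverable in place.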

Expected behavior
Vault should prevent the raft cluster from crashing.

Environment:

  • Vault Server Version (retrieve with vault status): 1.9.0
  • Vault CLI Version (retrieve with vault version): Vault v1.8.4 ('925bc650ad1d997e84fbb832f302a6bfe0105bbb+CHANGES')
  • Server Operating System/Architecture: Ubuntu 20.04 LTS

Vault server configuration file(s):

# Ansible managed

storage "raft" {
  path    = "/space/raft-storage/xxxxxxxx-cnxyaqpi"
  node_id = "xxxxxxxx"
}

listener "tcp" {
  address = "0.0.0.0:8200"
  cluster_address = "x.x.x.x:8201"
  tls_disable = "true"
  # trust x-forwarded-for header of HA-Proxies
  x_forwarded_for_authorized_addrs = "x.x.x.x/32,x.x.x.x/32"
  x_forwarded_for_reject_not_present = "false"
}

seal "transit" {
  address = "http://x.x.x.x:8200"
  # token is read from VAULT_TOKEN env
  disable_renewal = "true"
  // Key configuration
  key_name           = "unseal_key"
  mount_path         = "transit/"
}

cluster_addr = "http://x.x.x.x:8201"
api_addr = "http://vault-test.app.xxxxxxxx:8200"
cluster_name = "vault"
raw_storage_endpoint = "true"
ui = "true"

Vault systemd logs:

Nov 25 17:42:13 xxxxxxxx vault[24393]: 2021-11-25T17:42:13.678Z [ERROR] storage.raft.fsm: failed to store data: error="key too large"
Nov 25 17:42:13 xxxxxxxx vault[24393]: panic: failed to store data
Nov 25 17:42:13 xxxxxxxx vault[24393]: goroutine 422 [running]:
Nov 25 17:42:13 xxxxxxxx vault[24393]: github.com/hashicorp/vault/physical/raft.(*FSM).ApplyBatch(0xc000207f40, {0xc000970838, 0x1, 0x44864f})
Nov 25 17:42:13 xxxxxxxx vault[24393]:         /home/runner/work/vault/vault/physical/raft/fsm.go:678 +0x7bf
Nov 25 17:42:13 xxxxxxxx vault[24393]: github.com/hashicorp/go-raftchunking.(*ChunkingBatchingFSM).ApplyBatch(0xc0004da150, {0xc000970830, 0x1, 0x2})
Nov 25 17:42:13 xxxxxxxx vault[24393]:         /home/runner/go/pkg/mod/github.com/hashicorp/go-raftchunking@v0.6.3-0.20191002164813-7e9e8525653a/fsm.go:234 +0x36e
Nov 25 17:42:13 xxxxxxxx vault[24393]: github.com/hashicorp/raft.(*Raft).runFSM.func2({0xc000f1c200, 0x1, 0xc0004da150})
Nov 25 17:42:13 xxxxxxxx vault[24393]:         /home/runner/go/pkg/mod/github.com/hashicorp/raft@v1.3.1/fsm.go:141 +0x1e9
Nov 25 17:42:13 xxxxxxxx vault[24393]: github.com/hashicorp/raft.(*Raft).runFSM(0xc000563b80)
Nov 25 17:42:13 xxxxxxxx vault[24393]:         /home/runner/go/pkg/mod/github.com/hashicorp/raft@v1.3.1/fsm.go:216 +0x35a
Nov 25 17:42:13 xxxxxxxx vault[24393]: github.com/hashicorp/raft.(*raftState).goFunc.func1()
Nov 25 17:42:13 xxxxxxxx vault[24393]:         /home/runner/go/pkg/mod/github.com/hashicorp/raft@v1.3.1/state.go:146 +0x62
Nov 25 17:42:13 xxxxxxxx vault[24393]: created by github.com/hashicorp/raft.(*raftState).goFunc
Nov 25 17:42:13 xxxxxxxx vault[24393]:         /home/runner/go/pkg/mod/github.com/hashicorp/raft@v1.3.1/state.go:144 +0x92
Nov 25 17:42:13 xxxxxxxx systemd[1]: vault.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Nov 25 17:42:13 xxxxxxxx systemd[1]: vault.service: Failed with result 'exit-code'.
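
To check whether a node hit this failure, one option is to filter the service journal for the FSM error line shown in the log above. This is a minimal sketch: the function name `find_key_too_large` is hypothetical, and the match string is copied verbatim from this report's log.

```shell
# Read log text on stdin and print the lines containing the FSM error
# from this report. Exit status is 0 if at least one line matched.
find_key_too_large() {
  grep -F 'storage.raft.fsm: failed to store data: error="key too large"'
}
```

A possible invocation, assuming the systemd unit is named `vault.service` as in the log above: `journalctl -u vault.service | find_key_too_large`.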

Additional context
I discovered this issue during performance and security testing.
Hopefully no real user will ever create a path with 564 nested directories, but an attacker could use this method to crash the whole raft cluster.
I would classify this issue as a denial-of-service bug.
I was not able to recover from that state; I had to build a new, empty raft cluster and restore the last raft snapshot.
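
Since recovery here depended on having a recent raft snapshot, the following is a rough sketch of wrapping the `vault operator raft snapshot save` and `restore` subcommands. The function names and the file naming scheme are hypothetical, and it assumes the `vault` CLI is on PATH with `VAULT_ADDR` and `VAULT_TOKEN` set. It is not the exact procedure I followed.

```shell
# Save a timestamped raft snapshot into a directory (default: current directory)
# and print the file name on success.
snapshot_save() {
  local dir="${1:-.}" file
  file="$dir/raft-$(date -u +%Y%m%dT%H%M%SZ).snap"
  vault operator raft snapshot save "$file" && printf '%s\n' "$file"
}

# Restore a snapshot file into the cluster the CLI currently points at.
snapshot_restore() {
  vault operator raft snapshot restore "$1"
}
```

Depending on how the replacement cluster is sealed and initialized, the restore may need additional options; check `vault operator raft snapshot restore -h` for the installed version.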
