Hi,
When starting TSDB f4dd456 (initial version of mmap chunks), as used by Thanos Receive, we got the following error:
level=error ts=2020-06-15T13:34:38.5403894Z caller=multitsdb.go:271 component=receive tenant=FB870BF3-9F3A-44FF-9BF7-D7A047A52F43 msg="failed to open tsdb" err="invalid magic number 0"
level=warn ts=2020-06-15T13:34:38.540465482Z caller=intrumentation.go:54 component=receive msg="changing probe status" status=not-ready reason="opening storage: invalid magic number 0"
level=info ts=2020-06-15T13:34:38.540508553Z caller=http.go:81 component=receive service=http/server component=receive msg="internal server shutdown" err="opening storage: invalid magic number 0"
level=info ts=2020-06-15T13:34:38.540523593Z caller=intrumentation.go:66 component=receive msg="changing probe status" status=not-healthy reason="opening storage: invalid magic number 0"
level=error ts=2020-06-15T13:34:38.540633727Z caller=main.go:211 err="invalid magic number 0\nopening storage\nmain.runReceive.func1\n\t/go/src/github.com/thanos-io/thanos/cmd/thanos/receive.go:316\ngithub.com/oklog/run.(*Group).Run.func1\n\t/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:38\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1373\nreceive command failed\nmain.main\n\t/go/src/github.com/thanos-io/thanos/cmd/thanos/main.go:211\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:203\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1373"
Repro:
- Deploy receive master-2020-05-25-c733564d (TSDB cd73b3d, without the mmap chunks feature).
- Upgrade and deploy receive to master-2020-06-03-20004510, which upgrades TSDB to 3268eac (mainly adds the mmap chunks feature + fixes).
- Revert to Thanos master-2020-05-25-c733564d (so back to TSDB with no mmap chunks).
- Upgrade and deploy to master-2020-05-28-e7d431d3 (TSDB f4dd456 with the initial mmap feature).
- See crash on startup.
I think we hit either a compatibility gap between these TSDB versions or some kind of partial-write race.
Also, we might want better error wrapping in TSDB so the error says which file it actually relates to.
cc @codesome