Skip to content

No such entry error caused by incorrect metadata #4097

@izumo27

Description

@izumo27

BUG REPORT

Describe the bug

The ensemble bookies of metadata and the actual written bookies are different, resulting in a read failure.


conf
  • ensemble size: 6
  • write quorum size: 2
  • ack quorum size: 2

I encountered the following event.

The write operations failed and ensemble was changed as following:

[A, B, C, D, E, F]

[G, B, C, H, E, F]

[G, I, J, H, E, F]

10:00:11.806 [BookKeeperClientWorker-OrderedExecutor-24-0] WARN  o.a.bookkeeper.client.PendingAddOp   - Failed to write entry (xx, aa): Bookie operation timeout
…
10:00:12.001 [BookKeeperClientWorker-OrderedExecutor-24-0] INFO  o.a.bookkeeper.client.LedgerHandle   - New Ensemble: [G, B, C, H, E, F] for ledger: xx
…
10:00:12.303 [BookKeeperClientWorker-OrderedExecutor-24-0] WARN  o.a.bookkeeper.client.PendingAddOp   - Failed to write entry (xx, bb): Bookie operation timeout
…
10:00:12.527 [BookKeeperClientWorker-OrderedExecutor-24-0] INFO  o.a.bookkeeper.client.LedgerHandle   - New Ensemble: [G, I, J, H, E, F] for ledger: xx

The entries were written to bookie G and H, but not to bookie I and J.

Bookie I and J don't have any entries of ledger xx.

The write and ack quorum is 2, so I got a "No such entry" error.

$ bookkeeper shell readledger -msg -ledgerid xx -firstentryid yy -lastentryid yy


11:25:50.215 [BookKeeperClientWorker-OrderedExecutor-24-0] ERROR o.a.bookkeeper.client.PendingReadOp  - Read of ledger entry failed: Lxx Eyy-Eyy, Sent to [I, J], Heard from [] : bitset = {}, Error = 'No such entry'. First unread entry is (-1, rc = null)
11:25:50.224 [main] ERROR o.a.b.client.BookKeeperAdmin         - Error reading entry yy from ledger xx
org.apache.bookkeeper.client.BKException$BKNoSuchEntryException: No such entry
        at org.apache.bookkeeper.client.SyncCallbackUtils.finish(SyncCallbackUtils.java:83)
        at org.apache.bookkeeper.client.SyncCallbackUtils$SyncReadCallback.readComplete(SyncCallbackUtils.java:231)
        at org.apache.bookkeeper.client.LedgerHandle$6.onFailure(LedgerHandle.java:820)
        at org.apache.bookkeeper.common.concurrent.FutureEventListener.accept(FutureEventListener.java:38)
        at org.apache.bookkeeper.common.concurrent.FutureEventListener.accept(FutureEventListener.java:26)
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
        at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:750)
11:25:50.425 [main] ERROR o.a.bookkeeper.bookie.BookieShell    - Got an exception
com.google.common.util.concurrent.UncheckedExecutionException: org.apache.bookkeeper.client.BKException$BKNoSuchEntryException: No such entry
        at org.apache.bookkeeper.tools.cli.commands.bookie.ReadLedgerCommand.apply(ReadLedgerCommand.java:137)
        at org.apache.bookkeeper.bookie.BookieShell$ReadLedgerEntriesCmd.runCmd(BookieShell.java:626)
        at org.apache.bookkeeper.bookie.BookieShell$MyCommand.runCmd(BookieShell.java:236)
        at org.apache.bookkeeper.bookie.BookieShell.run(BookieShell.java:2235)
        at org.apache.bookkeeper.bookie.BookieShell.main(BookieShell.java:2328)
Caused by: java.lang.RuntimeException: org.apache.bookkeeper.client.BKException$BKNoSuchEntryException: No such entry
        at org.apache.bookkeeper.client.BookKeeperAdmin$LedgerEntriesIterator.hasNext(BookKeeperAdmin.java:435)
        at org.apache.bookkeeper.tools.cli.commands.bookie.ReadLedgerCommand.readledger(ReadLedgerCommand.java:171)
        at org.apache.bookkeeper.tools.cli.commands.bookie.ReadLedgerCommand.apply(ReadLedgerCommand.java:135)
        ... 4 common frames omitted
Caused by: org.apache.bookkeeper.client.BKException$BKNoSuchEntryException: No such entry
        at org.apache.bookkeeper.client.SyncCallbackUtils.finish(SyncCallbackUtils.java:83)
        at org.apache.bookkeeper.client.SyncCallbackUtils$SyncReadCallback.readComplete(SyncCallbackUtils.java:231)
        at org.apache.bookkeeper.client.LedgerHandle$6.onFailure(LedgerHandle.java:820)
        at org.apache.bookkeeper.common.concurrent.FutureEventListener.accept(FutureEventListener.java:38)
        at org.apache.bookkeeper.common.concurrent.FutureEventListener.accept(FutureEventListener.java:26)
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
        at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:750)

The actual data exists in bookie B and C.

# at bookie B
$ bookkeeper shell readlog -msg foo.log -ledgerid xx -entryid yy
Scan entry log bar (foo.log) for LedgerId xx for EntryId yy
--------- Lid=xx, Eid=yy, ByteOffset=, EntrySize= ---------
Type:           DATA
LastConfirmed:
Data:

00000000 …

Expected behavior

The ensemble bookies of metadata and the actual written bookies are same.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions