Avoid raising FlakyReplay to users when multiple bugs are found #4228

@carterbox

I'm trying to track down the cause of a flaky test. However, sometimes the test fails with FlakyFailure and other times with FlakyReplay. These are both errors, but no reproduce_failure hash is provided when the error is FlakyReplay.
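For context, the kind of non-determinism involved can be sketched without Hypothesis. This is a hypothetical example, not my actual test: `check` depends on a module-level call counter, so calling it twice with the same argument gives different results. `outcome` and `is_flaky` are illustrative helpers, not Hypothesis APIs.

```python
import itertools

# External state that the function under test reads: a shared call counter.
_calls = itertools.count()


def check(x):
    """Fails only on even-numbered calls, regardless of the input."""
    if next(_calls) % 2 == 0:
        raise ValueError(f"failed on {x!r}")


def outcome(func, arg):
    """Return the exception type raised by func(arg), or None if it passed."""
    try:
        func(arg)
    except Exception as exc:
        return type(exc)
    return None


def is_flaky(func, arg):
    """Run func twice on the same input and report whether the outcomes differ."""
    return outcome(func, arg) is not outcome(func, arg)
```

Here `is_flaky(check, 1)` reports a mismatch because the first call raises and the second passes, while a deterministic function such as `abs` does not.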

What is the difference between a FlakyReplay and a FlakyFailure?

class Flaky(_Trimmable):
    """Base class for indeterministic failures. Usually one of the more
    specific subclasses (FlakyFailure or FlakyStrategyDefinition) is raised."""


class FlakyReplay(Flaky):
    """Internal error raised by the conjecture engine if flaky failures are
    detected during replay.

    Carries information allowing the runner to reconstruct the flakiness as
    a FlakyFailure exception group for final presentation.
    """

    def __init__(self, reason, interesting_origins=None):
        super().__init__(reason)
        self.reason = reason
        self._interesting_origins = interesting_origins

class FlakyFailure(ExceptionGroup, Flaky):
    """This function appears to fail non-deterministically: We have seen it
    fail when passed this example at least once, but a subsequent invocation
    did not fail, or caused a distinct error.

    Common causes for this problem are:
        1. The function depends on external state. e.g. it uses an external
           random number generator. Try to make a version that passes all the
           relevant state in from Hypothesis.
        2. The function is suffering from too much recursion and its failure
           depends sensitively on where it's been called from.
        3. The function is timing sensitive and can fail or pass depending on
           how long it takes. Try breaking it up into smaller functions which
           don't do that and testing those instead.
    """

    def __new__(cls, msg, group):
        # The Exception mixin forces this an ExceptionGroup (only accepting
        # Exceptions, not BaseException). Usually BaseException is raised
        # directly and will hence not be part of a FlakyFailure, but I'm not
        # sure this assumption holds everywhere. So wrap any BaseExceptions.
        group = list(group)
        for i, exc in enumerate(group):
            if not isinstance(exc, Exception):
                err = _WrappedBaseException()
                err.__cause__ = err.__context__ = exc
                group[i] = err
        return ExceptionGroup.__new__(cls, msg, group)

Above is the relevant source code from Hypothesis. My understanding is that FlakyReplay is "internal", meaning it should not be shown to the user. If so, is something wrong when I see it in my logs as the final error? In my case, the error is raised when the status goes from INTERESTING to INTERESTING. (Is there documentation for the meaning of this status? I couldn't find any mention in the source.)
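To make the question concrete, here is a minimal sketch of the hand-off that the FlakyReplay docstring seems to describe: an engine raises an internal exception, and a runner catches it and re-raises a user-facing error. Everything here (`InternalReplayError`, `UserFacingFlaky`, `engine_replay`, `run_test`) is a hypothetical stand-in written for illustration, not Hypothesis's actual code, and it uses a plain exception holding a list rather than a real ExceptionGroup.

```python
class InternalReplayError(Exception):
    """Hypothetical stand-in for an internal error such as FlakyReplay."""

    def __init__(self, reason, interesting_origins=None):
        super().__init__(reason)
        self.reason = reason
        self.interesting_origins = interesting_origins or []


class UserFacingFlaky(Exception):
    """Hypothetical stand-in for the final error shown to the user."""

    def __init__(self, msg, exceptions):
        super().__init__(msg)
        self.exceptions = list(exceptions)


def engine_replay(first_outcome, replay_outcome, origin):
    # Assumption: the engine compares the outcome of the original run with
    # the outcome of replaying the same example, and raises on a mismatch.
    if first_outcome != replay_outcome:
        raise InternalReplayError(
            f"outcome changed from {first_outcome} to {replay_outcome}",
            interesting_origins=[origin],
        )
    return replay_outcome


def run_test(first_outcome, replay_outcome, origin):
    # The runner is the layer that translates the internal error into the
    # error the user is meant to see, keeping the original as __cause__.
    try:
        return engine_replay(first_outcome, replay_outcome, origin)
    except InternalReplayError as err:
        causes = [RuntimeError(f"failure at {o}") for o in err.interesting_origins]
        raise UserFacingFlaky(err.reason, causes) from err
```

If a translation step like the one in `run_test` is skipped on some code path, the internal error would escape to the user unchanged, which is what I suspect I am seeing.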

Why is reproduce_failure provided for one and not the other?

Is this because FlakyReplay is not intended to be raised to the user?

Labels: bug (something is clearly wrong here), legibility (make errors helpful and Hypothesis grokable)
