This repository was archived by the owner on Mar 6, 2026. It is now read-only.
combination of freezegun and parallel tests causes unit tests to flake #2264
Closed
Labels
api: bigquery (Issues related to the googleapis/python-bigquery API.)
type: process (A process-related concern. May include testing, release, or the like.)
Description
Problem
Our retry-related unit tests sometimes flake in CI. See: https://github.com/googleapis/python-bigquery/actions/runs/17044794898/job/48317652621?pr=2250
FAILED tests/unit/test_client.py::TestClient::test__call_api_applying_custom_retry_on_timeout - google.api_core.exceptions.RetryError: Timeout of 1.0s exceeded, last exception:
__________ TestClient.test__call_api_applying_custom_retry_on_timeout __________
[gw1] linux -- Python 3.11.13 /home/runner/work/python-bigquery/python-bigquery/.nox/unit-3-11/bin/python
target = functools.partial(functools.partial(<MagicMock name='api_request' id='140254749447632'>, foo='bar'))
predicate = <function TestClient.test__call_api_applying_custom_retry_on_timeout.<locals>.<lambda> at 0x7f8f9c037060>
sleep_generator = <generator object exponential_sleep_generator at 0x7f8f9bd9af20>
timeout = 1, on_error = None
exception_factory = <function build_retry_error at 0x7f8fbb85d300>, kwargs = {}
deadline = 471.796770038, error_list = [TimeoutError()]
sleep_iter = <generator object exponential_sleep_generator at 0x7f8f9bd9af20>
def retry_target(
target: Callable[[], _R],
predicate: Callable[[Exception], bool],
sleep_generator: Iterable[float],
timeout: float | None = None,
on_error: Callable[[Exception], None] | None = None,
exception_factory: Callable[
[list[Exception], RetryFailureReason, float | None],
tuple[Exception, Exception | None],
] = build_retry_error,
**kwargs,
):
"""Call a function and retry if it fails.
This is the lowest-level retry helper. Generally, you'll use the
higher-level retry helper :class:`Retry`.
Args:
target(Callable): The function to call and retry. This must be a
nullary function - apply arguments with `functools.partial`.
predicate (Callable[Exception]): A callable used to determine if an
exception raised by the target should be considered retryable.
It should return True to retry or False otherwise.
sleep_generator (Iterable[float]): An infinite iterator that determines
how long to sleep between retries.
timeout (Optional[float]): How long to keep retrying the target.
Note: timeout is only checked before initiating a retry, so the target may
run past the timeout value as long as it is healthy.
on_error (Optional[Callable[Exception]]): If given, the on_error
callback will be called with each retryable exception raised by the
target. Any error raised by this function will *not* be caught.
exception_factory: A function that is called when the retryable reaches
a terminal failure state, used to construct an exception to be raised.
It takes a list of all exceptions encountered, a retry.RetryFailureReason
enum indicating the failure cause, and the original timeout value
as arguments. It should return a tuple of the exception to be raised,
along with the cause exception if any. The default implementation will raise
a RetryError on timeout, or the last exception encountered otherwise.
deadline (float): DEPRECATED: use ``timeout`` instead. For backward
compatibility, if specified it will override ``timeout`` parameter.
Returns:
Any: the return value of the target function.
Raises:
ValueError: If the sleep generator stops yielding values.
Exception: a custom exception specified by the exception_factory if provided.
If no exception_factory is provided:
google.api_core.RetryError: If the timeout is exceeded while retrying.
Exception: If the target raises an error that isn't retryable.
"""
timeout = kwargs.get("deadline", timeout)
deadline = time.monotonic() + timeout if timeout is not None else None
error_list: list[Exception] = []
sleep_iter = iter(sleep_generator)
# continue trying until an attempt completes, or a terminal exception is raised in _retry_error_helper
# TODO: support max_attempts argument: https://github.com/googleapis/python-api-core/issues/535
while True:
try:
> result = target()
^^^^^^^^
.nox/unit-3-11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py:147:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/opt/hostedtoolcache/Python/3.11.13/x64/lib/python3.11/unittest/mock.py:1124: in __call__
return self._mock_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/opt/hostedtoolcache/Python/3.11.13/x64/lib/python3.11/unittest/mock.py:1128: in _mock_call
return self._execute_mock_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <MagicMock name='api_request' id='140254749447632'>, args = ()
kwargs = {'foo': 'bar'}, effect = <list_iterator object at 0x7f8f9a845750>
result = <class 'TimeoutError'>
def _execute_mock_call(self, /, *args, **kwargs):
# separate from _increment_mock_call so that awaited functions are
# executed separately from their call, also AsyncMock overrides this method
effect = self.side_effect
if effect is not None:
if _is_exception(effect):
raise effect
elif not _callable(effect):
result = next(effect)
if _is_exception(result):
> raise result
E TimeoutError
/opt/hostedtoolcache/Python/3.11.13/x64/lib/python3.11/unittest/mock.py:1187: TimeoutError
The above exception was the direct cause of the following exception:
self = <tests.unit.test_client.TestClient testMethod=test__call_api_applying_custom_retry_on_timeout>
def test__call_api_applying_custom_retry_on_timeout(self):
from concurrent.futures import TimeoutError
from google.cloud.bigquery.retry import DEFAULT_RETRY
creds = _make_credentials()
client = self._make_one(project=self.PROJECT, credentials=creds)
api_request_patcher = mock.patch.object(
client._connection,
"api_request",
side_effect=[TimeoutError, "result"],
)
retry = DEFAULT_RETRY.with_deadline(1).with_predicate(
lambda exc: isinstance(exc, TimeoutError)
)
with api_request_patcher as fake_api_request:
> result = client._call_api(retry, foo="bar")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests/unit/test_client.py:333:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
google/cloud/bigquery/client.py:863: in _call_api
return call()
^^^^^^
.nox/unit-3-11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py:294: in retry_wrapped_func
return retry_target(
.nox/unit-3-11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py:156: in retry_target
next_sleep = _retry_error_helper(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
exc = TimeoutError(), deadline = 471.796770038
sleep_iterator = <generator object exponential_sleep_generator at 0x7f8f9bd9af20>
error_list = [TimeoutError()]
predicate_fn = <function TestClient.test__call_api_applying_custom_retry_on_timeout.<locals>.<lambda> at 0x7f8f9c037060>
on_error_fn = None
exc_factory_fn = <function build_retry_error at 0x7f8fbb85d300>
original_timeout = 1
def _retry_error_helper(
exc: Exception,
deadline: float | None,
sleep_iterator: Iterator[float],
error_list: list[Exception],
predicate_fn: Callable[[Exception], bool],
on_error_fn: Callable[[Exception], None] | None,
exc_factory_fn: Callable[
[list[Exception], RetryFailureReason, float | None],
tuple[Exception, Exception | None],
],
original_timeout: float | None,
) -> float:
"""
Shared logic for handling an error for all retry implementations
- Raises an error on timeout or non-retryable error
- Calls on_error_fn if provided
- Logs the error
Args:
- exc: the exception that was raised
- deadline: the deadline for the retry, calculated as a diff from time.monotonic()
- sleep_iterator: iterator to draw the next backoff value from
- error_list: the list of exceptions that have been raised so far
- predicate_fn: takes `exc` and returns true if the operation should be retried
- on_error_fn: callback to execute when a retryable error occurs
- exc_factory_fn: callback used to build the exception to be raised on terminal failure
- original_timeout_val: the original timeout value for the retry (in seconds),
to be passed to the exception factory for building an error message
Returns:
- the sleep value chosen before the next attempt
"""
error_list.append(exc)
if not predicate_fn(exc):
final_exc, source_exc = exc_factory_fn(
error_list,
RetryFailureReason.NON_RETRYABLE_ERROR,
original_timeout,
)
raise final_exc from source_exc
if on_error_fn is not None:
on_error_fn(exc)
# next_sleep is fetched after the on_error callback, to allow clients
# to update sleep_iterator values dynamically in response to errors
try:
next_sleep = next(sleep_iterator)
except StopIteration:
raise ValueError("Sleep generator stopped yielding sleep values.") from exc
if deadline is not None and time.monotonic() + next_sleep > deadline:
final_exc, source_exc = exc_factory_fn(
error_list,
RetryFailureReason.TIMEOUT,
original_timeout,
)
> raise final_exc from source_exc
E google.api_core.exceptions.RetryError: Timeout of 1.0s exceeded, last exception:
.nox/unit-3-11/lib/python3.11/site-packages/google/api_core/retry/retry_base.py:229: RetryError
Background
We run unit tests in parallel:
noxfile.py, line 131 at commit 0a95b24:
    "-n=8",
We use freezegun to mock the clock, especially in retry tests:
https://github.com/search?q=repo%3Agoogleapis%2Fpython-bigquery%20freezegun&type=code
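The flake mechanism can be sketched without either library. As the traceback above shows, api_core's retry_target computes deadline = time.monotonic() + timeout, and _retry_error_helper gives up once time.monotonic() + next_sleep > deadline. If some other time-mocking in the same process skews the monotonic clock, the deadline arithmetic goes wrong even though the retry itself is healthy. The helper below is a simplified stand-in, not api_core's real implementation (retry_with_deadline and flaky_target are illustrative names); the second run simulates a clock that has jumped far ahead, which produces the same "Timeout of 1.0s exceeded" failure after a single attempt:

```python
import itertools

def retry_with_deadline(target, clock, timeout=1.0, sleeps=(0.1, 0.2, 0.4)):
    # Simplified sketch of api_core's deadline logic: the deadline is
    # derived from the (possibly mocked) monotonic clock up front, and
    # re-checked against the same clock before every retry.
    deadline = clock() + timeout
    errors = []
    for next_sleep in itertools.chain(sleeps, itertools.repeat(0.4)):
        try:
            return target()
        except TimeoutError as exc:
            errors.append(exc)
        if clock() + next_sleep > deadline:
            raise RuntimeError(
                f"Timeout of {timeout}s exceeded after {len(errors)} attempt(s)"
            )
        # the real code would time.sleep(next_sleep) here

calls = iter([TimeoutError(), "result"])
def flaky_target():
    value = next(calls)
    if isinstance(value, Exception):
        raise value
    return value

# Well-behaved clock: the second attempt succeeds well inside the deadline.
ticks = iter([100.0, 100.1])
assert retry_with_deadline(flaky_target, lambda: next(ticks)) == "result"

# Skewed clock (as if another test's time-mocking moved it): the deadline
# check fires after the very first attempt, mirroring the CI failure.
calls = iter([TimeoutError(), "result"])
ticks = iter([100.0, 471.8])
timed_out = False
try:
    retry_with_deadline(flaky_target, lambda: next(ticks))
except RuntimeError:
    timed_out = True
assert timed_out
```

The point is that the retry code never misbehaves on its own; it only needs the process-wide monotonic clock to move unexpectedly between two reads.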
Proposed solution
Based on https://betterstack.com/community/guides/testing/time-machine-vs-freezegun/, time-machine may be a better option.
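As a hedged sketch of what the swap might look like: time_machine.travel is time-machine's documented entry point and works as a context manager or decorator, but this has not been validated against our test suite, and the import is guarded because time-machine may not be installed in every environment.

```python
import datetime

try:
    import time_machine  # third-party; may not be installed
    HAVE_TIME_MACHINE = True
except ImportError:
    HAVE_TIME_MACHINE = False

if HAVE_TIME_MACHINE:
    # travel() replaces the clock for the duration of the block only,
    # analogous to freezegun.freeze_time in the existing tests.
    dest = datetime.datetime(2030, 1, 1, tzinfo=datetime.timezone.utc)
    with time_machine.travel(dest):
        frozen_now = datetime.datetime.now(datetime.timezone.utc)
    assert frozen_now.year == 2030
```

Whether time-machine actually avoids the parallel-test interaction would need to be confirmed experimentally before migrating the retry tests.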
Alternative
Wherever we use freezegun, protect those tests with a lock. See: spulec/freezegun#503 (comment)