Setting a timeout for BigQuery async_query doesn't work #4135

@jasonqng

The documentation and code appear to allow setting a maximum duration before an async query times out, by passing timeout (an int for the number of milliseconds) to the result() function called on a query job (https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/bigquery/google/cloud/bigquery/job.py#L476 and https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/bigquery/google/cloud/bigquery/job.py#L1311). Similarly, there appear to be references to the fetch_data() function of a QueryResults object accepting a timeout parameter (https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/bigquery/google/cloud/bigquery/query.py#L390).

However, passing a timeout value of 1 or 0 in either fashion with an async query does nothing to interrupt the query. Queries complete regardless and return results:

from google.cloud import bigquery
import uuid
client = bigquery.Client(project=project)
query_job = client.run_async_query(str(uuid.uuid4()),
                                   "select 'hello' as a, 32423432 as b")
query_job.begin()
query_results = query_job.result(timeout=1)
data = query_results.fetch_data(timeout_ms=1)
rows = list(data)
print(rows)

Output:

[(u'hello', 32423432)]
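For comparison, the behavior one would expect from a timeout on a future-style result() call can be sketched with the standard library's concurrent.futures. This is not BigQuery code: slow_query and run_with_timeout are hypothetical names used only for illustration, and the sketch makes no claim about how the BigQuery client implements result() internally.

```python
import concurrent.futures
import time


def slow_query():
    # Hypothetical stand-in for a long-running query; not a BigQuery call.
    time.sleep(0.5)
    return [("hello", 32423432)]


def run_with_timeout(timeout_s):
    """Return the rows, or None if they are not ready within timeout_s seconds."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(slow_query)
    try:
        # Future.result() raises TimeoutError if the work is not finished in time.
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return None
    finally:
        # Do not block here; the worker thread finishes on its own.
        executor.shutdown(wait=False)
```

With a timeout shorter than the simulated query, the call gives up and returns None instead of waiting for the rows, which is the kind of interruption the snippet above does not produce.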

By contrast, setting a timeout for a sync query via the timeout_ms property of a QueryJob works as expected and reports jobComplete = False:

from google.cloud import bigquery
client = bigquery.Client(project=project)
query_job = client.run_sync_query("select 'hello' as a, 32423432 as b")
query_job.timeout_ms = 1
query_job.run()
print(query_job._properties.get("jobComplete"))
rows = list(query_job.rows)
print(rows)

Output:

False
[]

It appears that timeout for async queries is still to be implemented, and it is unclear how to detect an incomplete job (the equivalent of jobComplete = False) via an async query. This forces users to keep using sync queries if they want the ability to time out their queries.
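Until that is resolved, a client-side deadline is one possible workaround. The following is a minimal sketch, not the library's own mechanism: wait_for_job and QueryTimeout are hypothetical names, and the helper assumes the job object exposes reload(), cancel() and a state attribute that becomes "DONE" on completion.

```python
import time


class QueryTimeout(Exception):
    """Raised when the job is not DONE before the client-side deadline."""


def wait_for_job(job, timeout_s, poll_interval_s=1.0):
    """Poll `job` until it is DONE or `timeout_s` seconds have passed.

    Assumption: `job` has reload(), cancel() and a `state` attribute,
    as an async query job appears to; this helper is illustrative only.
    """
    deadline = time.time() + timeout_s
    while True:
        job.reload()  # refresh the job's state
        if job.state == "DONE":
            return job
        if time.time() >= deadline:
            # Request cancellation; the job may still finish server-side.
            job.cancel()
            raise QueryTimeout("job not done after %s seconds" % timeout_s)
        time.sleep(poll_interval_s)
```

Under these assumptions, calling wait_for_job(query_job, timeout_s=30) after query_job.begin() would either return the finished job or raise QueryTimeout. The timeout here is in seconds and is enforced by the caller, so it bounds how long the client waits rather than how long the query runs.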

cc: @tswast who requested I raise the issue which I identified in this pandas-gbq PR: googleapis/python-bigquery-pandas#25 (comment)

OSX
Python 2.7.13
google-cloud-python 0.27.0 and google-cloud-bigquery 0.26.0

Labels: api: bigquery, type: bug