I have a Python function that takes a Cloud Firestore collection name as an argument and streams through every document in that collection to check for errors.
In simplified form, it looks something like this:
```python
from firebase_admin import firestore

def find_all_issues(collection: str) -> None:
    fs_client = firestore.client()
    coll_ref = fs_client.collection(collection)
    doc_snapshot_stream = coll_ref.stream()
    for doc_snap in doc_snapshot_stream:
        d = doc_snap.to_dict()
        # perform error checks on d
```
The key point here is that I create a stream for the collection and use that to read and process every document in the collection.
The code always works perfectly on all but one of my collections.
Unfortunately, on the biggest collection, I intermittently get an error like this:
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Query timed out. Please try either limiting the entities scanned, or run with an updated index configuration."
debug_error_string = "UNKNOWN:Error received from peer ipv4:172.217.15.202:443 {created_time:"2024-06-10T23:10:26.9339739+00:00", grpc_status:14, grpc_message:"Query timed out. Please try either limiting the entities scanned, or run with an updated index configuration."}"
>
This is super frustrating. I can't believe that Firestore cannot create and maintain a rock-solid database stream!
The error message offers this very useful sounding advice:
Please try either limiting the entities scanned, or run with an updated index configuration.
The problem is that I have no idea how to act on either suggestion:

- Limiting the entities scanned: how can I possibly do that? I need to process every document in the collection! Furthermore, I really hope that Google is not expecting Firestore users to somehow manually break up large reads.
- Run with an updated index configuration: I have no idea what index would solve this problem. In other Firestore contexts, I have seen Google's error messages very helpfully give you the exact index-creation command to run. Unfortunately, here Google tells me nothing.
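For what it's worth, the only interpretation of "limiting the entities scanned" I can come up with that still covers every document is to replace the single long-lived stream with a series of small, cursor-paginated queries, each resuming after the last document seen. The sketch below uses the `order_by` / `limit` / `start_after` query methods that do exist in the Firestore Python client; the `stream_in_pages` helper and the page size of 500 are my own invention, not anything Google recommends, and I have no idea whether this actually avoids the timeout.

```python
def stream_in_pages(coll_ref, page_size=500):
    """Yield every document snapshot in coll_ref, one bounded query at a time.

    Instead of one stream over the whole collection, each iteration runs a
    query limited to page_size documents, ordered by document name, and the
    next iteration resumes with a cursor just after the last snapshot seen.
    """
    last_snap = None
    while True:
        query = coll_ref.order_by("__name__")
        if last_snap is not None:
            # Cursor: resume strictly after the last document already yielded.
            query = query.start_after(last_snap)
        page = list(query.limit(page_size).stream())
        if not page:
            return  # no documents left
        yield from page
        last_snap = page[-1]
```

If this works, it would also give a natural retry point: when one bounded query fails with `UNAVAILABLE`, only that page needs to be retried from `last_snap`, rather than restarting the whole collection scan.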
I searched the web before posting here and was frustrated to find very little discussion of this issue. However, I think that this recent post might be related: Google support says there that "backend engineers have investigated and found a potential query planner bug".