Description
Checklist
- I have searched the existing issues for similar issues.
- I added a very descriptive title to this issue.
- I have provided sufficient information below to help reproduce this issue.
Summary
I'm trying to use the Google Drive API, whose client is generated at runtime and is rather awkward to work with. When I use st.cache_resource with the client, I get a variety of SSL errors (e.g. DECRYPTION_FAILED_OR_BAD_RECORD_MAC, BLOCK_CIPHER_PAD_IS_WRONG, BAD_RECORD_TYPE) whenever an input change triggers a page rerender or when multiple tabs are open. I think the client keeps a connection (or something similar) open that is not thread- or multiprocess-safe, similar to what is described here or here. Given this lack of thread safety, st.cache_resource is not the right fit.
I think this same connection (or whatever it is) also prevents the client from being pickled (TypeError: cannot pickle '_cffi_backend.FFI' object), which means I can't use it with st.cache_data either.
What I'd like is something between the two: a way to cache an object once per thread or process (or per "session"?). The Streamlit caching docs highlight mutation and concurrency issues, but don't really give any guidance for thread-unsafe objects.
I would have expected this to be a more common issue (perhaps things like the connection pooling built into SQLAlchemy's engine cover a lot of ground), but I couldn't find any similar issues.
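To make the "once per thread" idea concrete, here is a minimal sketch using only the standard library. `per_thread_cache` and `get_client` are hypothetical names for illustration, not Streamlit APIs, and the stand-in factory returns a plain object rather than a real Drive client. How Streamlit assigns reruns to threads is not covered here, so a thread-local cache might rebuild the object more often than expected.

```python
import functools
import threading


def per_thread_cache(factory):
    """Hypothetical helper: build the object once per thread and reuse it there."""
    local = threading.local()  # each thread sees its own attributes on this object

    @functools.wraps(factory)
    def wrapper():
        # Create the value lazily the first time the current thread asks for it.
        if not hasattr(local, "value"):
            local.value = factory()
        return local.value

    return wrapper


@per_thread_cache
def get_client():
    # Stand-in for an expensive, thread-unsafe client (assumption for the sketch).
    return object()
```

Within one thread, repeated calls to `get_client()` return the same object; a different thread gets its own instance, so no connection state is shared across threads.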
Reproducible Code Example
import streamlit as st
from google.oauth2 import service_account
from googleapiclient.discovery import Resource, build


def get_google_drive_service() -> Resource:
    # Scopes requested for the service account
    scopes = [
        "https://www.googleapis.com/auth/drive",
        "https://www.googleapis.com/auth/spreadsheets",
        "https://www.googleapis.com/auth/forms",
    ]
    parent_creds = service_account.Credentials.from_service_account_info(
        {"service account info": "private"}, scopes=scopes
    )
    return build(
        "drive",
        "v3",
        credentials=parent_creds.with_subject("some.email@example.com"),
        cache_discovery=False,
    )


# Pass the function itself to st.cache_resource, then call the cached wrapper.
gdrive = st.cache_resource(get_google_drive_service)()

Steps To Reproduce
Get some credentials to create a client as described here and fill them into the snippet above. Then open a couple of different tabs (perhaps add some inputs to trigger rerendering; I can update the example if there's no obvious solution and this repro is actually worth looking into).
Expected Behavior
An argument to st.cache_resource, or an alternative function, that supports thread-unsafe objects.
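As one possible shape for such an alternative, the sketch below caches an object per session by storing it in a dict-like state object. `get_or_create` is a hypothetical helper, not an existing Streamlit function. The assumption is that `st.session_state` could be passed as `state`, since it supports `in` checks and item assignment; the example itself uses plain dicts as stand-ins for two sessions.

```python
from typing import Callable, MutableMapping, TypeVar

T = TypeVar("T")


def get_or_create(state: MutableMapping, key: str, factory: Callable[[], T]) -> T:
    """Hypothetical helper: return state[key], building it with factory() on first use."""
    if key not in state:
        state[key] = factory()
    return state[key]


# Two plain dicts stand in for two independent sessions (assumption for the sketch).
session_a: dict = {}
session_b: dict = {}

client_a = get_or_create(session_a, "gdrive", object)
client_b = get_or_create(session_b, "gdrive", object)
```

In a Streamlit script this might be called as `get_or_create(st.session_state, "gdrive", get_google_drive_service)`, so each browser tab would build and keep its own client instead of sharing one across sessions.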
Current Behavior
No response
Is this a regression?
- Yes, this used to work in a previous version.
Debug info
- Streamlit version: 1.20.0
- Python version: 3.10.9
- Operating System: macOS 13.3.1 (arm64)
- Browser: Chrome
- Virtual environment:
Additional Information
No response
Are you willing to submit a PR?
- Yes, I am willing to submit a PR!