[post-migration] add/ resolve failing tests in bigframes #16489
Description
During migration of the bigframes split repo, there were issues with the mypy presubmits and the Kokoro System Test presubmits failing.
Resolve those issues, ensure the tests pass, and remove the release-blocked tag from the .librarian/config.yaml file for the bigframes package.
Here is a summary of steps that should resolve the Kokoro System Test presubmits:
Part 1: Tasks that the BigFrames team will need to do with a note explaining the ask.
Subject: IAM permissions request for BigFrames CI/CD test service accounts
Hi team,
We are currently blocked by a few cross-project IAM permission errors in the google-cloud-python monorepo CI/CD. The tests run from our project (precise-truck-742), but require read/encryption access to resources in your bigframes-dev-* projects.
Could you please run the following gcloud commands to grant our test service accounts the necessary access?
1. Cloud KMS Encryption Access (bigframes-dev-perf)
- Why: Our BigQuery service agent needs permission to use your CMEK key to test encrypted queries and models.
- Command:
gcloud kms keys add-iam-policy-binding bigframesKey \
--keyring bigframesKeyRing \
--location us \
--project bigframes-dev-perf \
--member="serviceAccount:bq-1065521786570@bigquery-encryption.iam.gserviceaccount.com" \
--role="roles/cloudkms.cryptoKeyEncrypterDecrypter"
2. GCS Bucket Read Access (bigframes-dev-testing)
- Why: Our dynamic BigQuery Connection service accounts need to list and read images from gs://bigframes-dev-testing/a_multimodel/images/* to test Vertex AI multi-model generation.
- Commands:
gcloud storage buckets add-iam-policy-binding gs://bigframes-dev-testing \
--member="serviceAccount:bqcx-1065521786570-ljkx@gcp-sa-bigquery-condel.iam.gserviceaccount.com" \
--role="roles/storage.objectViewer"
gcloud storage buckets add-iam-policy-binding gs://bigframes-dev-testing \
--member="serviceAccount:bqcx-1065521786570-qgxu@gcp-sa-bigquery-condel.iam.gserviceaccount.com" \
--role="roles/storage.objectViewer"
3. BigQuery Data Viewer Access (bigframes-dev)
- Why: Our Kokoro test runner needs permission to read the static test fixture tables (like base_table, csv_native_table, etc.) in your bigframes_tests_sys dataset.
- Command:
gcloud projects add-iam-policy-binding bigframes-dev \
--member="serviceAccount:kokoro@precise-truck-742.iam.gserviceaccount.com" \
--role="roles/bigquery.dataViewer"
Thank you!
Part 2: Tasks the Cloud Python SDK Team needs to do (via PR)
Because precise-truck-742 is our project, we can fix the remaining errors by updating our codebase and our iam_policy.yaml.
- 1. Fix Cloud Run & Vertex AI Permissions (Update iam_policy.yaml)
The bqcx- connection service accounts mentioned above also need permission to invoke Cloud Run (for remote functions) and call Vertex AI (for LLM generation) inside our own project.
- Add the following two service accounts to our iam_policy.yaml and grant them roles/run.invoker and roles/aiplatform.user:
  - bqcx-1065521786570-ljkx@gcp-sa-bigquery-condel.iam.gserviceaccount.com
  - bqcx-1065521786570-qgxu@gcp-sa-bigquery-condel.iam.gserviceaccount.com
- 2. Fix the ARIMA Model Shape Assertion (Code update)
- File: tests/system/small/ml/test_forecasting.py
- Fix: The BQML ARIMA_PLUS evaluation function recently added a new metric column, changing the output shape from (1, 5) to (1, 6). Update the expected pandas DataFrame in this test to include the new column so it matches the new BQML output.
- 3. Fix PCA and K-Means Flakiness (Code update)
- Files: tests/system/small/ml/test_cluster.py & test_decomposition.py
- Fix: Machine learning model outputs (like K-Means centroids and PCA components) fluctuate slightly due to backend solver updates. Update the assert_pandas_df_equal_pca_components utility to either ignore row order inside the categorical_value lists, or slightly increase the rtol (relative tolerance) for numerical drift.
- 4. Fix the JSON GCS Race Condition (Code update)
- File: tests/system/small/test_session.py::test_read_json_gcs_default_engine
- Fix: The test writes a file to GCS and immediately tries to read it back using a wildcard (*). GCS wildcards rely on the "list" API, which is eventually consistent and throws a FileNotFoundError if queried too fast. Add a brief time.sleep(2) or a retry loop between writing the file and reading it back.
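The retry-loop option for the JSON GCS race condition could look roughly like the sketch below. The helper name read_with_retry, its parameters, and the injectable sleep argument are hypothetical and not part of the bigframes test suite; read_fn stands in for whatever callable performs the wildcard read in the test.

```python
import time


def read_with_retry(read_fn, attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Call read_fn until it stops raising FileNotFoundError.

    Hypothetical helper: read_fn is any zero-argument callable that performs
    the wildcard read. The last FileNotFoundError is re-raised once the
    attempts are exhausted.
    """
    for attempt in range(1, attempts + 1):
        try:
            return read_fn()
        except FileNotFoundError:
            if attempt == attempts:
                raise
            # Wait before listing the bucket again.
            sleep(delay_seconds)
```

Compared with a fixed time.sleep(2), a loop like this returns as soon as the file becomes visible and still fails loudly if it never appears.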
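For the PCA and K-Means flakiness in item 3, the two proposed relaxations (ignoring order inside categorical_value and loosening rtol) can be illustrated in plain Python. This is only a sketch of the comparison logic, not the actual assert_pandas_df_equal_pca_components utility. The row layout (a dict with feature, numerical_value, and a categorical_value list of category/value entries) and all function names are assumptions made for illustration.

```python
import math


def values_close(actual, expected, rtol, atol=1e-9):
    """Compare two optional floats using a relative tolerance."""
    if actual is None or expected is None:
        return actual is expected
    return math.isclose(actual, expected, rel_tol=rtol, abs_tol=atol)


def sorted_categories(row):
    """Return the row's categorical_value entries sorted by category name."""
    entries = row.get("categorical_value") or []
    return sorted(entries, key=lambda entry: entry["category"])


def component_rows_close(actual, expected, rtol=0.1):
    """Hypothetical check: order-insensitive and tolerant of small drift."""
    if actual["feature"] != expected["feature"]:
        return False
    if not values_close(
        actual.get("numerical_value"), expected.get("numerical_value"), rtol
    ):
        return False
    actual_cats = sorted_categories(actual)
    expected_cats = sorted_categories(expected)
    if len(actual_cats) != len(expected_cats):
        return False
    return all(
        a["category"] == e["category"] and values_close(a["value"], e["value"], rtol)
        for a, e in zip(actual_cats, expected_cats)
    )
```

In a pandas-based utility, the tolerance part could instead be passed through the rtol argument of pandas.testing.assert_frame_equal, with the categorical_value lists sorted beforehand; whether that fits the existing helper is not confirmed here.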