Skip to content

Refactor API data-access pattern to only load what is necessary; use prepared statements#12

Merged
ohaibbq merged 6 commits into
candidate/rb20240328from
dan/api-performance
Apr 17, 2024
Merged

Refactor API data-access pattern to only load what is necessary; use prepared statements#12
ohaibbq merged 6 commits into
candidate/rb20240328from
dan/api-performance

Conversation

@ohaibbq

@ohaibbq ohaibbq commented Apr 11, 2024

Copy link
Copy Markdown

This PR greatly improves the performance of the emulator API endpoints.

We no longer load the entire BigQuery project (jobs, datasets, tables, etc) on each request. This was slow for a number of reasons as outlined in goccy/bigquery-emulator#294

We also now utilize unformatted SQLite queries. Previously, go-zetasqlite would rewrite the metadata repository queries to use functions like zetasqlite_equals instead of operators like =, which meant that SQLite would need to do full table scans when it could be doing more performant query plans.

Many endpoints now take tens of microseconds to return instead of 100+ms. Table creation takes roughly ~10ms whereas before we were seeing upwards of 300ms.

POST /bigquery/v2/projects/recidiviz-bq-emulator-project/jobs   {"query": "prettyPrint=false"}
POST /bigquery/v2/projects/recidiviz-bq-emulator-project/jobs took 16.821342ms
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views       {"query": "prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views took 287.595µs
POST /bigquery/v2/projects/recidiviz-bq-emulator-project/jobs   {"query": "prettyPrint=false"}
POST /bigquery/v2/projects/recidiviz-bq-emulator-project/jobs took 19.376201ms
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/queries/c0150a30-f7c6-4bbf-b9e3-a00dd494f14d    {"query": "maxResults=0&location=US&prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/queries/c0150a30-f7c6-4bbf-b9e3-a00dd494f14d took 791.9µs
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/queries/d1e45499-8eb2-4417-bf5d-af0c1266503a    {"query": "maxResults=0&location=US&prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/queries/d1e45499-8eb2-4417-bf5d-af0c1266503a took 645.539µs
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables/outliers_staff_count_percent_change_materialized       {"query": "prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables/outliers_staff_count_percent_change_materialized took 1.660463ms
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables/current_supervision_staff_missing_district_materialized        {"query": "prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables/current_supervision_staff_missing_district_materialized took 1.354469ms
POST /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables       {"query": "prettyPrint=false"}
POST /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables took 13.399834ms
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views       {"query": "prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views took 298.365µs
POST /bigquery/v2/projects/recidiviz-bq-emulator-project/jobs   {"query": "prettyPrint=false"}
POST /bigquery/v2/projects/recidiviz-bq-emulator-project/jobs took 16.414274ms
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/queries/55f3fc1a-05c2-41a5-88dd-0f9f592f4a1c    {"query": "maxResults=0&location=US&prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/queries/55f3fc1a-05c2-41a5-88dd-0f9f592f4a1c took 458.882µs
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables/outliers_staff_count_percent_change_errors_materialized        {"query": "prettyPrint=false"}
GET /bigquery/v2/projects/recidiviz-bq-emulator-project/datasets/validation_views/tables/outliers_staff_count_percent_change_errors_materialized took 1.560456ms

We are now able to do a run of recidiviz.tools.deploy.deploy_empty_test_views creating & materializing 1,400 views
against the emulator in ~35 seconds.

@ohaibbq ohaibbq force-pushed the dan/api-performance branch from 862fc36 to 5ba6165 Compare April 12, 2024 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants