-
Notifications
You must be signed in to change notification settings - Fork 368
Speed up v3 service plans by removing eager loading of visibilities #2215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up v3 service plans by removing eager loading of visibilities #2215
Conversation
28cd3c2 to
2f85338
Compare
For assessing performance of service related endpoints it is useful to be able to generate large amounts of data about services, plans, instances etc. Add a rake task to do this to replace the slow script that was piped into bin/console as this suffered from being a TTY/slow loading of commands. To speed up inserts, <Class>.new is used instead of <Class>.create as this way we can multi_insert which uses a single DB query to do the whole insert rather than 1 per object which gets slow from so many round-trips to the DB. Co-Authored-By: Florian Braun <FloThinksPi@users.noreply.github.com>
The only place that the visibilities relationship is used is in determining what "visibility_type" a plan has and that only needs to check `.empty?` on the visibilties. By not eagerly loading the visibilities we save having to query and parse a huge amount of data (num_visibilities ~= num_orgs * num_plans). When this data is needed to check for `empty?` the ORM will instead issue a `SELECT 1 ... WHERE service_plan_id = '1234'` for each plan. Even in the scenario where a query is needed for every plan (i.e. the plan is neither public nor space-scoped), this is far more efficient than fetching and parsing the actual table contents. We have seen an improvement from 90s for /v3/service_plans?per_page=500 to 1.3s, even in the worst-case scenario of every plan having visibility_type of organization. Co-Authored-By: Florian Braun <FloThinksPi@users.noreply.github.com>
2f85338 to
34817c0
Compare
|
Interesting! Thanks for the detailed investigation. If the including the n+1 ends up being slow for some cases, we could consider creating a new association on the ServicePlan model |
|
After doing some more profiling with I tested this locally and it is not as big an improvement as removing the eager loading entirely but it is still ~6x faster I think (early numbers). That would suggest that even the worst case "I need to run 5000 |
|
Closing this as we cherry-picked the relevant commit: 2e363c6 We tested this out on one of our environments and everything looked good. FWIW we did quickly try a version of my above suggestion of eager-loading a newly defined calls as a side note there is probably a minor perf improvement to call Again, thanks for this investigation and PR! |
Overview
A short explanation of the proposed change:
Remove eager loading of service plan visibilities when fetching the list of service plans
An explanation of the use cases your change solves
Users trying to query
cf marketplaceagainst deployments with large numbers of organisations and a reasonably large number of brokers/services/plans which thus have an enormous number of service plan visibilitiesLinks to any other associated PRs
Fixes #2213
More info
After a bunch of investigation we found that the service plan visibilities table was causing issues when eagerly loaded into the service plans endpoint, particularly when listing service plans (as
cf marketplacedoes). Whilst the query that the ORM executes (SELECT * FROM service_plan_visibilities WHERE service_plan_id IN (<long list of IDs>)) isn't that expensive for a DB, it returns >10k rows in many cases and the process of parsing that to Ruby objects and linking it to the correct plan does seem to be pretty slow.The only usage of this data in this endpoint appears to be checking whether there are any visibilities for this plan and only in the case where the plan is not public or not space-scoped. By not eagerly loading this table, the ORM instead issues a
SELECT 1 ... WHERE service_plan_id = '1234'for each plan in the list but as this a) is only called if it is not public or space scoped and b) returns much less data it is far faster than the eager loading still. Potentially, this may remain true in environments with significant DB latency as more, smaller requests are needed but as the numbers below will say it would have to be really bad to be worse than it currently is.Tests
Setting up a database as per this comment:
We got the following timings for querying the
service_planendpoints before and after this change:/v2/service_plans?results-per-page=50- 0.12s/v2/service_plans?results-per-page=100- 0.17sBefore change:
/v3/service_plans?per_page=50- 1.24s/v3/service_plans?per_page=100- 2.21s/v3/service_plans?per_page=5000- 30.20sAfter change:
/v3/service_plans?per_page=50- 0.16s/v3/service_plans?per_page=100- 0.22s/v3/service_plans?per_page=5000- 1.72sUnfortunately this hasn't actually done much for CLI users yet as there seems to be something related to the
space_guidsfilter (it hasn't made it worse) but at the very least this clearly makes the basic endpoint much faster for other users (e.g. Stratos, AppsMan)Checklist
I have reviewed the contributing guide
I have viewed, signed, and submitted the Contributor License Agreement
I have made this pull request to the
mainbranchI have run all the unit tests using
bundle exec rakeI have run CF Acceptance Tests