Use streaming for raw data requests. #19
Merged
aaronweeden merged 4 commits into ubccr:main on Jun 5, 2024
eiffel777 approved these changes on May 29, 2024
Description
This PR changes how requests for raw data are handled: if the portal is configured to stream JSON text sequence data back (as it will be in XDMoD 11.0 after ubccr/xdmod#1792 and ubccr/xdmod#1858), the streamed data is properly iterated over and stored in the data frame.
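As a rough illustration of iterating over JSON text sequence data, here is a minimal sketch that assumes the standard RFC 7464 framing (each record starts with the ASCII record separator `0x1E` and ends with a newline). The function name `iter_json_text_sequence` and the shape of the rows are hypothetical; this is not the code from this PR, and the portal's exact output is not shown here.

```python
import json

RS = "\x1e"  # ASCII record separator that starts each record (RFC 7464)


def iter_json_text_sequence(chunks):
    """Yield parsed JSON values from an iterable of decoded text chunks.

    Hypothetical helper: chunks may split a record at any position, so
    incomplete text is buffered until the next record separator arrives.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last separator consists of complete records;
        # the final piece may still be incomplete, so keep it in the buffer.
        *complete, buffer = buffer.split(RS)
        for record in complete:
            if record.strip():
                yield json.loads(record)
    # The last record has no separator after it, so flush the buffer.
    if buffer.strip():
        yield json.loads(buffer)
```

Because rows are yielded one at a time, a caller can append them to its own row storage without holding the whole response body in memory first.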
Determining whether the portal supports streaming is accomplished by first making a request to the `rest/warehouse/raw-data/limit` endpoint. If the response status code is `404` (as it will be for 11.0 based on changes in ubccr/xdmod#1792), it runs the streaming algorithm. Otherwise, if the portal has the `rest/warehouse/raw-data/limit` endpoint (i.e., if it is running XDMoD 10.5), it runs the old algorithm of iteratively requesting 10,000 rows (or whatever the portal has as its configured limit).

Once XDMoD 10.5 is no longer supported, we can remove the old algorithm.
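The selection logic described above can be sketched roughly as follows. The three callables (`get_limit`, `fetch_page`, `stream_rows`) are hypothetical stand-ins for the real HTTP requests, and the stop condition for the old algorithm (a page shorter than the limit) is an assumption rather than something stated in this PR. Error handling for other status codes is omitted.

```python
def fetch_raw_rows(get_limit, fetch_page, stream_rows):
    """Pick the streaming or the paginated algorithm and return all rows.

    Hypothetical callables (not part of any real API):
      get_limit()        -> (status_code, row_limit or None)
      fetch_page(offset) -> list of rows starting at `offset`
      stream_rows()      -> iterable of rows from a streamed response
    """
    status_code, limit = get_limit()
    if status_code == 404:
        # No limit endpoint: assume the portal streams raw data (11.0).
        return list(stream_rows())
    # Limit endpoint exists (10.5): request one page of `limit` rows at a time.
    rows = []
    offset = 0
    while True:
        page = fetch_page(offset)
        rows.extend(page)
        if len(page) < limit:  # assumed stop condition: a short page is the last
            break
        offset += limit
    return rows
```

Keeping both paths behind one function means the paginated branch can be deleted in one place once 10.5 support is dropped.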
Motivation and Context
ubccr/xdmod#1792 improves the performance of requests for raw data in the Jobs realm.
Tests performed
In addition to running the automated tests on the existing XDMoD portal, which is running 10.5, I also edited the automated tests to point at my port on `xdmod-dev` with the changes from ubccr/xdmod#1792, and ran those to success (the `test_get_raw_data` regression test failed, but on closer inspection this was due to the rows of the data frame being in a different order, which is acceptable).
Types of changes
Checklist:
- `CHANGELOG.md` has been updated
- […] `docs/developing.md`) produces no errors
- […] `xdmod-notebooks` repository as necessary, and the notebooks all run successfully