-
Notifications
You must be signed in to change notification settings - Fork 30
Increase number per items on CSV, JSON searches #255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…s#244) - Set items_per_page variable to 1e6 if response is either csv or json
- Add new fixture 'gliders' - has more than 1000 datasets - Adds tests 'test_csv_search' and 'test_json_search' that check if the response has more than 1000 records
|
I think that the test failing here may be fixed by #253. |
|
While you could write a mock response generator, you then would be testing that more so than In this case, I think we can skip calling the endpoints, and instead inspect the URL returned from Additionally we could leave these tests in in addition to testing the URL, mark them, and have pytest skip them by default. Then maybe we set up a weekly cron driven actions workflow to run those tests and update the VCR cassettes to do a fuller test run? |
|
Hi @abkfenris, sounds good to me, I'll add new tests that check the URL. However, I'm still thinking we should drop the test on the CoastWatch server? That seems to be behaving inconsistently. Please let me know what you think works best. Thanks, |
|
Ya, avoiding the Coastwatxh server might be best with Bobs aggressive blocklist. |
|
Do you think you could merge #253 then? Please let me know if you'd like me to make any changes to that. Thanks, |
- On method ERDDAP.get_search_url, add all CSV, JSON and TSV-type responses to variable 'non_paginated_responses'
|
@vinisalazar do you mind rebasing this one to fix the conflict? Please let me know if you need help with that! |
…s#244) - Set items_per_page variable to 1e6 if response is either csv or json
- Add new fixture 'gliders' - has more than 1000 datasets - Adds tests 'test_csv_search' and 'test_json_search' that check if the response has more than 1000 records
- On method ERDDAP.get_search_url, add all CSV, JSON and TSV-type responses to variable 'non_paginated_responses'
|
@ocefpaf I had a go at doing it, not sure if I got it right. I can reset the branch and redo it if needed. |
|
I'll definitely accept the offer for helping out with this one! |
|
Looks like you got it. At least the original changes are in there, only one tests is failing. |
- Improve comment in function 'get_search_url' - Move duplicated comment to fixture 'gliders'
|
Thank you for the review @ocefpaf! |
Hi,
this is in relation to #244. I wasn't sure of the best way to go about this, so I tried keeping it as simple as possible. Please let me know if there's an obvious way that I'm missing.
What I did was simply to set the number of
items_per_pageto a big number (initially it was1e9, but I thought that was excessive?) if the response is either'csv'or'json'. This doesn't require actually making the request, it just builds the URL. I couldn't find a "number of records" variable in the search response. I suppose I could set theitems_per_pagevariable to that, but that would require making the request, and I understand that we just want the URL. Removing theitems_per_pageorpagevariables from the URL didn't make any difference.To test, I added a fixture for the gliders server, since the other two had less than 1000 records. I did not commit the cassettes since they turned out fairly large, at around 3MB each. Do you have any suggestions on what to do about this?
Summary of changes
items_per_pageifresponseis either'csv'or'json'gliderstotest_to_objects.pytest_csv_searchandtest_json_searchPlease feel free to request any changes, and don't forget to add the
gsoc-2022label :)Thank you,
Vini