Python: integration test improvements #1066
Merged
awharrison-28 merged 4 commits into microsoft:feature-python-integration-tests on May 19, 2023
Conversation
… of code needed to run them
alexchaomander approved these changes on May 18, 2023
alexchaomander (Contributor) left a comment
I like the idea of using more local models to do these tests so that we don't run into issues of the APIs being down or throttling.
mkarle reviewed on May 18, 2023
print(f"Query: {query}")
print(f"\tAnswer 1: {result[0].text}")
print(f"\tAnswer 2: {result[1].text}\n")
assert "mammals." in result[0].text
Contributor
Nit: This doesn't really assert that it read the document about whales and dolphins.
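A check along the lines of this nit might assert on terms unique to the ingested document rather than the generic "mammals." suffix. A minimal sketch; the helper name and the example answers are assumptions for illustration, not the PR's actual code:

```python
# Hypothetical helper showing a more document-specific assertion.
# Assumes the memory source is a document about whales and dolphins.
def answer_mentions_source(answer: str) -> bool:
    # Look for terms unique to the ingested document, not just "mammals."
    specific_terms = ("whale", "dolphin")
    return any(term in answer.lower() for term in specific_terms)

# Plausible model answers for illustration:
assert answer_mentions_source("Whales and dolphins are marine mammals.")
assert not answer_mentions_source("Cats are mammals.")
```

Asserting on document-specific terms makes the test fail if the model answers from general knowledge instead of the retrieved document.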
dluc pushed a commit that referenced this pull request on May 21, 2023
### Motivation and Context

Python integration tests were failing consistently for two reasons. First, all Ubuntu system tests were failing due to missing shared object files related to CUDA. Second, macOS + Python 3.11 tests were failing because hnswlib could not be installed on native hardware. See #1066 for a description of the changes to the integration test code to condense it, make better use of pytest fixtures, and reduce the number of calls to external APIs (OpenAI, AOAI).

### Description

- Integration tests were failing on Linux due to a regression in Python torch==2.0.1. Downgrading to 2.0.0 resolved these failures.
- For macOS + 3.11 specifically, setting the environment variable `HNSWLIB_NO_NATIVE=1` ensures that the hnswlib wheel can be built and installed properly on M1 hardware.
- Python integration tests are now reliably and consistently passing: https://github.com/microsoft/semantic-kernel/actions/runs/5035748596
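The macOS workaround above can be sketched as a couple of shell lines. This is an illustration of the described fix, not the exact CI configuration; the pip commands are assumptions and left as comments:

```shell
# Build hnswlib without native CPU optimizations so the wheel installs on M1.
export HNSWLIB_NO_NATIVE=1
echo "HNSWLIB_NO_NATIVE=$HNSWLIB_NO_NATIVE"
# pip install hnswlib
# Pinning torch to match the Linux fix might look like:
# pip install "torch==2.0.0"
```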
shawncal pushed a commit to shawncal/semantic-kernel that referenced this pull request on Jul 6, 2023
Motivation and Context
This PR slims down the number of integration tests running against AOAI and OAI models, reducing the time and resources spent running them. Slimming down these tests also reduces the chance of throttling from these endpoints, which leads to unstable integration tests.
The scenarios previously being covered are already handled using HF models which are much less expensive to test against and do not run the risk of throttling.
Description
conftest.py to handle kernel creation and OpenAI model secret handling. I had originally intended create_kernel to do more than just create a kernel, but additional setup wasn't needed. I've left the pytest fixture, though, since other fixtures can call it, and using it in tests can make importing Kernel from semantic_kernel unnecessary.
conftest.py to set up completion tests. For example, setup_hf_text_completion_function allows testing text2text_generation and text_generation models using the same test file.
Contribution Checklist
dotnet format
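The conftest.py arrangement described in the PR could be sketched roughly as follows. FakeKernel, make_kernel, and the fixture bodies here are hypothetical stand-ins (the real fixtures build a semantic_kernel Kernel and register OpenAI/HF services); the sketch only illustrates one fixture building on another:

```python
import pytest


class FakeKernel:
    """Stand-in for semantic_kernel.Kernel; the real fixture would construct the actual kernel."""

    def __init__(self):
        self.services = []


def make_kernel() -> FakeKernel:
    # Plain helper so the creation logic can be exercised outside pytest.
    return FakeKernel()


@pytest.fixture()
def create_kernel():
    # Tests depend on this fixture instead of importing Kernel themselves.
    return make_kernel()


@pytest.fixture()
def setup_hf_text_completion_function(create_kernel):
    # Higher-level fixtures can call create_kernel, e.g. to register a
    # Hugging Face text-completion service before a test runs.
    kernel = create_kernel
    kernel.services.append("hf-text-completion")
    return kernel
```

Keeping kernel construction in one fixture means individual test files need neither the Kernel import nor the secret-handling code.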