Python: integration test improvements #1066
Merged
awharrison-28 merged 4 commits into microsoft:feature-python-integration-tests on May 19, 2023
Conversation
… of code needed to run them
alexchaomander approved these changes on May 18, 2023
alexchaomander (Contributor) left a comment
I like the idea of using more local models to do these tests so that we don't run into issues of the APIs being down or throttling.
mkarle reviewed on May 18, 2023
print(f"Query: {query}")
print(f"\tAnswer 1: {result[0].text}")
print(f"\tAnswer 2: {result[1].text}\n")
assert "mammals." in result[0].text
Contributor
Nit: This doesn't really assert that it read the document about whales and dolphins.
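A check along the lines of this nit might assert on terms unique to the ingested document rather than the generic "mammals." suffix. A minimal sketch; the helper name and the example answers are assumptions for illustration, not the PR's actual code:

```python
# Hypothetical helper showing a more document-specific assertion.
# Assumes the memory source is a document about whales and dolphins.
def answer_mentions_source(answer: str) -> bool:
    # Look for terms unique to the ingested document, not just "mammals."
    specific_terms = ("whale", "dolphin")
    return any(term in answer.lower() for term in specific_terms)

# Plausible model answers for illustration:
assert answer_mentions_source("Whales and dolphins are marine mammals.")
assert not answer_mentions_source("Cats are mammals.")
```

Asserting on document-specific terms makes the test fail if the model answers from general knowledge instead of the retrieved document.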
dluc pushed a commit that referenced this pull request on May 21, 2023
### Motivation and Context

Python integration tests were failing consistently for two reasons. First, all Ubuntu system tests were failing due to missing shared object files related to CUDA. Second, macOS + Python 3.11 tests were failing because hnswlib could not be installed on native hardware. See #1066 for a description of the changes to the integration test code to condense it, make better use of pytest fixtures, and reduce the number of calls to external APIs (OpenAI, AOAI).

### Description

- Integration tests were failing on Linux due to a regression in Python torch==2.0.1. Downgrading to 2.0.0 resolved these failures.
- For macOS + 3.11 specifically, setting the environment variable `HNSWLIB_NO_NATIVE=1` ensures that the hnswlib wheel can be built and installed properly on M1 hardware.
- Python integration tests are now reliably and consistently passing: https://github.com/microsoft/semantic-kernel/actions/runs/5035748596
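The macOS workaround above can be sketched as a couple of shell lines. This is an illustration of the described fix, not the exact CI configuration; the pip commands are assumptions and left as comments:

```shell
# Build hnswlib without native CPU optimizations so the wheel installs on M1.
export HNSWLIB_NO_NATIVE=1
echo "HNSWLIB_NO_NATIVE=$HNSWLIB_NO_NATIVE"
# pip install hnswlib
# Pinning torch to match the Linux fix might look like:
# pip install "torch==2.0.0"
```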
shawncal pushed a commit to shawncal/semantic-kernel that referenced this pull request on Jul 6, 2023
Motivation and Context
This PR slims down the number of integration tests running against AOAI and OAI models, reducing the time and resources spent running them. Slimming down these tests also reduces the chance of throttling from these endpoints, which leads to unstable integration tests.
The scenarios previously being covered are already handled using HF models which are much less expensive to test against and do not run the risk of throttling.
Description
conftest.py to handle kernel creation and OpenAI model secret handling. I had originally intended create_kernel to do more than just create a kernel, but additional setup wasn't needed. I've left the pytest fixture, though, since other fixtures can call it, and using it in tests can make importing Kernel from semantic_kernel unnecessary.
conftest.py to set up completion tests. For example, setup_hf_text_completion_function allows testing text2text_generation and text_generation models using the same test file.
Contribution Checklist
dotnet format
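The conftest.py arrangement described in the PR could be sketched roughly as follows. FakeKernel, make_kernel, and the fixture bodies here are hypothetical stand-ins (the real fixtures build a semantic_kernel Kernel and register OpenAI/HF services); the sketch only illustrates one fixture building on another:

```python
import pytest


class FakeKernel:
    """Stand-in for semantic_kernel.Kernel; the real fixture would construct the actual kernel."""

    def __init__(self):
        self.services = []


def make_kernel() -> FakeKernel:
    # Plain helper so the creation logic can be exercised outside pytest.
    return FakeKernel()


@pytest.fixture()
def create_kernel():
    # Tests depend on this fixture instead of importing Kernel themselves.
    return make_kernel()


@pytest.fixture()
def setup_hf_text_completion_function(create_kernel):
    # Higher-level fixtures can call create_kernel, e.g. to register a
    # Hugging Face text-completion service before a test runs.
    kernel = create_kernel
    kernel.services.append("hf-text-completion")
    return kernel
```

Keeping kernel construction in one fixture means individual test files need neither the Kernel import nor the secret-handling code.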