
Python: integration test improvements#1066

Merged
awharrison-28 merged 4 commits into microsoft:feature-python-integration-tests from
awharrison-28:python/integration_test_improvements
May 19, 2023

Conversation

@awharrison-28
Contributor

Motivation and Context

This PR slims down the number of integration tests running against AOAI and OAI models - reducing time and resources spent running them. Additional justification for slimming down these tests is to reduce the chance of throttling from these endpoints (leads to unstable integration tests).

The scenarios previously being covered are already handled using HF models which are much less expensive to test against and do not run the risk of throttling.

Description

  • Added a top-level test conftest.py to handle kernel creation and OpenAI model secret handling. I had originally intended create_kernel to do more than just create a kernel, but additional setup wasn't needed. I've left the pytest fixture, though, since other fixtures can call it, and using it in tests makes importing Kernel from semantic_kernel unnecessary.
  • Added a completions conftest.py to set up completions tests. For example, setup_hf_text_completion_function allows testing text2text_generation and text_generation models using the same test file.
  • Common test code is now in pytest fixtures instead of common methods.
  • For a number of the completion tests, I have broken the asserts out into individual tests instead of running one giant test. This makes it easier to identify regressions in specific patterns around invoking skills.
  • Added retry logic to conversationSummarySkill
  • Renamed tests to be more descriptive.
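The retry logic mentioned above might look something like the sketch below. This is a generic pattern, not the PR's actual implementation; the helper name `with_retries` and the flaky-call example are hypothetical.

```python
import time


def with_retries(func, attempts=3, delay=0.5):
    """Call func, retrying on failure with a fixed delay between attempts."""
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:  # real code would catch only throttling errors
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_error


# Usage: wrap a flaky call, e.g. invoking a skill against a throttled endpoint.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("throttled")
    return "summary"


print(with_retries(flaky, attempts=3, delay=0))  # succeeds on the third attempt
```

Retrying at the test level like this helps distinguish genuine regressions from transient endpoint throttling.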

Contribution Checklist

@awharrison-28 awharrison-28 changed the title Python/integration test improvements Python: integration test improvements May 18, 2023
@github-actions github-actions bot added the python Pull requests for the Python Semantic Kernel label May 18, 2023
Contributor

@alexchaomander alexchaomander left a comment


I like the idea of using more local models to do these tests so that we don't run into issues of the APIs being down or throttling.

```python
print(f"Query: {query}")
print(f"\tAnswer 1: {result[0].text}")
print(f"\tAnswer 2: {result[1].text}\n")
assert "mammals." in result[0].text
```
Contributor


Nit: This doesn't really assert that it read the document about whales and dolphins.
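One way to tighten that assertion is to check for several content-bearing keywords rather than a single generic token. This is a sketch; the stand-in `answer` string and the keyword list are assumptions about the test document, not taken from the PR.

```python
# Hypothetical stand-in for the model's answer so the sketch is runnable.
answer = "Whales and dolphins are both marine mammals."

# Instead of matching one generic substring, require multiple keywords
# drawn from the source document about whales and dolphins.
keywords = ("whale", "dolphin", "mammal")
matched = [kw for kw in keywords if kw in answer.lower()]
assert len(matched) >= 2, f"answer mentions too few expected keywords: {matched}"
```

Requiring two of three keywords keeps the test tolerant of phrasing variation while still verifying the answer is grounded in the document.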

@awharrison-28 awharrison-28 changed the base branch from main to feature-python-integration-tests May 19, 2023 23:19
@awharrison-28 awharrison-28 merged commit 5063747 into microsoft:feature-python-integration-tests May 19, 2023
@awharrison-28 awharrison-28 deleted the python/integration_test_improvements branch May 19, 2023 23:20
dluc pushed a commit that referenced this pull request May 21, 2023
### Motivation and Context
Python integration tests were failing consistently for 2 reasons. First,
all Ubuntu system tests were failing due to missing shared object files
related to CUDA. Second, macOS + Python 3.11 tests were failing due to
hnswlib being unable to install on native hardware.

See #1066 for a description of the changes to the integration tests,
which condense the test code, make better use of pytest fixtures, and
reduce the number of calls to external APIs (OpenAI, AOAI).

### Description
- Integration tests were failing on Linux due to a regression in Python
torch==2.0.1. Downgrading the version to 2.0.0 resolved these failures.
- For macOS + 3.11 specifically, setting the environment variable
`HNSWLIB_NO_NATIVE=1` ensures that the hnswlib wheel can be built and
installed properly on M1 hardware.
- Python integration tests reliably and consistently passing:
https://github.com/microsoft/semantic-kernel/actions/runs/5035748596
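The macOS fix above could be applied programmatically before installing dependencies. The helper below is a hypothetical sketch (the function name `configure_build_env` is not from the PR), showing how the environment variable would be set only on Apple Silicon:

```python
import os
import platform


def configure_build_env() -> dict:
    """Return the environment to use when installing dependencies.

    On Apple Silicon (arm64) Macs, hnswlib's wheel fails to build unless
    HNSWLIB_NO_NATIVE=1 disables -march=native during compilation.
    """
    env = dict(os.environ)
    if platform.system() == "Darwin" and platform.machine() == "arm64":
        env["HNSWLIB_NO_NATIVE"] = "1"
    return env
```

The returned mapping would then be passed as the `env` argument to whatever invokes `pip install`, leaving the rest of the environment untouched.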
shawncal pushed a commit to shawncal/semantic-kernel that referenced this pull request Jul 6, 2023
