feat(openai): add OpenAI integration [backport #5488 to 1.13] by Kyle-Verhoog · Pull Request #5674 · DataDog/dd-trace-py

Kyle-Verhoog · 2023-04-26T23:42:23Z

Add an integration for the OpenAI library (https://github.com/openai/openai-python). This integration provides tracing for the completion, embeddings and chat completion endpoints, along with cost estimation metrics and sampled prompt/completion logs.

Each log, metric and trace is tagged with service, env, version, OpenAI model, OpenAI endpoint and OpenAI organization.
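To illustrate the shared tagging, here is a minimal sketch of building one tag list that could be attached to a log, a metric and a span alike. The function name and the `openai.*` tag keys are assumptions made for illustration; they are not taken from the integration's code.

```python
from typing import List, Optional


def build_common_tags(
    service: str,
    env: Optional[str],
    version: Optional[str],
    model: Optional[str],
    endpoint: Optional[str],
    organization: Optional[str],
) -> List[str]:
    """Return "key:value" tags shared by logs, metrics and traces.

    The tag keys below are hypothetical, chosen only for this example.
    """
    pairs = [
        ("service", service),
        ("env", env),
        ("version", version),
        ("openai.model", model),
        ("openai.endpoint", endpoint),
        ("openai.organization", organization),
    ]
    # Skip unset values so that no empty tags are emitted.
    return [f"{key}:{value}" for key, value in pairs if value]


tags = build_common_tags(
    "chat-app", "prod", "1.2.0", "gpt-3.5-turbo", "chat.completions", None
)
```

Building the list once per request and reusing it for all three signals is one way to reduce the tagging disparity mentioned under Risk.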

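Docs preview: https://output.circle-artifacts.com/output/job/fe3599b8-952e-4ceb-ac4f-0f15503e9c0d/artifacts/0/tmp/docs/integrations.html#openai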

Design

Logs

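A new log writer implementation is added to submit logs. Logs are submitted directly to intake, following an approach similar to the one that kyle-verhoog/datadog-python (https://github.com/Kyle-Verhoog/datadog-python/blob/main/datadog/_logging.py) and the .NET tracer (DataDog/dd-trace-dotnet#2240) have already taken.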
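As a rough sketch of what a log writer of this kind might look like, the class below buffers log records and hands them to a transport on flush. This is not the writer added in this PR: the class name, the `max_buffer` parameter, the drop-when-full policy and the per-flush JSON batch are all assumptions, and the `send` callable stands in for whatever HTTP submission to intake the real writer performs.

```python
import json
import threading
from typing import Callable, Dict, List


class BufferedLogWriter:
    """Hypothetical log writer: buffer records, then submit them on flush."""

    def __init__(self, send: Callable[[bytes], None], max_buffer: int = 1000):
        self._send = send  # stand-in for the real submission to intake
        self._max_buffer = max_buffer
        self._buffer: List[Dict[str, str]] = []
        self._lock = threading.Lock()

    def enqueue(self, log: Dict[str, str]) -> bool:
        """Add a record; return False if it was dropped because the buffer is full."""
        with self._lock:
            if len(self._buffer) >= self._max_buffer:
                return False
            self._buffer.append(log)
            return True

    def flush(self) -> int:
        """Send all buffered records as one JSON payload; return how many were sent."""
        with self._lock:
            batch, self._buffer = self._buffer, []
        if not batch:
            return 0
        self._send(json.dumps(batch).encode("utf-8"))
        return len(batch)


# Usage: collect payloads in a list instead of sending them over the network.
sent: List[bytes] = []
writer = BufferedLogWriter(send=sent.append, max_buffer=2)
accepted = [writer.enqueue({"message": f"log {i}"}) for i in range(3)]
flushed = writer.flush()
```

Swapping the buffer out under the lock keeps `enqueue` from blocking while the payload is being sent.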

Metrics

A statsd client is used specifically for the OpenAI integration.
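For readers unfamiliar with statsd, the sketch below formats a single metric in the DogStatsD datagram format (`name:value|type|#tags`). It only shows what a dedicated client ends up putting on the wire; the function name and the metric name `openai.tokens.total` are hypothetical and not taken from the integration.

```python
from typing import List, Optional


def format_datagram(
    name: str,
    value: float,
    metric_type: str,
    tags: Optional[List[str]] = None,
) -> bytes:
    """Encode one metric as a DogStatsD datagram.

    metric_type is the protocol's type code, e.g. "c" (count),
    "g" (gauge) or "d" (distribution).
    """
    payload = f"{name}:{value}|{metric_type}"
    if tags:
        # Tags are appended as a comma-separated list after "|#".
        payload += "|#" + ",".join(tags)
    return payload.encode("utf-8")


datagram = format_datagram(
    "openai.tokens.total", 42, "d", ["openai.model:ada", "env:prod"]
)
```

A real client would typically send such datagrams over UDP to a locally running agent.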

Testing

Tests use VCR to record requests made to OpenAI, which keeps test cases easy to write, consistent and reliable.

Logs and metrics are tested using mocking of the clients.
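A minimal sketch of this mocking approach, using the standard library's `unittest.mock`. The function under test, `record_completion`, and the client method names `distribution` and `enqueue` are hypothetical stand-ins; the integration's actual client interfaces may differ.

```python
from unittest import mock


def record_completion(metrics_client, log_client, model: str, total_tokens: int) -> None:
    """Hypothetical code under test: emit one metric and one log per completion."""
    tags = [f"openai.model:{model}"]
    metrics_client.distribution("openai.tokens.total", total_tokens, tags=tags)
    log_client.enqueue({"message": "sampled completion", "ddtags": ",".join(tags)})


# Replace both clients with mocks so nothing is sent over the network.
metrics_client = mock.Mock()
log_client = mock.Mock()

record_completion(metrics_client, log_client, model="ada", total_tokens=17)

# Verify the calls the integration code made on each client.
metrics_client.distribution.assert_called_once_with(
    "openai.tokens.total", 17, tags=["openai.model:ada"]
)
log_client.enqueue.assert_called_once()
```

Asserting on the recorded calls checks names, values and tags without depending on a running agent or on log intake.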

Several integration tests using snapshots and subprocess testing ensure that the integration works in a real-world OpenAI application.

A manual test app was also used:
https://gist.github.com/Kyle-Verhoog/1f263ed0aade076b313167d1ba3bfa16

Risk

Currently, logs, traces and metrics are all collected, buffered and sent individually through their respective pipelines. As a result, there is a risk of disparity between the tagging and submission of the data. There is also a performance risk, as this data is not aggregated or batched when submitted.
Prompts and completions are captured on spans by default, with a default limit on the length of the data. This limit applies only to each prompt/completion individually, but requests can contain several prompts and completions. If there are many long prompts and completions, there is a risk of performance overhead from encoding and transmitting the data.
The logs and metrics clients are created specifically for this integration. If another integration were to introduce logs, it would need another log writer. Having several log writers could induce thread contention and high memory usage.
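The second risk above can be made concrete with a small sketch: a per-item length limit bounds each prompt but not the total captured per request. The limit of 128 characters and the `truncate` helper are assumptions for illustration, not the integration's actual default or code.

```python
from typing import List

SPAN_CHAR_LIMIT = 128  # hypothetical per-item limit, not the real default


def truncate(text: str, limit: int = SPAN_CHAR_LIMIT) -> str:
    """Cut a single prompt or completion to at most `limit` characters plus a marker."""
    if len(text) <= limit:
        return text
    return text[:limit] + "..."


# One request carrying 50 long prompts: each is truncated individually...
prompts: List[str] = ["a" * 1000 for _ in range(50)]
captured = [truncate(p) for p in prompts]

# ...but the total captured on the span still grows with the number of prompts.
total_chars = sum(len(c) for c in captured)  # 50 * (128 + 3) = 6550
```

A cap on the total number of captured characters or items per request would be one way to bound this, at the cost of dropping some prompts entirely.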

Checklist

Change(s) are motivated and described in the PR description.
Testing strategy is described if automated tests are not included in the PR.
Risk is outlined (performance impact, potential for breakage, maintainability, etc).
Change is maintainable (easy to change, telemetry, documentation).
Library release note guidelines are followed.
Documentation is included (in-code, generated user docs, public corp docs).
PR description includes explicit acknowledgement/acceptance of the performance implications of this PR as reported in the benchmarks PR comment.

Reviewer Checklist

Title is accurate.
No unnecessary changes are introduced.
Description motivates each change.
Avoids breaking API changes unless absolutely necessary.
Testing strategy adequately addresses listed risk(s).
Change is maintainable (easy to change, telemetry, documentation).
Release note makes sense to a user of the library.
Reviewer has explicitly acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment.

Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>
Co-authored-by: Kari Halsted <12926135+kayayarai@users.noreply.github.com>

pr-commenter · 2023-04-27T00:08:13Z

Benchmarks

Comparing candidate commit 44d885d in PR branch backport-openai-1.13 with baseline commit 651a2c5 in branch 1.x.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 14 cases.

Old versions of `openai-python` use deprecated `sklearn` instead of `scikit-learn` (fixed here openai/openai-python#227). In order to install `sklearn` in our CI venvs, we need to add an environment variable.

```
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [18 lines of output]
    The 'sklearn' PyPI package is deprecated, use 'scikit-learn'
    rather than 'sklearn' for pip commands.

    Here is how to fix this error in the main use cases:
    - use 'pip install scikit-learn' rather than 'pip install sklearn'
    - replace 'sklearn' by 'scikit-learn' in your pip requirements files
      (requirements.txt, setup.py, setup.cfg, Pipfile, etc ...)
    - if the 'sklearn' package is used by one of your dependencies,
      it would be great if you take some time to track which package uses
      'sklearn' instead of 'scikit-learn' and report it to their issue tracker
    - as a last resort, set the environment variable
      SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True to avoid this error

    More information is available at
    https://github.com/scikit-learn/sklearn-pypi-package

    If the previous advice does not cover your use case, feel free to report it at
    https://github.com/scikit-learn/sklearn-pypi-package/issues/new
    [end of output]
```

Kyle-Verhoog changed the base branch from 1.x to 1.13 April 26, 2023 23:48

Kyle-Verhoog marked this pull request as ready for review April 27, 2023 12:48

Kyle-Verhoog requested review from a team as code owners April 27, 2023 12:48

Kyle-Verhoog requested review from Yun-Kim, gnufede and mabdinur April 27, 2023 12:48

brettlangdon approved these changes Apr 27, 2023


Merge branch '1.13' into backport-openai-1.13

d2e53df

Yun-Kim approved these changes Apr 27, 2023


Kyle-Verhoog merged commit b5c9edc into 1.13 Apr 27, 2023

Kyle-Verhoog deleted the backport-openai-1.13 branch April 27, 2023 13:19

github-actions Bot added this to the v1.13.0 milestone Apr 27, 2023

Kyle-Verhoog merged 3 commits into 1.13 from backport-openai-1.13
