feat(openai): add OpenAI integration [backport #5488 to 1.13]#5674
Merged
Add an integration for the [OpenAI library](https://github.com/openai/openai-python). This integration provides tracing for the completion, embeddings and chat completion endpoints, along with cost estimation metrics and prompt/completion sampling logs. Each log, metric and trace is tagged with service, env, version, OpenAI model, OpenAI endpoint and OpenAI organization.

[Docs preview](https://output.circle-artifacts.com/output/job/fe3599b8-952e-4ceb-ac4f-0f15503e9c0d/artifacts/0/tmp/docs/integrations.html#openai)

## Design

### Logs

A new log writer implementation is added to submit logs. Logs are submitted directly to intake, following an approach similar to the one already taken by [kyle-verhoog/datadog-python](https://github.com/Kyle-Verhoog/datadog-python/blob/main/datadog/_logging.py) and the [.NET tracer](DataDog/dd-trace-dotnet#2240).

### Metrics

A statsd client is used specifically for the OpenAI integration.

## Testing

Testing is done using VCR to record requests made to OpenAI, which keeps the test cases easy to write, consistent and reliable. Logs and metrics are tested by mocking the clients. Several integration tests using snapshots and subprocess testing ensure that the integration works in a real-world OpenAI application.

A manual test app was also used: https://gist.github.com/Kyle-Verhoog/1f263ed0aade076b313167d1ba3bfa16

## Risk

- Currently, logs, traces and metrics are all collected, buffered and sent individually through their respective pipelines. Because of this, there is a risk of disparity between the tagging and the submission of the data. There is also a performance risk, as this data is not aggregated or batched when submitted.
- Prompts and completions are captured on spans by default, with a default limit on the length of the data. This limit applies only to each prompt/completion individually, but requests can contain several prompts and completions. If there are many long prompts and completions, there is a risk of performance overhead from encoding and transmitting the data.
- Logs and metrics clients are defined specifically for this integration. If another integration were to introduce logs, it would need another log writer. Having several log writers could induce thread contention and high memory usage.

Co-authored-by: Kyle Verhoog <kyle@verhoog.ca>
Co-authored-by: Kari Halsted <12926135+kayayarai@users.noreply.github.com>
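To make the second risk above concrete, here is a minimal sketch of why a per-item length limit does not bound the total payload of a request. The helper names, the `"..."` suffix and the limit of 128 characters are all hypothetical and chosen only for illustration; they are not taken from the integration's code.

```python
def truncate(text, limit):
    """Cut a single prompt or completion to at most `limit` characters.

    Hypothetical helper: the "..." suffix is an illustrative assumption.
    """
    if len(text) <= limit:
        return text
    return text[:limit] + "..."


def captured_size(items, limit):
    """Total number of characters captured for one request.

    The limit is applied to each item on its own, so the total still
    grows linearly with the number of prompts/completions.
    """
    return sum(len(truncate(item, limit)) for item in items)


if __name__ == "__main__":
    limit = 128  # assumed value, for illustration only
    one_prompt = ["a" * 1000]
    many_prompts = ["a" * 1000] * 50
    print(captured_size(one_prompt, limit))
    print(captured_size(many_prompts, limit))
```

Under these assumptions, a request with 50 long prompts captures 50 times as much data as a request with one, even though every individual prompt respects the limit.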
Old versions of `openai-python` depend on the deprecated `sklearn` package instead of `scikit-learn` (fixed in openai/openai-python#227). To install `sklearn` in our CI venvs, we need to set the environment variable named in the error output below.

```
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [18 lines of output]
    The 'sklearn' PyPI package is deprecated, use 'scikit-learn'
    rather than 'sklearn' for pip commands.

    Here is how to fix this error in the main use cases:
    - use 'pip install scikit-learn' rather than 'pip install sklearn'
    - replace 'sklearn' by 'scikit-learn' in your pip requirements files
      (requirements.txt, setup.py, setup.cfg, Pipfile, etc ...)
    - if the 'sklearn' package is used by one of your dependencies,
      it would be great if you take some time to track which package uses
      'sklearn' instead of 'scikit-learn' and report it to their issue tracker
    - as a last resort, set the environment variable
      SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True to avoid this error

    More information is available at
    https://github.com/scikit-learn/sklearn-pypi-package

    If the previous advice does not cover your use case, feel free to report it at
    https://github.com/scikit-learn/sklearn-pypi-package/issues/new
    [end of output]
```
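As a rough illustration of the workaround, the sketch below passes the environment variable from the error message to a `pip install` subprocess. The helper names `build_install_env` and `install` are hypothetical, and this is not how the CI venvs in this PR are actually configured; it only shows the variable being set for the install step.

```python
import os
import subprocess
import sys


def build_install_env(base_env=None):
    """Return a copy of the environment with the sklearn opt-in set.

    Hypothetical helper; the variable name and value come from the pip
    error message quoted above.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL"] = "True"
    return env


def install(package, run=subprocess.run):
    """Install `package` with pip, using the opt-in environment.

    `run` is injectable so the command can be inspected without
    actually invoking pip.
    """
    cmd = [sys.executable, "-m", "pip", "install", package]
    return run(cmd, env=build_install_env(), check=True)
```

Calling `install("sklearn")` would then run pip with the variable set, leaving the parent process environment untouched.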
brettlangdon
approved these changes
Apr 27, 2023
Yun-Kim
approved these changes
Apr 27, 2023