@utkarsharma2
Contributor

This PR is part of our larger effort, presented at Airflow Summit, to add first-class integrations that support LLMOps.

Specifically, this PR adds the OpenAI Provider. OpenAI is a leading American artificial intelligence organization that offers ChatGPT, one of the most widely used LLMs, as well as embedding models.

The primary objective of this Provider is to give users an alternative embedding model. It lets them generate vectors for their proprietary data, a pivotal step towards integrating with LLMs like ChatGPT.

Example DAG:
The OpenAIEmbeddingOperator accepts either a string or a callable that returns a list of strings.

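from airflow.providers.openai.operators.openai import OpenAIEmbeddingOperator

# xcom_text is assumed to be defined earlier in the DAG (e.g. the output of an upstream task).
OpenAIEmbeddingOperator(
    task_id="embedding_using_xcom_data",
    conn_id="openai_default",
    input_text=xcom_text["input_text"],
    model="text-embedding-ada-002",
)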
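
As a rough illustration of the "string or callable" input described above, here is a minimal sketch of how such an input could be normalized into a list of strings before it is sent for embedding. This is not the provider's implementation: `resolve_input_text` and the `InputText` alias are hypothetical names introduced only for this example.

```python
from typing import Callable, List, Union

# Hypothetical alias for the two input forms mentioned in the description:
# a plain string, or a zero-argument callable returning a list of strings.
InputText = Union[str, Callable[[], List[str]]]


def resolve_input_text(input_text: InputText) -> List[str]:
    """Normalize a string-or-callable input into a list of strings (illustrative only)."""
    if callable(input_text):
        # Defer evaluation until the text is actually needed.
        texts = list(input_text())
    else:
        # Wrap a single string so downstream code always sees a list.
        texts = [input_text]

    if not all(isinstance(text, str) for text in texts):
        raise TypeError("input_text must resolve to a string or a list of strings")
    return texts
```

With this sketch, `resolve_input_text("hello")` yields a one-element list, while a callable such as `lambda: ["a", "b"]` is invoked lazily and its result is returned unchanged.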

The email discussion related to this effort can be found here: https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp

@utkarsharma2 utkarsharma2 marked this pull request as ready for review October 26, 2023 13:50
Member

@pankajastro pankajastro left a comment

@pankajastro pankajastro requested a review from eladkal November 4, 2023 11:52
@pankajastro pankajastro merged commit cca4aa4 into apache:main Nov 7, 2023
romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request Nov 10, 2023
* Add OpenAI Provider

* Apply suggestions from code review

Co-authored-by: Phani Kumar <94376113+phanikumv@users.noreply.github.com>

* Remove create_completions method from hook

* Change type of input_text param

Since the upstream API accepts a str or a list of tokens, we accept similar inputs from the user.

* Updated min-airflow version to 2.5.0

* Updated the interface and fix docs and static files

* Fix tests

* Fix tests

* Change the version

Because the OpenAI SDK is not production ready

* Add embedding_kwargs as a param to operator

* Update tests/providers/openai/hooks/test_openai.py

Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com>

* Remove unwanted params in docstring

* Update Changelog

* Add security.rst file

* Update docs/apache-airflow-providers-openai/index.rst

Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com>

* Add host field for connections

* Update docs/apache-airflow-providers-openai/index.rst

Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com>

* Add changelog.rst file to docs

* Change version to 1.0.0

* Resolve conflicts

* Fix tests

* Fixed tests

* Fix test

* Resolve Conflict

---------

Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Co-authored-by: Phani Kumar <94376113+phanikumv@users.noreply.github.com>
Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com>
@ephraimbuddy ephraimbuddy added the changelog:skip label (changes that should be skipped from the changelog: CI, tests, etc.) Nov 20, 2023
Labels

area:dev-tools, area:providers, changelog:skip, kind:documentation