Arize-ai/arize-otel-python

Overview

The arize-otel package provides a lightweight wrapper around OpenTelemetry primitives with Arize-aware defaults and options. It is a convenience package that helps you set up OpenTelemetry for tracing LLM applications and send the traces to Arize.

Installation

Install arize-otel using pip

pip install arize-otel

Quickstart

The arize.otel module provides a high-level register function that configures OpenTelemetry tracing and returns a TracerProvider. The register function can also configure headers and whether spans are processed one at a time or in batches.

The following examples show how to use register to set up OpenTelemetry and send traces to a collector. However, this is NOT the same as instrumenting your application. You can instrument installed OpenInference instrumentors automatically with auto_instrument=True, or manually call a specific instrumentor after register.

Automatically instrument installed OpenInference packages

Set auto_instrument=True to discover installed OpenInference instrumentors and call instrument(tracer_provider=...) for each one:

from arize.otel import register

tracer_provider = register(
    space_id="your-arize-space-id",
    api_key="your-arize-api-key",
    project_name="your-model-id",
    auto_instrument=True,
)

auto_instrument=True only instruments libraries with a corresponding OpenInference instrumentation package installed in your Python environment.
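
To check which instrumentation packages are visible in your environment, you can list the entry points they register. The following is a minimal sketch using only the standard library. The helper name installed_instrumentor_names and the entry-point group name "openinference_instrumentor" are assumptions made for illustration; this is not necessarily how register performs discovery.

```python
from importlib.metadata import entry_points


def installed_instrumentor_names(group="openinference_instrumentor"):
    """Return the sorted entry-point names registered under `group`.

    The default group name is an assumption for illustration only.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points support select()
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group
        selected = eps.get(group, [])
    return sorted({ep.name for ep in selected})


# Prints an empty list when no matching packages are installed
print(installed_instrumentor_names())
```

If the list is empty, auto_instrument=True would have nothing to instrument, assuming discovery works along these lines.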

To instrument one library explicitly instead, run instrument() after using register:

from arize.otel import register
# Setup OTel via our convenience function
tracer_provider = register(
    # See details in examples below...
)

# Instrument your application using an OpenInference instrumentor
from openinference.instrumentation.openai import OpenAIInstrumentor
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

The above code snippet yields a fully set up and instrumented application. Note that this package is completely optional and exists for convenience only: you can set up OpenTelemetry and send traces to Arize without installing this or any other package from Arize.

The following sections show how to use the register function:

Add pre-export span processors

Some OpenInference integrations, such as processor-style integrations that transform native OpenTelemetry spans into OpenInference attributes, expose a SpanProcessor instead of an instrument() method. Pass those processors to register(span_processors=[...]) so they run before the Arize exporter while register continues to configure Arize authentication headers, endpoint, transport, batching, and project metadata:

from arize.otel import register
from openinference.instrumentation.pydantic_ai import OpenInferenceSpanProcessor

tracer_provider = register(
    space_id="your-arize-space-id",
    api_key="your-arize-api-key",
    project_name="your-model-id",
    span_processors=[OpenInferenceSpanProcessor()],
)

Use span_processors for processors that enrich or transform spans before export. You do not need to create a separate Arize SpanExporter just to preserve Arize headers.

Send traces to Arize

To send traces to Arize you need to authenticate via the Space ID and API Key. You can find them in the Space Settings page in the Arize platform. In addition, you'll need to specify the project name, a unique name to identify your project in the Arize platform.

from arize.otel import register

tracer_provider = register(
    space_id = "your-arize-space-id",
    api_key = "your-arize-api-key",
    project_name = "your-model-id",
)

If you are located in the European Union, you'll need to specify the corresponding Endpoint (the default endpoint is Endpoint.ARIZE):

from arize.otel import register, Endpoint

tracer_provider = register(
    endpoint=Endpoint.ARIZE_EUROPE,
    space_id = "your-arize-space-id",
    api_key = "your-arize-api-key",
    project_name = "your-model-id",
)

If you would like to configure your tracing using environment variables instead of passing arguments, read Using Environment Variables.

Send traces to Custom Endpoint

To send traces to a collector on a custom endpoint, pass the endpoint as a string:

from arize.otel import register

tracer_provider = register(
    endpoint = "https://my-custom-endpoint",
    # any other options...
)

Specify exporter type

If you're using endpoints from the Endpoint enum, you do not need to do this, since we know what exporter to use. However, if you're using a custom endpoint, note that the default is to use a GRPCSpanExporter. If you'd like to use an HTTPSpanExporter instead, specify the transport as shown below:

from arize.otel import register, Transport

tracer_provider = register(
    endpoint = "https://my-custom-endpoint",
    transport = Transport.HTTP,
    # any other options...
)

Turn off batch processing of spans

We default to using BatchSpanProcessor from OpenTelemetry because it is non-blocking if the telemetry backend goes down. In contrast, "SimpleSpanProcessor processes spans as they are created." This can be helpful in development. You can use SimpleSpanProcessor with the option batch=False.

from arize.otel import register

tracer_provider = register(
    # other options...
    batch=False
)
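
The difference between the two processing modes can be illustrated with a toy model. This is a sketch only: ToySimpleProcessor and ToyBatchProcessor are hypothetical classes, not the OpenTelemetry implementations, and the real BatchSpanProcessor also exports from a background thread on a timer, which is omitted here.

```python
class ToySimpleProcessor:
    """Hypothetical: exports each span as soon as it ends."""

    def __init__(self, export):
        self.export = export

    def on_end(self, span):
        self.export([span])  # one export call per span


class ToyBatchProcessor:
    """Hypothetical: buffers ended spans and exports them in groups."""

    def __init__(self, export, max_batch_size=3):
        self.export = export
        self.max_batch_size = max_batch_size
        self.buffer = []

    def on_end(self, span):
        self.buffer.append(span)
        if len(self.buffer) >= self.max_batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.export(self.buffer)
            self.buffer = []  # start a fresh buffer after exporting


simple_calls, batch_calls = [], []
simple = ToySimpleProcessor(simple_calls.append)
batch = ToyBatchProcessor(batch_calls.append, max_batch_size=3)

for span_name in ["a", "b", "c", "d"]:
    simple.on_end(span_name)
    batch.on_end(span_name)
batch.flush()  # export whatever is still buffered

print(len(simple_calls))  # 4 export calls
print(len(batch_calls))   # 2 export calls
```

In this model the simple processor makes one export call per span, which makes spans visible immediately during development, while the batch processor makes fewer, larger calls.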

Debug

As you're setting up your tracing, it is helpful to print the created spans to the console. You can achieve this by setting log_to_console=True.

from arize.otel import register

tracer_provider = register(
    # other options...
    log_to_console=True
)

Routing Traces to Different Arize Spaces and Projects

The register_with_routing function enables dynamic routing of traces to different Arize spaces and projects. This is useful when you need to route traces from a single application to multiple Arize spaces (e.g., based on the team or service generating the request).

Usage

First, set up the tracer provider with routing enabled via register_with_routing. Note that unlike the standard register() function, you don't need to specify a single space_id or project_name upfront.

Then, use set_routing_context() in a with statement to set the space ID and project to which traces should be routed. The set_routing_context() context manager uses OpenTelemetry's context API to set routing attributes that automatically propagate to all child spans within that context. This works seamlessly with auto-instrumentors (OpenAI, LangChain, LlamaIndex, etc.) because the routing attributes are inherited by all spans created within the context.

from arize.otel import register_with_routing, set_routing_context
from openai import OpenAI
from openinference.instrumentation.openai import OpenAIInstrumentor

tracer_provider = register_with_routing(
    api_key="your-arize-api-key",
    # endpoint and transport are optional and default to Arize's GRPC endpoint
)

OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

# The OpenAI client reads OPENAI_API_KEY from the environment
openai_client = OpenAI()

current_project_id = "project-123"
current_space_id = "current-space"

# Both space_id and project_name must be provided for routing to work;
# otherwise, spans will be skipped and not sent to Arize
with set_routing_context(space_id=current_space_id, project_name=current_project_id):
    # All OpenAI calls and spans within this context will be routed to the specified space and project
    response = openai_client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    # Traces automatically go to "current-space" with project name "project-123"

Performance Considerations

The routing processor creates a dedicated span processor (with its own exporter) for each unique space_id encountered. These processors are cached in memory for the lifetime of the application. If your application routes to many different spaces (e.g., hundreds or thousands), memory usage will grow accordingly.
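
The caching behavior described above can be pictured with a small model. This is a sketch under stated assumptions: ToyRoutingCache is a hypothetical class, and object() stands in for a real span processor and its exporter.

```python
class ToyRoutingCache:
    """Hypothetical model: one cached processor per unique space_id."""

    def __init__(self):
        self._processors = {}

    def processor_for(self, space_id):
        # Create a processor the first time a space_id is seen, then reuse it
        if space_id not in self._processors:
            self._processors[space_id] = object()  # stand-in for processor + exporter
        return self._processors[space_id]

    def __len__(self):
        return len(self._processors)


cache = ToyRoutingCache()
for space_id in ["team-a", "team-b", "team-a", "team-c"]:
    cache.processor_for(space_id)

print(len(cache))  # 3
```

Entries are never evicted in this model, so the cache grows with the number of distinct spaces rather than with the number of spans.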

Using Environment Variables

The register function will read from environment variables if the arguments are not passed:

from arize.otel import register

# With no arguments, register falls back to these environment variables:
#   space_id     <- ARIZE_SPACE_ID
#   api_key      <- ARIZE_API_KEY
#   project_name <- ARIZE_PROJECT_NAME
#   endpoint     <- ARIZE_COLLECTOR_ENDPOINT (defaults to Endpoint.ARIZE)
tracer_provider = register()

In the event of conflict, if an environment variable is set but a different argument is passed, the argument passed will take precedence and the environment variable will be ignored.
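
The precedence rule can be sketched as follows. The helper resolve_setting is hypothetical and only illustrates the rule; it is not part of arize-otel.

```python
import os


def resolve_setting(argument, env_var, default=None):
    """Prefer an explicitly passed argument, then the environment, then a default."""
    if argument is not None:
        return argument
    return os.environ.get(env_var, default)


os.environ["ARIZE_PROJECT_NAME"] = "from-env"

print(resolve_setting(None, "ARIZE_PROJECT_NAME"))             # from-env
print(resolve_setting("from-argument", "ARIZE_PROJECT_NAME"))  # from-argument
```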

Using OTel Primitives

For more granular tracing configuration, these wrappers can be used as drop-in replacements for OTel primitives:

from opentelemetry import trace as trace_api
from arize.otel import HTTPSpanExporter, TracerProvider, SimpleSpanProcessor

tracer_provider = TracerProvider()
span_exporter = HTTPSpanExporter(endpoint=...)
span_processor = SimpleSpanProcessor(span_exporter=span_exporter)
tracer_provider.add_span_processor(span_processor)
trace_api.set_tracer_provider(tracer_provider)

The wrappers have Arize-aware defaults that greatly simplify OTel configuration. To simplify setup further, you can pass a special endpoint keyword argument to a TracerProvider, SimpleSpanProcessor, or BatchSpanProcessor, and the appropriate SpanExporter is inferred automatically.

Specifying the endpoint directly

from opentelemetry import trace as trace_api
from arize.otel import TracerProvider

tracer_provider = TracerProvider(endpoint="https://your-desired-endpoint.com")
trace_api.set_tracer_provider(tracer_provider)

Configuring resources

# export ARIZE_COLLECTOR_ENDPOINT=https://your-desired-endpoint.com

from opentelemetry import trace as trace_api
from arize.otel import Resource, PROJECT_NAME, TracerProvider

tracer_provider = TracerProvider(resource=Resource({PROJECT_NAME: "my-project"}))
trace_api.set_tracer_provider(tracer_provider)

Using a BatchSpanProcessor

# export ARIZE_COLLECTOR_ENDPOINT=https://your-desired-endpoint.com

from opentelemetry import trace as trace_api
from arize.otel import TracerProvider, BatchSpanProcessor

tracer_provider = TracerProvider()
batch_processor = BatchSpanProcessor()
tracer_provider.add_span_processor(batch_processor)
trace_api.set_tracer_provider(tracer_provider)

Specifying a custom GRPC endpoint

from opentelemetry import trace as trace_api
from arize.otel import TracerProvider, BatchSpanProcessor, GRPCSpanExporter

tracer_provider = TracerProvider()
batch_processor = BatchSpanProcessor(
    span_exporter=GRPCSpanExporter(endpoint="https://your-desired-endpoint.com")
)
tracer_provider.add_span_processor(batch_processor)
trace_api.set_tracer_provider(tracer_provider)

Questions?

Find us in our Slack Community or email support@arize.com

Copyright, Patent, and License

Copyright 2024 Arize AI, Inc. All Rights Reserved.

This software is licensed under the terms of the 3-Clause BSD License. See LICENSE.
