adham90/opentrace-ruby

OpenTrace Ruby

License: MIT

A thin, safe Ruby client that forwards structured application logs to an OpenTrace server over HTTP.

OpenTrace Server -- This gem requires a running OpenTrace server. OpenTrace is a self-hosted observability tool for logs, database monitoring, and intelligent alerting. See the server repo for setup instructions.

This gem will never crash or slow down your application. All network errors are swallowed silently. If the server is unreachable, logs are dropped -- your app continues running normally.

Features

Core

  • Zero-risk integration -- all errors swallowed, never raises to host app
  • Async dispatch -- logs are queued in-memory and sent via a background thread
  • Batch sending -- groups logs into configurable batches for efficient network usage
  • Bounded queue -- caps at 1,000 entries to prevent memory bloat
  • Smart truncation -- oversized payloads are truncated instead of silently dropped
  • Works with any server -- Puma (threads), Unicorn (forks), Passenger, and Falcon (fibers)
  • Fork safe -- detects forked worker processes and re-initializes cleanly
  • Fiber safe -- uses Fiber[] storage for correct request isolation in fiber-based servers
  • Level filtering -- min_level threshold or allowed_levels list to control which severities are forwarded
  • Auto-enrichment -- every log includes hostname, pid, and git_sha automatically
  • Runtime controls -- enable/disable logging at runtime without restarting
  • Graceful shutdown -- pending logs are flushed automatically on process exit
  • Adaptive sampling -- graduated backpressure reduces overhead under load (configurable sample_rate)
  • Deferred payloads -- request thread pushes frozen arrays; heavy work runs on background thread

Instrumentation

  • Custom instrumentation -- OpenTrace.trace("stripe.charge") { ... } with nested spans and timing
  • Exception helper -- OpenTrace.error captures class, message, cleaned backtrace, cause chain, and error fingerprint
  • Exception cause chaining -- walks exception.cause up to 5 levels deep
  • Breadcrumbs -- OpenTrace.add_breadcrumb records a trail of events attached to errors
  • Source code context -- captures surrounding source lines at the error origin (opt-in)
  • Local variables capture -- OpenTrace.capture_binding(e, binding) snapshots variables at crash point (opt-in)
  • Transaction naming -- OpenTrace.set_transaction_name for custom grouping
  • Business events -- OpenTrace.event sends typed events (e.g. payment.completed) that bypass level filtering
  • Context support -- attach global metadata to every log via Hash or Proc

Rails Integration

  • Auto-instrumentation -- controllers, SQL queries, ActiveJob, views, cache, deprecation warnings
  • Rack middleware -- propagates request_id via fiber-local storage
  • Per-request summary -- one rich log per request with SQL, view, cache breakdown and timeline
  • N+1 query detection -- warns when a request exceeds 20 SQL queries
  • Duplicate query detection -- fingerprints SQL queries to find repeated patterns
  • SQL normalization -- replaces literals with ? for grouping; generates stable fingerprints
  • EXPLAIN plan capture -- runs EXPLAIN on slow queries asynchronously on background thread (opt-in)
  • Log trace injection -- injects [trace_id=xxx request_id=yyy] into Rails logger output (opt-in)
  • Session tracking -- extracts session ID from rack session or cookies (opt-in)
  • Logger wrapper -- drop-in replacement that forwards to OpenTrace while keeping your original logger
  • Rails 7.1+ BroadcastLogger -- native support via broadcast_to
  • TaggedLogging -- preserves ActiveSupport::TaggedLogging tags in metadata
  • Error fingerprinting -- stable fingerprint for grouping identical errors across requests
  • Deprecation tracking -- captures Rails deprecation warnings with callsite

Data Protection

  • PII scrubbing -- automatic detection and redaction of emails, credit cards, SSNs, tokens, passwords (opt-in)
  • Lifecycle callbacks -- on_error, after_send, before_breadcrumb, before_send hooks
  • Before-send filter -- drop or modify payloads before delivery

Monitoring

  • DB pool monitoring -- background thread reports connection pool saturation (opt-in)
  • Job queue depth -- monitors Sidekiq, GoodJob, or SolidQueue queue sizes (opt-in)
  • Memory delta tracking -- snapshots process RSS before/after each request (opt-in)
  • External HTTP tracking -- captures outbound Net::HTTP calls with timing (opt-in)
  • GC/Runtime metrics -- periodic collection of GC stats, thread count, and process RSS (opt-in)

Delivery

  • Distributed tracing -- W3C Trace Context (traceparent) propagation across services with span IDs
  • Unix socket transport -- 2-5x faster delivery for co-located servers with automatic HTTP fallback (opt-in)
  • Gzip compression -- automatic payload compression for bandwidth reduction
  • Version negotiation -- startup compatibility check with capability-based feature detection
  • Circuit breaker -- stops sending when server is unreachable, resumes after cooldown
  • Exponential backoff -- retries with jitter on server errors

Installation

Add to your Gemfile:

gem "opentrace"

Then run:

bundle install

Or install directly:

gem install opentrace

Quick Start

OpenTrace.configure do |c|
  c.endpoint = "https://opentrace.example.com"
  c.api_key  = ENV["OPENTRACE_API_KEY"]
  c.service  = "my-app"
end

OpenTrace.log("INFO", "User signed in", { user_id: 42 })

That's it. Logs are queued and sent asynchronously -- your code never blocks.

Configuration

OpenTrace.configure do |c|
  # Required
  c.endpoint    = "https://opentrace.example.com"
  c.api_key     = ENV["OPENTRACE_API_KEY"]
  c.service     = "billing-api"

  # Optional
  c.environment = "production"           # default: nil
  c.timeout     = 1.0                    # HTTP timeout in seconds (default: 1.0)
  c.enabled     = true                   # default: true
  c.min_level   = :info                  # minimum level to forward (default: :info)
  c.allowed_levels = [:warn, :error]     # explicit level list (overrides min_level, default: nil)
  c.batch_size  = 50                     # logs per batch (default: 50)
  c.flush_interval = 5.0                 # seconds between flushes (default: 5.0)

  # Global context -- attached to every log entry
  c.context = { deploy_version: "v1.2.3" }
  # Or use a Proc for dynamic context:
  c.context = -> { { tenant_id: Current.tenant&.id } }

  # Auto-populated (override if needed)
  c.hostname = Socket.gethostname        # auto-detected
  c.pid      = Process.pid               # auto-detected
  c.git_sha  = ENV["REVISION"]           # checks REVISION, GIT_SHA, HEROKU_SLUG_COMMIT

  # SQL logging (Rails only)
  c.sql_logging = false                  # forward individual SQL queries (default: false)
  c.sql_duration_threshold_ms = 100.0    # only log queries slower than this (default: 0.0 = all)

  # Path filtering (defaults include /up, /health, /healthz, /ping, /ready, /livez, /readyz)
  c.ignore_paths = ["/health", %r{\A/assets/}]  # customize paths to skip

  # Per-request summary (Rails only)
  c.request_summary = true               # accumulate events into one rich log (default: true)
  c.timeline = false                     # include event timeline in summary (default: false)
  c.timeline_max_events = 200            # cap timeline entries (default: 200)

  # Background monitors (opt-in)
  c.pool_monitoring = false              # DB connection pool stats (default: false)
  c.pool_monitoring_interval = 30        # seconds between checks (default: 30)
  c.queue_monitoring = false             # job queue depth monitoring (default: false)
  c.queue_monitoring_interval = 60       # seconds between checks (default: 60)

  # Advanced opt-in features
  c.memory_tracking = false              # RSS delta per request (default: false)
  c.http_tracking = false                # external HTTP call tracking (default: false)

  # Sampling & performance
  c.sample_rate = 1.0                    # 0.0-1.0, fraction of requests to trace (default: 1.0)
  c.sampler = ->(env) { 0.1 }           # dynamic per-endpoint sampler (default: nil)
  c.before_send = ->(payload) { payload } # filter/drop payloads before delivery (default: nil)

  # SQL normalization (default: true)
  c.sql_normalization = true             # replace SQL literals with ? for grouping

  # Instrumentation
  c.source_context = false               # capture source code around errors (default: false)
  c.local_vars_capture = false           # enable OpenTrace.capture_binding (default: false)
  c.log_trace_injection = false          # inject trace_id into Rails logger (default: false)
  c.session_tracking = false             # extract session ID from cookies (default: false)

  # EXPLAIN plan capture
  c.explain_slow_queries = false         # run EXPLAIN on slow queries (default: false)
  c.explain_threshold_ms = 100.0         # threshold in ms (default: 100.0)

  # PII protection
  c.pii_scrubbing = false                # scrub PII from metadata (default: false)
  c.pii_patterns = [/CUST-\d{8}/]       # additional patterns (default: nil)
  c.pii_disabled_patterns = [:phone]     # disable built-in patterns (default: nil)

  # Lifecycle callbacks
  c.on_error = ->(exc, meta) { }         # called on error capture (default: nil)
  c.after_send = ->(batch_size, bytes) { } # called after delivery (default: nil)
  c.before_breadcrumb = ->(crumb) { crumb } # filter breadcrumbs (default: nil)

  # GC/Runtime metrics
  c.runtime_metrics = false              # collect GC/thread/memory stats (default: false)
  c.runtime_metrics_interval = 30        # seconds between collections (default: 30)

  # Unix socket transport
  c.transport = :http                    # :http or :unix_socket (default: :http)
  c.socket_path = "/tmp/opentrace.sock"  # path to Unix socket (default: "/tmp/opentrace.sock")
end

If any required field (endpoint, api_key, service) is missing or empty, the gem disables itself automatically. No errors, no logs sent.

Level Filtering

Control which log levels are forwarded with min_level (threshold) or allowed_levels (explicit list):

OpenTrace.configure do |c|
  # ...
  # Option A: Threshold — forward this level and above
  c.min_level = :warn  # only forward WARN, ERROR, and FATAL

  # Option B: Explicit list — forward only these levels (overrides min_level)
  c.allowed_levels = [:warn, :error]  # only forward WARN and ERROR
end

When allowed_levels is set, it takes precedence over min_level. When allowed_levels is nil (the default), min_level is used.

Available levels: :debug, :info, :warn, :error, :fatal

Usage

Direct Logging

OpenTrace.log("INFO", "User signed in", { user_id: 42, ip: "1.2.3.4" })

OpenTrace.log("ERROR", "Payment failed", {
  trace_id: "abc-123",
  user_id: 99,
  exception: {
    class: "Stripe::CardError",
    message: "Your card was declined"
  }
})

Pass trace_id inside metadata and it will be promoted to a top-level field automatically.

Exception Logging

Use OpenTrace.error to log exceptions with automatic class, message, backtrace, and fingerprint extraction:

begin
  dangerous_operation
rescue => e
  OpenTrace.error(e, { user_id: current_user.id, action: "checkout" })
end

This captures:

  • exception_class -- the exception class name
  • exception_message -- truncated to 500 characters
  • backtrace -- cleaned (Rails backtrace cleaner or gem-filtered), limited to 15 frames
  • error_fingerprint -- 12-char hash for grouping identical errors (stable across line number changes)
  • exception_causes -- full cause chain (up to 5 levels via exception.cause)
  • breadcrumbs -- trail of events leading up to the error (if any were added)
  • source_context -- surrounding source code lines at the error origin (when enabled)
  • local_variables -- variable state at crash point (when capture_binding was called)

Custom Instrumentation

Trace any block of code with automatic timing:

OpenTrace.trace("stripe.charge", resource: "Invoice") do |span|
  span.set_tag(:amount, 2000)
  Stripe::Charge.create(amount: 2000, currency: "usd")
end

Spans can be nested -- child spans automatically track their parent:

OpenTrace.trace("checkout.process") do
  OpenTrace.trace("checkout.validate") { validate_cart }
  OpenTrace.trace("checkout.charge") { charge_card }
  OpenTrace.trace("checkout.fulfill") { create_order }
end

Each span emits a log entry with span_operation, span_duration_ms, and parent/child IDs. When a RequestCollector is active, spans also appear in the request summary.

Breadcrumbs

Record a trail of events leading up to an error:

OpenTrace.add_breadcrumb("auth", "User logged in", { provider: "google" })
OpenTrace.add_breadcrumb("nav", "Visited /settings")
OpenTrace.add_breadcrumb("action", "Changed password")

Breadcrumbs are stored per-request (Fiber-local, max 25) and automatically attached to error payloads. They are cleared after each request.
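A capped, fiber-local trail can be sketched like this (an illustration, not the gem's source; the hash shape and `MAX_CRUMBS` name are assumptions, and `Fiber[]` requires Ruby 3.2+):

```ruby
# Fiber-local breadcrumb trail, capped at 25 entries as documented.
MAX_CRUMBS = 25

def add_breadcrumb(category, message, data = {})
  trail = (Fiber[:breadcrumbs] ||= [])
  trail << { category: category, message: message, data: data, at: Time.now.utc }
  trail.shift while trail.size > MAX_CRUMBS # keep only the most recent 25
end
```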

Filter breadcrumbs with a callback:

OpenTrace.configure do |c|
  c.before_breadcrumb = ->(crumb) {
    crumb.category == "noisy" ? nil : crumb  # return nil to drop
  }
end

Local Variables Capture

Capture the state of local variables when an error occurs:

OpenTrace.configure do |c|
  c.local_vars_capture = true
end

def update_profile(user, params)
  user.update!(params)
rescue ActiveRecord::RecordInvalid => e
  OpenTrace.capture_binding(e, binding)  # explicit capture
  OpenTrace.error(e, { action: "update_profile" })
  raise
end

This produces:

{
  "local_variables": [
    { "name": "user", "value": "#<User id: 42, name: nil>", "type": "User" },
    { "name": "params", "value": "{\"name\"=>\"\"}", "type": "Hash" }
  ]
}

Capped at 10 variables, 500 chars per value. No global VM hooks -- zero overhead unless you explicitly call capture_binding.

Transaction Naming

Override the auto-detected transaction name for custom grouping:

OpenTrace.set_transaction_name("API::V2::Users#search")

The transaction name appears in the request log message and metadata, enabling grouping by business operation instead of route.

Business Events

Use OpenTrace.event to send typed business events. Events always send at INFO level and bypass level filtering — they are never suppressed by min_level or allowed_levels:

OpenTrace.event("payment.completed", "User paid $49.99", { user_id: 42, amount: 49.99 })
OpenTrace.event("auth.login", "Google OAuth login", { provider: "google", user_id: 7 })
OpenTrace.event("order.shipped", "Order dispatched", { order_id: "ORD-123" })

Events include an event_type field in the payload, making them filterable on the server. They inherit context, request_id, and static context just like normal logs.

Logger Wrapper

Wrap any Ruby Logger to forward all log output to OpenTrace while keeping the original logger working exactly as before:

require "logger"

logger = Logger.new($stdout)
logger = OpenTrace::Logger.new(logger)

logger.info("This goes to STDOUT and to OpenTrace")
logger.error("So does this")

Attach default metadata to every log from this logger:

logger = OpenTrace::Logger.new(original_logger, metadata: { component: "worker" })
logger.info("Processing job")
# metadata: { component: "worker" }
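The wrapper's shape can be sketched as follows. This is an illustration of the delegation pattern, not the gem's source: the `ForwardingLogger` name is hypothetical and the `captured` array stands in for the call to OpenTrace.

```ruby
require "logger"

# Forwards every entry to the inner logger (so it keeps working exactly as
# before) while mirroring severity, message, and default metadata elsewhere.
class ForwardingLogger
  SEVERITIES = { debug: Logger::DEBUG, info: Logger::INFO, warn: Logger::WARN,
                 error: Logger::ERROR, fatal: Logger::FATAL }.freeze

  attr_reader :captured

  def initialize(inner, metadata: {})
    @inner, @metadata, @captured = inner, metadata, []
  end

  SEVERITIES.each do |name, severity|
    define_method(name) do |message = nil, &block|
      msg = message || block&.call
      @captured << [severity, msg, @metadata] # stand-in for OpenTrace.log
      @inner.public_send(name, msg)           # original logger untouched
    end
  end
end

logger = ForwardingLogger.new(Logger.new($stdout), metadata: { component: "worker" })
logger.info("Processing job")
```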

Global Context

Attach metadata to every log entry using config.context:

# Static context
OpenTrace.configure do |c|
  # ...
  c.context = { deploy_version: "v1.2.3", region: "us-east-1" }
end

# Dynamic context (evaluated on each log call)
OpenTrace.configure do |c|
  # ...
  c.context = -> { { tenant: Current.tenant&.slug } }
end

Context has the lowest priority -- caller-provided metadata overrides context values.
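The precedence rule amounts to a merge where caller metadata is applied last. A minimal sketch (the `merged_metadata` helper is illustrative, not the gem's internals):

```ruby
# Context (static Hash or Proc) forms the base; caller metadata wins on
# conflicting keys because it is merged on top.
def merged_metadata(context, metadata)
  base = context.respond_to?(:call) ? context.call : (context || {})
  base.merge(metadata)
end

ctx = -> { { region: "us-east-1", user_id: nil } }
merged_metadata(ctx, { user_id: 42 })
# caller's user_id overrides the context value; region is inherited
```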

Rails Integration

In a Rails app, add an initializer:

# config/initializers/opentrace.rb
OpenTrace.configure do |c|
  c.endpoint    = ENV["OPENTRACE_ENDPOINT"]
  c.api_key     = ENV["OPENTRACE_API_KEY"]
  c.service     = "my-rails-app"
  c.environment = Rails.env
end

The gem auto-detects Rails and provides the following integrations automatically:

Rack Middleware

Automatically inserted into the middleware stack. Captures request_id from action_dispatch.request_id or HTTP_X_REQUEST_ID and makes it available via OpenTrace.current_request_id. All logs within a request automatically include the request_id.

Request IDs are stored using Fiber[] (fiber-local storage), which works correctly in both threaded servers (Puma) and fiber-based servers (Falcon).

Logger Wrapping

  • Rails 7.1+: Uses BroadcastLogger#broadcast_to to register as a broadcast target (non-invasive)
  • Pre-7.1: Wraps Rails.logger with OpenTrace::Logger which delegates to the original and forwards to OpenTrace

All your existing Rails.logger.info(...) calls automatically get forwarded to OpenTrace.

Per-Request Summary

When request_summary is enabled (the default), the gem accumulates all events during a request -- SQL queries, view renders, cache operations, HTTP calls -- into a single rich log entry emitted at request end. This avoids flooding the queue with hundreds of individual events.

Example payload:

{
  "level": "INFO",
  "message": "GET /dashboard 200 2847ms",
  "metadata": {
    "request_id": "req-abc123",
    "controller": "DashboardController",
    "action": "index",
    "method": "GET",
    "path": "/dashboard",
    "status": 200,
    "duration_ms": 2847.3,

    "request_user_agent": "Mozilla/5.0...",
    "request_accept": "text/html",

    "sql_query_count": 34,
    "sql_total_ms": 423.1,
    "sql_slowest_ms": 312.0,
    "sql_slowest_name": "Order Count",
    "n_plus_one_warning": true,

    "view_render_count": 48,
    "view_total_ms": 890.2,
    "view_slowest_ms": 245.0,
    "view_slowest_template": "dashboard/_activity_feed.html.erb",

    "cache_reads": 8,
    "cache_hits": 5,
    "cache_writes": 3,
    "cache_hit_ratio": 0.63,

    "time_breakdown": {
      "sql_pct": 14.9,
      "view_pct": 31.3,
      "http_pct": 0.0,
      "other_pct": 53.8
    },

    "timeline": [
      { "t": "sql", "n": "User Load", "ms": 1.2, "at": 0.0 },
      { "t": "cache", "a": "read", "hit": true, "ms": 0.1, "at": 6.0 },
      { "t": "sql", "n": "Order Count", "ms": 312.0, "at": 10.0 },
      { "t": "view", "n": "dashboard/index.html.erb", "ms": 890.2, "at": 350.0 }
    ]
  }
}

The timeline shows a waterfall of events in chronological order. Timeline keys are kept short to minimize payload size: t = type, n = name, ms = duration, at = offset from request start, s = status, a = action.

Controller Subscriber

Subscribes to process_action.action_controller and captures:

Field Description
request_id From ActionDispatch
controller Controller class name
action Action name
method HTTP method (GET, POST, etc.)
path Request path
status HTTP response status code
duration_ms Request duration in milliseconds
user_id Auto-captured if controller responds to current_user
params Filtered request parameters (respects filter_parameters)
exception_class Exception class (if raised)
exception_message Exception message (if raised)
backtrace Cleaned backtrace (if exception raised)
error_fingerprint 12-char fingerprint for error grouping
request_content_type Request Content-Type header
request_accept Request Accept header
request_user_agent Request User-Agent (truncated to 200 chars)
request_referer Request Referer header
sql_query_count Total SQL queries in this request
sql_total_ms Total SQL time in this request
n_plus_one_warning true when query count exceeds 20

When request summary is enabled, the log also includes view render stats, cache stats, time breakdown, and timeline (see above).

Log levels are set automatically:

  • ERROR -- exceptions or 5xx status
  • WARN -- 4xx status
  • INFO -- everything else

N+1 Query Detection

Every request tracks the number of SQL queries via a Fiber-local counter. When a request exceeds 20 queries, the log entry includes n_plus_one_warning: true. This makes it easy to query OpenTrace for requests with potential N+1 issues.

SQL Query Subscriber

Subscribes to sql.active_record and logs every query with:

Field Description
sql_name Query name (e.g., "User Load")
sql Query text (truncated to 1000 chars)
sql_duration_ms Query duration in milliseconds
sql_cached Whether the result was cached
sql_table Extracted table name for filtering

SCHEMA queries (migrations, structure dumps) are automatically skipped. Queries over 1 second are logged as WARN, all others as DEBUG.

Configure SQL logging:

OpenTrace.configure do |c|
  # ...
  c.sql_logging = true                  # enable/disable (default: false)
  c.sql_duration_threshold_ms = 100.0   # only log slow queries (default: 0.0 = all)
end

ActiveJob Subscriber

Subscribes to perform.active_job and logs every job execution with:

Field Description
job_class Job class name
job_id Unique job ID
queue_name Queue the job ran on
executions Attempt number
duration_ms Execution duration
job_arguments Serialized arguments (truncated to 512 bytes)
queue_latency_ms Time spent waiting in queue before execution
enqueued_at When the job was enqueued
exception_class Exception class (if failed)
exception_message Exception message (if failed)
backtrace Cleaned backtrace (if failed)
error_fingerprint Fingerprint for error grouping (if failed)

Failed jobs are logged as ERROR, successful jobs as INFO.

Deprecation Warning Subscriber

Subscribes to deprecation.rails and logs all Rails deprecation warnings as WARN:

Field Description
deprecation_message The deprecation message (truncated to 500 chars)
deprecation_callsite File and line where the deprecated API was called
request_id Current request ID (if in web context)

View Render Tracking

When request summary is enabled, subscribes to render_template.action_view and render_partial.action_view. View render events are accumulated in the RequestCollector and included in the per-request summary -- no individual log entries are emitted for views.

The summary includes:

  • view_render_count -- total number of templates/partials rendered
  • view_total_ms -- total rendering time
  • view_slowest_ms / view_slowest_template -- the bottleneck template

Template paths are automatically shortened (e.g., /Users/deploy/app/views/orders/show.html.erb becomes orders/show.html.erb).

Cache Operation Tracking

When request summary is enabled, subscribes to cache_read.active_support, cache_write.active_support, and cache_delete.active_support. Like views, cache events are accumulated -- no individual logs.

The summary includes:

  • cache_reads / cache_hits / cache_writes
  • cache_hit_ratio -- hit rate (0.0 to 1.0)

Error Fingerprinting

Every error (in controller requests, job failures, and OpenTrace.error calls) includes an error_fingerprint -- a 12-character hash derived from the exception class and the first application frame in the backtrace. The fingerprint is:

  • Stable across deploys -- line number changes don't affect it
  • Same error, same fingerprint -- different error messages at the same location produce the same fingerprint
  • Different error, different fingerprint -- different exception classes or different code locations produce different fingerprints

Use it to group and count errors in OpenTrace.
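A fingerprint with these properties can be computed roughly like this. This is a hedged sketch, not the gem's algorithm: the hash function, frame selection, and `fingerprint_for` name are assumptions, chosen only to demonstrate line-number stability.

```ruby
require "digest"

# Hash the exception class plus the file/method of the first application
# frame, stripping the line number so deploys don't reshuffle groups.
def fingerprint_for(exception, app_root: "app/")
  frame = Array(exception.backtrace).find { |l| l.include?(app_root) } || ""
  location = frame.sub(/:\d+:in/, ":in") # drop the ":<line>" portion
  Digest::MD5.hexdigest("#{exception.class}|#{location}")[0, 12]
end

begin
  raise ArgumentError, "bad input"
rescue => e
  fingerprint_for(e, app_root: "") # 12-char group key
end
```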

TaggedLogging

If your wrapped logger uses ActiveSupport::TaggedLogging, tags are preserved and injected into the metadata:

Rails.logger.tagged("RequestID-123", "UserID-42") do
  Rails.logger.info("Processing request")
  # metadata: { tags: ["RequestID-123", "UserID-42"] }
end

Background Monitors

DB Connection Pool Monitoring

Opt-in background thread that periodically reports ActiveRecord connection pool stats:

OpenTrace.configure do |c|
  # ...
  c.pool_monitoring = true
  c.pool_monitoring_interval = 30  # seconds (default: 30)
end

Reports pool_size, connections_busy, connections_idle, threads_waiting, and checkout_timeout. Logs at WARN when threads are waiting for a connection, DEBUG otherwise.

Job Queue Depth Monitoring

Opt-in background thread that reports job queue sizes. Supports Sidekiq, GoodJob, and SolidQueue (auto-detected):

OpenTrace.configure do |c|
  # ...
  c.queue_monitoring = true
  c.queue_monitoring_interval = 60  # seconds (default: 60)
end

Reports per-queue sizes and total enqueued count. Logs at WARN when total exceeds 1,000.

Advanced Opt-In Features

These features have measurable overhead or implementation risks. Disabled by default. Enable them after testing in staging.

Memory Delta Tracking

Snapshots process memory (RSS) before and after each request:

OpenTrace.configure do |c|
  # ...
  c.memory_tracking = true
end

Adds to the request summary:

  • memory_before_mb -- RSS before request
  • memory_after_mb -- RSS after request
  • memory_delta_mb -- difference (positive = memory grew)

Uses /proc/self/statm on Linux (~10us) or GC.stat approximation on macOS (~5us). The delta is process-level, so concurrent requests will affect accuracy. Most accurate on single-threaded servers (Unicorn).
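The two sampling paths can be sketched as below. The field position in statm and the 4 KB page size are common defaults but still assumptions, and the GC-based fallback is only a rough proxy for RSS, as noted above.

```ruby
# Linux: resident pages from /proc/self/statm; elsewhere: approximate
# from live heap slots (~40 bytes each). Illustrative, not the gem's code.
PAGE_SIZE = 4096 # bytes; typical default, not guaranteed everywhere

def rss_mb
  if File.readable?("/proc/self/statm")
    pages = File.read("/proc/self/statm").split[1].to_i # field 2 = resident
    (pages * PAGE_SIZE) / (1024.0 * 1024.0)
  else
    (GC.stat(:heap_live_slots) * 40) / (1024.0 * 1024.0)
  end
end

before = rss_mb
# ... handle request ...
delta_mb = (rss_mb - before).round(2)
```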

External HTTP Tracking

Instruments outbound Net::HTTP calls to capture third-party API performance:

OpenTrace.configure do |c|
  # ...
  c.http_tracking = true
end

Adds to the request summary:

  • http_external_count -- number of outbound HTTP calls
  • http_external_total_ms -- total time in external calls
  • http_slowest_ms / http_slowest_host -- the bottleneck

Each HTTP call appears in the timeline:

{ "t": "http", "n": "POST api.stripe.com", "ms": 184.0, "s": 200, "at": 55.0 }

Failed calls include an error type:

{ "t": "http", "n": "POST api.stripe.com", "ms": 5200.0, "s": 0, "err": "Net::ReadTimeout", "at": 55.0 }

A recursion guard prevents OpenTrace's own HTTP calls to the server from being tracked. The time_breakdown in the request summary includes http_pct alongside sql_pct and view_pct.

Note: This works by prepending a module to Net::HTTP. Libraries that use Net::HTTP internally (Faraday, HTTParty, RestClient) are automatically captured.
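A prepend-based hook with a recursion guard can be sketched like this. The module name, the `$http_events` sink, and the thread-local guard key are all hypothetical; the real gem records into its RequestCollector instead.

```ruby
require "net/http"

# Wrap Net::HTTP#request via prepend: time the call, record short-keyed
# timeline data, and skip when the guard flag marks our own traffic.
module HttpTracking
  def request(req, body = nil, &block)
    return super if Thread.current[:ot_skip_http] # recursion guard
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    response = super
    ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000.0
    ($http_events ||= []) << { n: "#{req.method} #{address}",
                               ms: ms.round(1), s: response.code.to_i }
    response
  end
end

Net::HTTP.prepend(HttpTracking)
```

Because the module is prepended, it sits ahead of `Net::HTTP` in the ancestor chain, which is why Faraday, HTTParty, and RestClient traffic is captured too.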

Source Code Context

When enabled, error logs include the surrounding source lines at the crash location:

OpenTrace.configure do |c|
  c.source_context = true
end

Produces:

{
  "source_context": {
    "file": "app/models/order.rb",
    "line": 42,
    "context": {
      "39": "  def total",
      "40": "    items.sum(:price) +",
      "41": "      tax_amount +",
      "42": "      shipping_cost",
      "43": "  end"
    }
  }
}

Files are cached (LRU, 50 files max). Only reads files under /app/, /lib/, or /config/ and smaller than 100KB.

SQL Normalization

Enabled by default. Replaces SQL literals with ? placeholders for grouping:

SELECT * FROM users WHERE id = 42 AND email = 'alice@example.com'
  => SELECT * FROM users WHERE id = ? AND email = ?

Each normalized query gets a 12-char fingerprint (MD5) for fast grouping. The request summary includes top_duplicates with the most repeated query patterns.
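A minimal version of this normalization looks like the sketch below. The real normalizer almost certainly handles more cases (IN lists, casts, comments); these two regexes are illustrative only.

```ruby
require "digest"

# Replace string literals (including '' escapes) and bare numbers with ?,
# then take the first 12 hex chars of an MD5 as the grouping fingerprint.
def normalize_sql(sql)
  sql.gsub(/'(?:[^']|'')*'/, "?")
     .gsub(/\b\d+(\.\d+)?\b/, "?")
end

def sql_fingerprint(sql)
  Digest::MD5.hexdigest(normalize_sql(sql))[0, 12]
end

normalize_sql("SELECT * FROM users WHERE id = 42 AND email = 'a@b.com'")
# => "SELECT * FROM users WHERE id = ? AND email = ?"
```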

Duplicate Query Detection

When a RequestCollector is active, SQL queries are fingerprinted and counted. The request summary includes:

  • duplicate_queries -- number of fingerprints seen more than once
  • worst_duplicate_count -- highest repeat count
  • top_duplicates -- top 3 repeated queries with count and fingerprint
  • n_plus_one_warning -- true when worst duplicate exceeds 5

Log Trace Injection

Injects trace context into your existing Rails logger output:

OpenTrace.configure do |c|
  c.log_trace_injection = true
end

Before: Processing by UsersController#show
After:  [trace_id=abc123 request_id=req-456] Processing by UsersController#show

Session Tracking

Extracts session ID from the Rack session or session cookie:

OpenTrace.configure do |c|
  c.session_tracking = true
end

The session ID appears in request metadata, enabling session-level analysis.

PII Scrubbing

Automatically detects and redacts sensitive data before sending:

OpenTrace.configure do |c|
  c.pii_scrubbing = true
end

Built-in patterns detect: credit card numbers, email addresses, SSNs, phone numbers, bearer tokens, and API keys. Sensitive keys (password, secret, token, api_key, authorization) are always redacted.
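The shape of such a redaction pass can be sketched as follows. The pattern set, the `[REDACTED]` marker, and the `scrub` helper are assumptions for illustration; the gem's built-in patterns are broader.

```ruby
# Redact by key (always) and by pattern (inside string values).
SENSITIVE_KEYS = %w[password secret token api_key authorization].freeze
EMAIL = /\b[\w+.-]+@[\w.-]+\.\w+\b/

def scrub(metadata)
  metadata.to_h do |key, value|
    if SENSITIVE_KEYS.include?(key.to_s)
      [key, "[REDACTED]"]
    elsif value.is_a?(String)
      [key, value.gsub(EMAIL, "[REDACTED]")]
    else
      [key, value]
    end
  end
end

scrub({ password: "hunter2", note: "mail alice@example.com" })
```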

Customize patterns:

OpenTrace.configure do |c|
  c.pii_scrubbing = true
  c.pii_patterns = [/CUST-\d{8}/]        # add custom patterns
  c.pii_disabled_patterns = [:phone]       # disable built-in patterns
end

PII scrubbing runs on the background thread -- zero request-thread overhead.

EXPLAIN Plan Capture

Automatically captures EXPLAIN output for slow SQL queries:

OpenTrace.configure do |c|
  c.explain_slow_queries = true
  c.explain_threshold_ms = 50.0  # queries slower than 50ms
end

The SQL text is captured on the request thread (zero DB overhead). EXPLAIN is executed asynchronously on the background thread using a separate DB connection. Max 3 EXPLAIN queries per request.

{
  "explain_plans": [{
    "sql": "SELECT * FROM orders WHERE user_id = 42 ORDER BY created_at DESC",
    "duration_ms": 87.3,
    "explain_plan": "Seq Scan on orders  (cost=0.00..45892.00 rows=12 width=380)\n  Filter: (user_id = 42)"
  }]
}

GC/Runtime Metrics

Background thread collects Ruby runtime stats at a configurable interval:

OpenTrace.configure do |c|
  c.runtime_metrics = true
  c.runtime_metrics_interval = 30  # seconds (default: 30)
end

Metrics collected: gc_count, gc_major_count, gc_minor_count, gc_heap_live_slots, gc_heap_free_slots, gc_malloc_increase_bytes, thread_count, process_rss_mb, process_pid. Sent as event_type: "runtime.metrics".

Unix Socket Transport

For deployments where the OpenTrace server runs on the same host:

OpenTrace.configure do |c|
  c.transport = :unix_socket
  c.socket_path = "/var/run/opentrace.sock"
end

2-5x faster than HTTP for local delivery (no TCP/TLS overhead). Falls back to HTTP automatically if the socket is unavailable. Same batching, compression, and retry guarantees.

Lifecycle Callbacks

Hook into key lifecycle events:

OpenTrace.configure do |c|
  # Called when OpenTrace.error captures an exception
  c.on_error = ->(exception, metadata) {
    Sentry.capture_exception(exception) if metadata[:exception_class] == "CriticalError"
  }

  # Called after each successful batch delivery
  c.after_send = ->(batch_size, bytes) {
    StatsD.histogram("opentrace.batch_size", batch_size)
  }

  # Filter or modify breadcrumbs
  c.before_breadcrumb = ->(crumb) {
    crumb.category == "secret" ? nil : crumb  # return nil to drop
  }

  # Filter or modify entire payloads before delivery
  c.before_send = ->(payload) {
    payload[:metadata].delete(:internal_debug) if Rails.env.production?
    payload  # return nil to drop the entire payload
  }
end

All callbacks are wrapped in rescue -- a broken callback will never affect the host app.

Runtime Controls

OpenTrace.enabled?  # check if logging is active
OpenTrace.disable!  # turn off (logs are silently dropped)
OpenTrace.enable!   # turn back on

Graceful Shutdown

An at_exit hook is registered automatically to flush pending logs (up to 2 seconds) when the process exits. No configuration needed.

For manual control (e.g. a Sidekiq worker), you can drain the queue explicitly:

OpenTrace.shutdown(timeout: 5)

This gives the background thread up to 5 seconds to send any remaining queued logs.

Server Compatibility

OpenTrace works with any Rack-compatible Ruby web server:

Server Concurrency Support
Puma Threads Full support
Unicorn Forked workers Full support (fork-safe)
Passenger Forks + threads Full support (fork-safe)
Falcon Fibers Full support (fiber-safe)

Fork safety: When a process forks (Puma cluster mode, Unicorn, Passenger), the background dispatch thread from the parent is dead in the child. OpenTrace detects the fork via PID check and cleanly re-initializes the queue, mutex, and thread.

Fiber safety: Request IDs use Fiber[] storage instead of Thread.current, so concurrent requests on the same thread (as in Falcon) are correctly isolated.
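The isolation property is easy to demonstrate with two interleaved fibers on one thread (Ruby 3.2+ fiber storage; the `observed_request_ids` helper is just for illustration):

```ruby
# Each fiber writes its own :request_id into Fiber[] storage, pauses, and
# reads it back after the other fiber has run. Thread.current storage
# would leak one fiber's value into the other here.
def observed_request_ids
  ids = []
  f1 = Fiber.new { Fiber[:request_id] = "req-1"; Fiber.yield; ids << Fiber[:request_id] }
  f2 = Fiber.new { Fiber[:request_id] = "req-2"; Fiber.yield; ids << Fiber[:request_id] }
  f1.resume; f2.resume # both fibers set their id, then pause
  f1.resume; f2.resume # each reads back its own id, untouched by the other
  ids
end
```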

How It Works

Your App --log()--> [In-Memory Queue] --background thread--> POST /api/logs --> OpenTrace Server
  • Logs are serialized to JSON and pushed onto an in-memory queue
  • A single background thread reads from the queue and sends batches via POST /api/logs
  • enqueue is non-blocking -- it uses try_lock so it never waits on a mutex
  • The thread is started lazily on the first log call -- no threads are created at boot
  • If the queue exceeds 1,000 items, new logs are dropped (oldest are preserved)
  • Payloads exceeding 256 KB (configurable via max_payload_bytes) are intelligently truncated (backtrace, params, SQL removed first)
  • If still too large after truncation, the payload is split and retried in smaller batches
  • Failed requests are retried with exponential backoff (up to 3 attempts by default)
  • A circuit breaker stops sending when the server is unreachable, resuming after a cooldown
  • Rate-limited responses (429) trigger a backoff delay, respecting the server's Retry-After header
  • Authentication failures (401) suspend sending and print a one-time warning to STDERR
  • The HTTP timeout defaults to 1 second
  • Pending logs are flushed on process exit via an at_exit hook
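The bounded, non-blocking enqueue described above can be sketched like this. It is a simplified illustration, not the gem's internals: `try_lock` means the request thread never waits on a contended mutex, and the size cap drops new entries once the queue is full.

```ruby
# Illustrative bounded queue with a non-blocking push.
class BoundedQueue
  MAX = 1_000

  def initialize
    @mutex = Mutex.new
    @items = []
  end

  # Returns true if enqueued, false if dropped
  # (either the lock was busy or the queue was full).
  def push(entry)
    return false unless @mutex.try_lock
    begin
      return false if @items.size >= MAX  # queue full: drop the newcomer
      @items << entry
      true
    ensure
      @mutex.unlock
    end
  end

  def size
    @mutex.synchronize { @items.size }
  end
end
```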

Retry & Circuit Breaker

Failed HTTP requests are retried with exponential backoff and jitter. Only server errors (5xx) and network failures are retried -- client errors (4xx) are not.

OpenTrace.configure do |c|
  # ...
  c.max_retries      = 2    # up to 3 total attempts (default: 2)
  c.retry_base_delay = 0.1  # 100ms initial backoff (default: 0.1)
  c.retry_max_delay  = 2.0  # cap backoff at 2 seconds (default: 2.0)
end
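The three knobs combine roughly as sketched below -- a hedged approximation, not the gem's exact formula: delays grow exponentially from `retry_base_delay`, are capped at `retry_max_delay`, and are randomized with jitter so many clients don't retry in lockstep.

```ruby
BASE = 0.1  # retry_base_delay
MAX  = 2.0  # retry_max_delay

# Delay before retry `attempt` (0-based), with +/-50% jitter.
def backoff_delay(attempt, jitter: rand)
  exp = BASE * (2**attempt)       # 0.1, 0.2, 0.4, 0.8, ...
  capped = [exp, MAX].min         # never exceed the configured cap
  capped * (0.5 + 0.5 * jitter)   # spread retries across a jitter band
end
```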

A circuit breaker prevents wasting resources when the server is down. After a threshold of consecutive failures, the circuit opens and all sends are skipped. After a cooldown, a single probe request is sent. If it succeeds, the circuit closes and normal operation resumes.

OpenTrace.configure do |c|
  # ...
  c.circuit_breaker_threshold = 5   # failures before opening (default: 5)
  c.circuit_breaker_timeout   = 30  # seconds before probe (default: 30)
end
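The state machine just described (closed, open, half-open probe) can be sketched as a small standalone class. The names and the injected clock are illustrative, not the gem's implementation:

```ruby
# Minimal circuit breaker: opens after `threshold` consecutive failures,
# allows a single probe after `timeout` seconds, closes again on success.
class CircuitBreaker
  def initialize(threshold: 5, timeout: 30, clock: -> { Time.now.to_f })
    @threshold = threshold
    @timeout = timeout
    @clock = clock
    @failures = 0
    @opened_at = nil
  end

  def state
    return :closed unless @opened_at
    @clock.call - @opened_at >= @timeout ? :half_open : :open
  end

  def allow_request?
    state != :open  # half-open permits the probe request
  end

  def record_success
    @failures = 0
    @opened_at = nil
  end

  def record_failure
    @failures += 1
    # At or past the threshold, (re)start the cooldown window.
    @opened_at = @clock.call if @failures >= @threshold
  end
end
```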

Backpressure Handling

The client responds intelligently to HTTP status codes:

Status     Behavior
2xx        Success -- circuit breaker resets
429        Rate limited -- pauses for Retry-After seconds (or rate_limit_backoff), re-enqueues the batch
401        Auth failed -- suspends sending, prints a one-time STDERR warning; resumes after OpenTrace.configure
5xx        Server error -- retried with backoff, counts toward the circuit breaker
Other 4xx  Client error -- batch dropped silently

OpenTrace.configure do |c|
  # ...
  c.rate_limit_backoff = 5.0  # fallback when Retry-After header is missing (default: 5.0)
end
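The 429 pause duration might be picked as sketched below (illustrative, not the gem's code). Note this sketch only handles the numeric seconds form of Retry-After; the header may also be an HTTP-date, which falls through to the fallback here.

```ruby
# Honor a numeric Retry-After header; otherwise use rate_limit_backoff.
def rate_limit_pause(retry_after_header, fallback: 5.0)
  seconds = Float(retry_after_header, exception: false)
  seconds && seconds >= 0 ? seconds : fallback
end
```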

Delivery Observability

The client exposes internal delivery statistics so you can monitor the health of the log pipeline:

OpenTrace.stats
# => {
#   enqueued: 15234,
#   delivered: 15100,
#   dropped_queue_full: 34,
#   dropped_circuit_open: 100,
#   dropped_auth_suspended: 0,
#   dropped_error: 0,
#   retries: 12,
#   rate_limited: 2,
#   auth_failures: 0,
#   payload_splits: 1,
#   batches_sent: 302,
#   bytes_sent: 4812300,
#   queue_size: 23,
#   circuit_state: :closed,
#   auth_suspended: false,
#   uptime_seconds: 3600
# }

OpenTrace.healthy?      # true when circuit is closed and auth is not suspended
OpenTrace.reset_stats!  # reset counters (useful after reading/reporting)

Drop Callback

Register a callback to be notified when logs are dropped. The callback receives the count of dropped items and the reason:

OpenTrace.configure do |c|
  # ...
  c.on_drop = ->(count, reason) {
    StatsD.increment("opentrace.dropped", count, tags: ["reason:#{reason}"])
  }
end

Reasons: :queue_full, :circuit_open, :auth_suspended, :error

The callback is invoked synchronously, but exceptions are always swallowed -- a broken callback will never affect the client.
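That "swallow everything" invocation pattern amounts to something like this sketch (the helper name `safe_invoke` is illustrative, not the gem's API):

```ruby
# Invoke a user-supplied callback; never let its failure propagate.
def safe_invoke(callback, *args)
  callback&.call(*args)
rescue StandardError
  nil  # a broken callback must never reach the host app
end
```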

Gzip Compression

Outgoing batches are automatically gzip-compressed when they exceed the compression threshold (default: 1KB). This typically achieves 70-85% bandwidth reduction for log payloads with repetitive keys and values.

OpenTrace.configure do |c|
  # ...
  c.compression = true       # enable gzip compression (default: true)
  c.compression_threshold = 1024  # only compress payloads > 1KB (default: 1024)
  c.max_payload_bytes = 262_144   # max batch size before splitting (default: 256KB)
end

Compression uses Zlib::BEST_SPEED (level 1) for minimal CPU overhead (~0.14ms per batch). The server must support Content-Encoding: gzip on request bodies. OpenTrace server v0.6+ includes transparent decompression middleware.
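The threshold-gated compression can be sketched with Ruby's standard library alone (this is an illustration of the behavior, not the gem's code):

```ruby
require "json"
require "zlib"

# Serialize a batch; gzip only when the body exceeds `threshold` bytes.
# Returns [body, content_encoding] where content_encoding is "gzip" or nil.
def encode_body(batch, threshold: 1024)
  json = JSON.generate(batch)
  return [json, nil] if json.bytesize <= threshold
  [Zlib.gzip(json, level: Zlib::BEST_SPEED), "gzip"]
end
```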

Version Negotiation

On the first dispatch cycle, the client makes a lightweight GET /api/version call to discover the server's API version and capabilities. This runs once per process (or after fork) and never blocks enqueue.

# Check server capabilities programmatically
client = OpenTrace.send(:client)
client.supports?(:request_summaries)  # true if server advertises it
client.supports?(:gzip_request)       # true if server supports gzip

If the server requires a newer client API version, a warning is printed to STDERR:

[OpenTrace] Server requires API version >= 2, but this client supports version 1.
Please upgrade the opentrace gem. Log forwarding may not work correctly.

Every request includes an X-API-Version: 1 header so the server can reject incompatible clients with a clear error. Old servers without /api/version are handled gracefully -- the check silently skips and all features remain enabled.

Distributed Tracing

When trace_propagation is enabled (the default), the middleware extracts or generates a W3C-compatible trace context for each request:

  • Incoming: Reads traceparent header (W3C standard), falls back to X-Trace-ID, then X-Request-ID
  • Outgoing: When http_tracking is enabled, injects traceparent, X-Trace-ID, and X-Request-ID into outbound HTTP requests

This enables cross-service correlation -- all logs from a distributed request chain share the same trace_id.

OpenTrace.configure do |c|
  # ...
  c.trace_propagation = true   # extract/propagate trace context (default: true)
  c.http_tracking = true       # also inject into outgoing HTTP calls (opt-in)
end
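Parsing the W3C traceparent header ("version-traceid-spanid-flags") can be sketched as below; this is a simplified illustration handling only version 00, not the gem's extraction code:

```ruby
# W3C trace context, e.g. "00-<32 hex trace id>-<16 hex span id>-<flags>"
TRACEPARENT = /\A00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})\z/

def parse_traceparent(header)
  m = TRACEPARENT.match(header.to_s)
  return nil unless m  # malformed header: caller falls back to X-Trace-ID
  {
    trace_id: m[1],
    parent_span_id: m[2],
    sampled: m[3].to_i(16) & 1 == 1  # low bit of flags = sampled
  }
end
```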

Each log entry includes trace_id, span_id, and parent_span_id (when available) as top-level fields. The server indexes these for fast trace lookups.

Request Summary Architecture

When request_summary is enabled, events within a request are accumulated in a Fiber-local RequestCollector instead of being pushed to the queue individually:

Request Start
  Middleware creates RequestCollector in Fiber[]
    SQL events ──► collector.record_sql()      (no queue push)
    View events ──► collector.record_view()    (no queue push)
    Cache events ──► collector.record_cache()  (no queue push)
    HTTP events ──► collector.record_http()    (no queue push)
  Request End
    Controller subscriber builds request_summary from collector
    One queue push: metadata (user/request context) + request_summary (perf data)
  Middleware cleans up RequestCollector

This means a request with 30 SQL queries, 50 view renders, and 10 cache operations produces one log entry instead of 91.
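The accumulate-then-flush idea can be sketched with a stripped-down collector (illustrative only -- the real RequestCollector tracks views, cache, HTTP, and a timeline as well):

```ruby
# Events increment counters locally; only the final summary is enqueued.
class RequestCollector
  def initialize
    @sql_count = 0
    @sql_total_ms = 0.0
  end

  def record_sql(duration_ms)
    @sql_count += 1
    @sql_total_ms += duration_ms
  end

  # Built once at request end and pushed as a single log entry.
  def summary
    { sql_count: @sql_count, sql_total_ms: @sql_total_ms.round(1) }
  end
end
```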

Structured Request Metrics

When a RequestCollector is active, performance data is sent as a separate request_summary field instead of being merged into metadata. This allows the server to store it in a dedicated request_summaries table with indexed columns for fast analytical queries.

# Sent automatically by the Rails subscriber -- no code changes needed.
# The payload looks like:
{
  "metadata": { "request_id": "req-abc", "user_id": 42 },
  "request_summary": {
    "controller": "InvoicesController",
    "action": "index",
    "method": "GET",
    "path": "/invoices",
    "status": 200,
    "duration_ms": 45.2,
    "sql_count": 3,
    "sql_total_ms": 12.1,
    "n_plus_one": false,
    "view_count": 2,
    "view_total_ms": 28.3,
    "cache_reads": 1,
    "cache_hits": 1,
    "cache_hit_ratio": 1.0,
    "timeline": [{"t": "sql", "n": "Invoice Load", "ms": 8.2, "at": 2.0}]
  }
}

You can also pass request_summary: manually:

OpenTrace.log("INFO", "Custom request", { user_id: 42 },
  request_summary: { controller: "Custom", action: "run", sql_count: 5 })

Backward compatibility: Old servers ignore the request_summary field. When no collector is active (background jobs, non-Rails), data falls back to metadata as before.

Log Payload Format

Each log is sent as a JSON object to POST /api/logs:

{
  "timestamp": "2026-02-08T12:41:00.000000Z",
  "level": "ERROR",
  "service": "billing-api",
  "environment": "production",
  "trace_id": "abc-123",
  "message": "PG::UniqueViolation",
  "metadata": {
    "user_id": 42,
    "request_id": "req-456",
    "hostname": "web-01",
    "pid": 12345,
    "git_sha": "a1b2c3d"
  },
  "request_summary": {
    "controller": "InvoicesController",
    "action": "index",
    "method": "GET",
    "path": "/invoices",
    "status": 200,
    "duration_ms": 45.2,
    "sql_count": 3,
    "sql_total_ms": 12.1,
    "view_count": 2,
    "view_total_ms": 28.3,
    "timeline": [...]
  }
}
Field            Type               Required
timestamp        string (ISO 8601)  yes
level            string             yes
message          string             yes
service          string             no
environment      string             no
trace_id         string             no
span_id          string             no
parent_span_id   string             no
event_type       string             no
metadata         object             no
request_summary  object             no

The server accepts a single JSON object or an array of objects.

Requirements

  • Ruby >= 3.2 (uses Fiber[] for fiber-local storage)
  • Rails >= 6 (optional, auto-detected)

License

MIT
