@rafel-roboflow rafel-roboflow commented Jan 8, 2026

Resolves DG-1

  • Introduced a new visualization block for displaying customizable text on images.
  • Added utility functions for text layout and drawing.

Description

Please include a summary of the change and which issue is fixed or implemented. Please also include relevant motivation and context (e.g. links, docs, tickets etc.).

List any dependencies that are required for this change.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How has this change been tested, please provide a testcase or example of how you tested the change?

YOUR_ANSWER

Any specific deployment considerations

For example, documentation changes, usability, usage/costs, secrets, etc.

Docs

  • Docs updated? What were the changes:

Note

Introduces a new roboflow_core/text_display@v1 visualization block for rendering text onto images with parameter interpolation, styling, and flexible positioning.

  • Adds text_display/v1.py implementing TextDisplayVisualizationBlockV1 with templated text (using {{ $parameters.* }}), optional parameter operations, and options for text_color, background_color (including transparency), background_opacity, font_scale, font_thickness, padding, text_align, border_radius, and positioning via position_mode (absolute/relative with anchor, offset_x, offset_y)
  • Adds text_display/utils.py with layout and drawing utilities: compute_layout, draw_text_lines, draw_background (alpha/rounded corners), and anchor-based positioning
  • Wires the block into loader.py (imports and inclusion in load_blocks) so it’s available to workflows; outputs updated image via OUTPUT_IMAGE_KEY
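As a rough illustration of the templated-text idea in the summary above, the sketch below substitutes `{{ $parameters.name }}` placeholders from a dictionary. The function name `render_text` and the regular expression are assumptions made for this example; they are not taken from text_display/v1.py, whose actual interpolation logic may differ.

```python
import re
from typing import Any, Dict

# Hypothetical placeholder pattern: {{ $parameters.<name> }}
_PLACEHOLDER = re.compile(r"\{\{\s*\$parameters\.(\w+)\s*\}\}")


def render_text(template: str, parameters: Dict[str, Any]) -> str:
    """Replace each {{ $parameters.name }} with str(parameters[name]).

    Unknown names are left untouched so the problem stays visible.
    """

    def _substitute(match: re.Match) -> str:
        name = match.group(1)
        if name in parameters:
            return str(parameters[name])
        return match.group(0)

    return _PLACEHOLDER.sub(_substitute, template)


print(render_text("Count: {{ $parameters.count }}", {"count": 3}))
```

Leaving unknown placeholders in place is only one possible design; a real block might raise an error or apply the optional parameter operations first.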

Written by Cursor Bugbot for commit 520dcd9. This will update automatically on new commits.


CLAassistant commented Jan 8, 2026

CLA assistant check
All committers have signed the CLA.

@balthazur

bugbot run

Comment on lines +83 to +84
box_x = 0 if box_w > img_w else max(0, min(box_x, img_w - box_w))
box_y = 0 if box_h > img_h else max(0, min(box_y, img_h - box_h))

⚡️Codeflash found 180% (1.80x) speedup for clamp_box in inference/core/workflows/core_steps/visualizations/text_display/utils.py

⏱️ Runtime : 1.03 milliseconds → 369 microseconds (best of 195 runs)

📝 Explanation and details

The optimization replaces nested max(0, min(...)) function calls with explicit if-elif chains, yielding a 180% speedup (1.03ms → 369μs).

Key Performance Gains:

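  1. Eliminates redundant comparisons: Whenever the box fits (box_w <= img_w), the original code evaluates max(0, min(box_x, img_w - box_w)) in full, even if box_x is already in range. The optimized version stops at the first matching branch. When box_w > img_w, the original conditional expression already skips the max/min calls, which is why the oversized-box cases show no gain.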

  2. Reduces function call overhead: Python function calls (max, min) carry overhead. The optimized version uses direct comparisons and assignments, which are faster primitive operations.

  3. Simpler control flow: The if-elif chain replaces two nested calls with at most three plain comparisons per axis. Any CPU-level branch-prediction benefit is likely minor in CPython compared with the saved call overhead.
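
To make the comparison concrete, here is a sketch of both variants side by side with a brute-force equivalence check. The signature (box_x, box_y, box_w, box_h, img_w, img_h) returning (box_x, box_y) is inferred from the generated tests in this comment, and the names `clamp_box_original` and `clamp_box_optimized` are made up for this example; the real clamp_box in utils.py may contain more than these lines.

```python
from itertools import product
from typing import Tuple


def clamp_box_original(box_x, box_y, box_w, box_h, img_w, img_h) -> Tuple[int, int]:
    # Conditional expression plus nested max/min, as in the reviewed lines.
    box_x = 0 if box_w > img_w else max(0, min(box_x, img_w - box_w))
    box_y = 0 if box_h > img_h else max(0, min(box_y, img_h - box_h))
    return box_x, box_y


def clamp_box_optimized(box_x, box_y, box_w, box_h, img_w, img_h) -> Tuple[int, int]:
    # Same result using plain comparisons; each chain stops at the first match.
    if box_w > img_w:
        box_x = 0
    elif box_x < 0:
        box_x = 0
    elif box_x > img_w - box_w:
        box_x = img_w - box_w
    if box_h > img_h:
        box_y = 0
    elif box_y < 0:
        box_y = 0
    elif box_y > img_h - box_h:
        box_y = img_h - box_h
    return box_x, box_y


# Exhaustively compare both variants on a small grid of integer inputs,
# including negative positions, zero sizes and oversized boxes.
values = (-5, 0, 3, 10, 12)
checked = 0
for args in product(values, repeat=6):
    assert clamp_box_original(*args) == clamp_box_optimized(*args), args
    checked += 1
print("variants agree on", checked, "inputs")
```

The grid is small on purpose; it exercises every branch but is no substitute for the 1210 generated regression tests reported below.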

Test Case Performance:

  • Best speedups (150-200%): Cases where boxes fit within bounds or need simple clamping (most common scenarios)
  • Slight regressions (1-10% slower): Cases where box_w > img_w AND box_h > img_h (rare edge case requiring both early exits)
  • Stress tests: Show consistent 160-220% improvements, indicating the optimization scales well

Impact on Workloads:
The function is called from compute_layout() during text overlay rendering, a common operation in computer vision pipelines. Since text bounding boxes typically fit within image bounds (the common case), this optimization benefits the usual path. The 180% speedup means clamp_box itself runs about 2.8x as fast; the effect on frames or annotations per second depends on how much of the per-frame time is spent in this function, which this report does not measure.
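
The arithmetic behind these figures can be checked with a short sketch. The runtimes come from the report above; `speedup_percent`, `overall_speedup`, and especially `assumed_fraction` are hypothetical names and values introduced only for illustration, since the PR does not measure what share of a frame is spent in clamp_box.

```python
def speedup_percent(old_runtime: float, new_runtime: float) -> float:
    """Percent speedup in the sense used above: (old / new - 1) * 100."""
    return (old_runtime / new_runtime - 1.0) * 100.0


def overall_speedup(fraction: float, local_ratio: float) -> float:
    """Amdahl's law: whole-pipeline ratio when only `fraction` of the time is sped up."""
    return 1.0 / ((1.0 - fraction) + fraction / local_ratio)


old_us, new_us = 1030.0, 369.0  # 1.03 ms and 369 microseconds, from the report
ratio = old_us / new_us
print(f"clamp_box alone: {ratio:.2f}x as fast, {speedup_percent(old_us, new_us):.0f}% speedup")

# Hypothetical share of per-frame time spent in clamp_box; not measured in this PR.
assumed_fraction = 0.001
print(f"whole pipeline: {overall_speedup(assumed_fraction, ratio):.4f}x")
```

With a small assumed fraction the pipeline-level ratio stays close to 1, which is why a large per-function speedup does not by itself imply a comparable gain in frames per second.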

Correctness verification report:

Test: Status
⏪ Replay Tests: 🔘 None Found
⚙️ Existing Unit Tests: 🔘 None Found
🔎 Concolic Coverage Tests: 🔘 None Found
🌀 Generated Regression Tests: 1210 Passed
📊 Tests Coverage: 100.0%

Generated regression tests:
import pytest  # used for our unit tests
from inference.core.workflows.core_steps.visualizations.text_display.utils import (
    clamp_box,
)

# unit tests

# -------------------------
# Basic Test Cases
# -------------------------


def test_box_fits_within_image_top_left_corner():
    # Box fits entirely within image, positioned at top-left
    codeflash_output = clamp_box(
        0, 0, 10, 10, 100, 100
    )  # 1.81μs -> 743ns (143% faster)


def test_box_fits_within_image_center():
    # Box fits entirely within image, positioned at center
    codeflash_output = clamp_box(
        45, 45, 10, 10, 100, 100
    )  # 1.76μs -> 678ns (159% faster)


def test_box_near_right_edge():
    # Box near right edge, should clamp to img_w - box_w
    codeflash_output = clamp_box(
        95, 10, 10, 10, 100, 100
    )  # 1.77μs -> 853ns (107% faster)


def test_box_near_bottom_edge():
    # Box near bottom edge, should clamp to img_h - box_h
    codeflash_output = clamp_box(
        10, 95, 10, 10, 100, 100
    )  # 1.76μs -> 803ns (119% faster)


def test_box_exactly_fills_image():
    # Box exactly fills image, should be placed at (0, 0)
    codeflash_output = clamp_box(
        0, 0, 100, 100, 100, 100
    )  # 1.62μs -> 649ns (150% faster)


# -------------------------
# Edge Test Cases
# -------------------------


def test_box_width_larger_than_image():
    # Box width larger than image width, should clamp to x=0
    codeflash_output = clamp_box(
        50, 10, 120, 10, 100, 100
    )  # 1.42μs -> 669ns (113% faster)


def test_box_height_larger_than_image():
    # Box height larger than image height, should clamp to y=0
    codeflash_output = clamp_box(
        10, 50, 10, 120, 100, 100
    )  # 1.40μs -> 689ns (103% faster)


def test_box_width_and_height_larger_than_image():
    # Both box width and height larger than image, should clamp to (0, 0)
    codeflash_output = clamp_box(
        50, 50, 120, 120, 100, 100
    )  # 608ns -> 635ns (4.25% slower)


def test_box_negative_position_clamped_to_zero():
    # Negative box position, should clamp to (0, 0)
    codeflash_output = clamp_box(
        -10, -10, 10, 10, 100, 100
    )  # 2.04μs -> 738ns (176% faster)


def test_box_position_exceeds_image_bounds():
    # Box position outside image, should clamp to max allowed position
    codeflash_output = clamp_box(
        200, 200, 10, 10, 100, 100
    )  # 1.77μs -> 844ns (109% faster)


def test_box_zero_width_and_height():
    # Box with zero width/height already fits, so the position should stay at (10, 10)
    codeflash_output = clamp_box(
        10, 10, 0, 0, 100, 100
    )  # 1.73μs -> 674ns (157% faster)


def test_box_width_equal_to_image_width():
    # Box width equals image width, should clamp to x=0
    codeflash_output = clamp_box(
        50, 10, 100, 10, 100, 100
    )  # 1.75μs -> 779ns (124% faster)


def test_box_height_equal_to_image_height():
    # Box height equals image height, should clamp to y=0
    codeflash_output = clamp_box(
        10, 50, 10, 100, 100, 100
    )  # 1.74μs -> 744ns (134% faster)


def test_box_at_maximum_possible_position():
    # Box at maximum possible position
    codeflash_output = clamp_box(
        90, 90, 10, 10, 100, 100
    )  # 1.65μs -> 621ns (165% faster)


def test_box_minimum_size_at_maximum_position():
    # Box of size 1 at maximum position
    codeflash_output = clamp_box(
        99, 99, 1, 1, 100, 100
    )  # 1.62μs -> 632ns (156% faster)


def test_box_position_and_size_zero():
    # All values zero, should clamp to (0, 0)
    codeflash_output = clamp_box(0, 0, 0, 0, 0, 0)  # 1.60μs -> 628ns (155% faster)


def test_box_size_zero_with_nonzero_image():
    # Box size zero, image size nonzero, position arbitrary
    codeflash_output = clamp_box(
        50, 50, 0, 0, 100, 100
    )  # 1.66μs -> 622ns (167% faster)


def test_box_size_equals_image_size_and_position_nonzero():
    # Box size equals image size, position nonzero, should clamp to (0, 0)
    codeflash_output = clamp_box(
        10, 10, 100, 100, 100, 100
    )  # 1.63μs -> 781ns (109% faster)


def test_box_size_just_one_less_than_image():
    # Box size just one less than image, position at edge
    codeflash_output = clamp_box(
        99, 99, 99, 99, 100, 100
    )  # 1.64μs -> 730ns (125% faster)


def test_box_size_one_with_large_image():
    # Box size one, image size large, position at edge
    codeflash_output = clamp_box(
        999, 999, 1, 1, 1000, 1000
    )  # 2.08μs -> 913ns (128% faster)


def test_box_negative_size():
    # Negative box size is not treated as oversized; position should stay at (10, 10)
    codeflash_output = clamp_box(
        10, 10, -10, -10, 100, 100
    )  # 1.87μs -> 788ns (138% faster)


def test_image_size_zero():
    # Image size zero, box size nonzero, should clamp to (0, 0)
    codeflash_output = clamp_box(10, 10, 10, 10, 0, 0)  # 636ns -> 696ns (8.62% slower)


def test_box_size_larger_than_zero_image():
    # Box size larger than zero image, should clamp to (0, 0)
    codeflash_output = clamp_box(10, 10, 10, 10, 0, 0)  # 628ns -> 649ns (3.24% slower)


# -------------------------
# Large Scale Test Cases
# -------------------------


def test_large_box_and_image():
    # Large box and image, box fits inside
    codeflash_output = clamp_box(
        500, 500, 100, 100, 1000, 1000
    )  # 2.12μs -> 872ns (143% faster)


def test_large_box_exceeds_image_bounds():
    # Large box, position exceeds image bounds, should clamp to max allowed
    codeflash_output = clamp_box(
        950, 950, 100, 100, 1000, 1000
    )  # 1.95μs -> 951ns (105% faster)


def test_large_box_larger_than_image():
    # Box larger than image, should clamp to (0, 0)
    codeflash_output = clamp_box(
        500, 500, 2000, 2000, 1000, 1000
    )  # 691ns -> 702ns (1.57% slower)


def test_large_box_zero_size():
    # Large image, box size zero, position arbitrary
    codeflash_output = clamp_box(
        999, 999, 0, 0, 1000, 1000
    )  # 1.96μs -> 797ns (146% faster)


def test_large_box_negative_position():
    # Large image, negative box position, should clamp to (0, 0)
    codeflash_output = clamp_box(
        -100, -100, 100, 100, 1000, 1000
    )  # 1.94μs -> 740ns (162% faster)


def test_large_box_near_right_bottom_edge():
    # Large image, box near right and bottom edge
    codeflash_output = clamp_box(
        995, 995, 10, 10, 1000, 1000
    )  # 1.86μs -> 915ns (103% faster)


def test_large_box_width_equal_to_image():
    # Large image, box width equal to image width, should clamp to x=0
    codeflash_output = clamp_box(
        500, 500, 1000, 10, 1000, 1000
    )  # 2.14μs -> 911ns (135% faster)


def test_large_box_height_equal_to_image():
    # Large image, box height equal to image height, should clamp to y=0
    codeflash_output = clamp_box(
        500, 500, 10, 1000, 1000, 1000
    )  # 2.13μs -> 879ns (143% faster)


def test_many_boxes_in_bounds():
    # Test many boxes within bounds to check performance and correctness
    for i in range(0, 1000, 100):
        codeflash_output = clamp_box(
            i, i, 10, 10, 1000, 1000
        )  # 10.3μs -> 3.63μs (184% faster)


def test_many_boxes_out_of_bounds():
    # Test many boxes out of bounds to check clamping
    for i in range(900, 1100, 10):
        codeflash_output = clamp_box(
            i, i, 100, 100, 1000, 1000
        )  # 18.2μs -> 6.99μs (160% faster)


def test_many_boxes_larger_than_image():
    # Test many boxes larger than image
    for w in range(1001, 1100, 10):
        for h in range(1001, 1100, 10):
            codeflash_output = clamp_box(50, 50, w, h, 1000, 1000)


def test_many_boxes_zero_size():
    # Test many boxes with zero size
    for i in range(0, 1000, 100):
        codeflash_output = clamp_box(
            i, i, 0, 0, 1000, 1000
        )  # 10.2μs -> 3.52μs (188% faster)


# -------------------------
# Mutation-sensitive cases
# -------------------------


def test_mutation_sensitive_x_clamping():
    # If the x clamping logic is changed, this should fail
    codeflash_output = clamp_box(
        999, 10, 10, 10, 1000, 1000
    )  # 1.90μs -> 887ns (114% faster)


def test_mutation_sensitive_y_clamping():
    # If the y clamping logic is changed, this should fail
    codeflash_output = clamp_box(
        10, 999, 10, 10, 1000, 1000
    )  # 1.96μs -> 884ns (121% faster)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest  # used for our unit tests
from inference.core.workflows.core_steps.visualizations.text_display.utils import (
    clamp_box,
)

# function to test
# (already imported above)

# unit tests


class TestClampBoxBasicCases:
    """Test basic functionality under normal conditions."""

    def test_box_at_origin(self):
        """Test box positioned at (0, 0) - should remain unchanged."""
        # Box at origin within a 100x100 image
        box_x, box_y = clamp_box(
            0, 0, 50, 50, 100, 100
        )  # 1.93μs -> 738ns (162% faster)

    def test_box_in_middle(self):
        """Test box positioned in the middle of the image - should remain unchanged."""
        # 20x20 box at position (40, 40) in a 100x100 image
        box_x, box_y = clamp_box(
            40, 40, 20, 20, 100, 100
        )  # 1.86μs -> 683ns (172% faster)

    def test_box_at_maximum_valid_position(self):
        """Test box at the rightmost and bottommost valid position."""
        # 30x30 box at position (70, 70) in a 100x100 image (box ends at 100, 100)
        box_x, box_y = clamp_box(
            70, 70, 30, 30, 100, 100
        )  # 1.78μs -> 652ns (173% faster)

    def test_box_at_right_edge(self):
        """Test box positioned exactly at the right edge."""
        # 50x20 box at position (50, 10) in a 100x100 image
        box_x, box_y = clamp_box(
            50, 10, 50, 20, 100, 100
        )  # 1.81μs -> 675ns (169% faster)

    def test_box_at_bottom_edge(self):
        """Test box positioned exactly at the bottom edge."""
        # 20x50 box at position (10, 50) in a 100x100 image
        box_x, box_y = clamp_box(
            10, 50, 20, 50, 100, 100
        )  # 1.75μs -> 640ns (174% faster)


class TestClampBoxOutOfBoundsCases:
    """Test behavior when box position is out of bounds."""

    def test_negative_x_position(self):
        """Test box with negative x coordinate - should clamp to 0."""
        # 30x30 box at position (-10, 20) in a 100x100 image
        box_x, box_y = clamp_box(
            -10, 20, 30, 30, 100, 100
        )  # 2.10μs -> 828ns (154% faster)

    def test_negative_y_position(self):
        """Test box with negative y coordinate - should clamp to 0."""
        # 30x30 box at position (20, -10) in a 100x100 image
        box_x, box_y = clamp_box(
            20, -10, 30, 30, 100, 100
        )  # 1.98μs -> 759ns (161% faster)

    def test_both_negative_positions(self):
        """Test box with both negative coordinates - should clamp both to 0."""
        # 30x30 box at position (-5, -15) in a 100x100 image
        box_x, box_y = clamp_box(
            -5, -15, 30, 30, 100, 100
        )  # 1.93μs -> 801ns (141% faster)

    def test_x_beyond_right_edge(self):
        """Test box positioned beyond the right edge - should clamp x."""
        # 30x30 box at position (80, 20) in a 100x100 image (would end at 110)
        box_x, box_y = clamp_box(
            80, 20, 30, 30, 100, 100
        )  # 1.74μs -> 807ns (116% faster)

    def test_y_beyond_bottom_edge(self):
        """Test box positioned beyond the bottom edge - should clamp y."""
        # 30x30 box at position (20, 80) in a 100x100 image (would end at 110)
        box_x, box_y = clamp_box(
            20, 80, 30, 30, 100, 100
        )  # 1.70μs -> 764ns (122% faster)

    def test_both_beyond_edges(self):
        """Test box positioned beyond both right and bottom edges."""
        # 30x30 box at position (90, 85) in a 100x100 image
        box_x, box_y = clamp_box(
            90, 85, 30, 30, 100, 100
        )  # 1.70μs -> 800ns (112% faster)

    def test_far_beyond_edges(self):
        """Test box positioned very far beyond image bounds."""
        # 20x20 box at position (1000, 2000) in a 100x100 image
        box_x, box_y = clamp_box(
            1000, 2000, 20, 20, 100, 100
        )  # 1.84μs -> 895ns (105% faster)


class TestClampBoxOversizedBoxCases:
    """Test behavior when box is larger than the image."""

    def test_box_wider_than_image(self):
        """Test box wider than image - x should be clamped to 0."""
        # 150x30 box at position (10, 20) in a 100x100 image
        box_x, box_y = clamp_box(
            10, 20, 150, 30, 100, 100
        )  # 1.42μs -> 703ns (101% faster)

    def test_box_taller_than_image(self):
        """Test box taller than image - y should be clamped to 0."""
        # 30x150 box at position (20, 10) in a 100x100 image
        box_x, box_y = clamp_box(
            20, 10, 30, 150, 100, 100
        )  # 1.45μs -> 673ns (115% faster)

    def test_box_larger_in_both_dimensions(self):
        """Test box larger than image in both dimensions - both should be 0."""
        # 200x200 box at position (50, 50) in a 100x100 image
        box_x, box_y = clamp_box(
            50, 50, 200, 200, 100, 100
        )  # 587ns -> 650ns (9.69% slower)

    def test_box_exactly_wider_than_image(self):
        """Test box exactly one pixel wider than image."""
        # 101x50 box at position (5, 10) in a 100x100 image
        box_x, box_y = clamp_box(
            5, 10, 101, 50, 100, 100
        )  # 1.42μs -> 676ns (109% faster)

    def test_box_exactly_taller_than_image(self):
        """Test box exactly one pixel taller than image."""
        # 50x101 box at position (10, 5) in a 100x100 image
        box_x, box_y = clamp_box(
            10, 5, 50, 101, 100, 100
        )  # 1.40μs -> 687ns (104% faster)

    def test_oversized_box_with_negative_position(self):
        """Test oversized box with negative position - should still clamp to 0."""
        # 150x150 box at position (-10, -20) in a 100x100 image
        box_x, box_y = clamp_box(
            -10, -20, 150, 150, 100, 100
        )  # 607ns -> 616ns (1.46% slower)


class TestClampBoxEdgeCases:
    """Test edge cases and boundary conditions."""

    def test_box_same_size_as_image(self):
        """Test box exactly the same size as image - only valid position is (0, 0)."""
        # 100x100 box in a 100x100 image
        box_x, box_y = clamp_box(
            0, 0, 100, 100, 100, 100
        )  # 1.76μs -> 672ns (162% faster)

    def test_box_same_size_as_image_with_offset(self):
        """Test box same size as image but with non-zero position - should clamp to 0."""
        # 100x100 box at position (10, 20) in a 100x100 image
        box_x, box_y = clamp_box(
            10, 20, 100, 100, 100, 100
        )  # 1.77μs -> 789ns (124% faster)

    def test_zero_width_box(self):
        """Test box with zero width."""
        # 0x50 box at position (50, 25) in a 100x100 image
        box_x, box_y = clamp_box(
            50, 25, 0, 50, 100, 100
        )  # 1.71μs -> 641ns (166% faster)

    def test_zero_height_box(self):
        """Test box with zero height."""
        # 50x0 box at position (25, 50) in a 100x100 image
        box_x, box_y = clamp_box(
            25, 50, 50, 0, 100, 100
        )  # 1.64μs -> 629ns (160% faster)

    def test_zero_size_box(self):
        """Test box with zero width and height."""
        # 0x0 box at position (30, 40) in a 100x100 image
        box_x, box_y = clamp_box(
            30, 40, 0, 0, 100, 100
        )  # 1.67μs -> 617ns (171% faster)

    def test_zero_width_image(self):
        """Test with zero width image."""
        # 50x50 box at position (10, 10) in a 0x100 image
        box_x, box_y = clamp_box(
            10, 10, 50, 50, 0, 100
        )  # 1.46μs -> 694ns (110% faster)

    def test_zero_height_image(self):
        """Test with zero height image."""
        # 50x50 box at position (10, 10) in a 100x0 image
        box_x, box_y = clamp_box(
            10, 10, 50, 50, 100, 0
        )  # 1.44μs -> 693ns (108% faster)

    def test_zero_size_image(self):
        """Test with zero size image."""
        # 50x50 box at position (10, 10) in a 0x0 image
        box_x, box_y = clamp_box(10, 10, 50, 50, 0, 0)  # 627ns -> 607ns (3.29% faster)

    def test_one_pixel_box(self):
        """Test 1x1 pixel box."""
        # 1x1 box at position (50, 60) in a 100x100 image
        box_x, box_y = clamp_box(
            50, 60, 1, 1, 100, 100
        )  # 1.76μs -> 660ns (166% faster)

    def test_one_pixel_box_at_edge(self):
        """Test 1x1 pixel box at maximum position."""
        # 1x1 box at position (99, 99) in a 100x100 image
        box_x, box_y = clamp_box(
            99, 99, 1, 1, 100, 100
        )  # 1.77μs -> 619ns (186% faster)

    def test_one_pixel_box_beyond_edge(self):
        """Test 1x1 pixel box beyond edge."""
        # 1x1 box at position (100, 100) in a 100x100 image
        box_x, box_y = clamp_box(
            100, 100, 1, 1, 100, 100
        )  # 1.68μs -> 779ns (115% faster)

    def test_one_pixel_image(self):
        """Test with 1x1 pixel image."""
        # 1x1 box at position (0, 0) in a 1x1 image
        box_x, box_y = clamp_box(0, 0, 1, 1, 1, 1)  # 1.67μs -> 637ns (162% faster)


class TestClampBoxLargeScaleCases:
    """Test performance and scalability with large data samples."""

    def test_very_large_image_dimensions(self):
        """Test with very large image dimensions (4K resolution)."""
        # 100x100 box at position (2000, 1500) in a 3840x2160 image
        box_x, box_y = clamp_box(
            2000, 1500, 100, 100, 3840, 2160
        )  # 2.12μs -> 900ns (135% faster)

    def test_very_large_image_with_clamping(self):
        """Test clamping with very large image dimensions."""
        # 200x200 box at position (10000, 5000) in a 3840x2160 image
        box_x, box_y = clamp_box(
            10000, 5000, 200, 200, 3840, 2160
        )  # 1.97μs -> 972ns (102% faster)

    def test_8k_resolution_image(self):
        """Test with 8K resolution image."""
        # 500x500 box at position (4000, 2000) in a 7680x4320 image
        box_x, box_y = clamp_box(
            4000, 2000, 500, 500, 7680, 4320
        )  # 1.92μs -> 829ns (132% faster)

    def test_extremely_large_box_on_large_image(self):
        """Test very large box on large image."""
        # 5000x3000 box at position (1000, 500) in a 7680x4320 image
        box_x, box_y = clamp_box(
            1000, 500, 5000, 3000, 7680, 4320
        )  # 1.82μs -> 726ns (151% faster)

    def test_oversized_box_on_large_image(self):
        """Test oversized box on large image."""
        # 10000x5000 box at position (100, 200) in a 7680x4320 image
        box_x, box_y = clamp_box(
            100, 200, 10000, 5000, 7680, 4320
        )  # 684ns -> 694ns (1.44% slower)

    def test_multiple_clamp_operations(self):
        """Test multiple clamping operations to ensure consistency."""
        # Perform 500 clamping operations with various parameters
        for i in range(500):
            box_x, box_y = clamp_box(
                i * 2, i * 3, 50, 50, 1000, 1000
            )  # 427μs -> 140μs (205% faster)

    def test_rapid_clamping_with_varying_positions(self):
        """Test rapid clamping with varying positions."""
        # Test 300 different positions
        for x in range(0, 3000, 10):
            box_x, box_y = clamp_box(
                x, x // 2, 100, 100, 1920, 1080
            )  # 258μs -> 86.8μs (198% faster)

    def test_large_negative_positions(self):
        """Test with very large negative positions."""
        # Box at position (-10000, -5000) in a 1920x1080 image
        box_x, box_y = clamp_box(
            -10000, -5000, 200, 150, 1920, 1080
        )  # 1.86μs -> 726ns (157% faster)

    def test_large_positive_positions(self):
        """Test with very large positive positions."""
        # Box at position (1000000, 500000) in a 1920x1080 image
        box_x, box_y = clamp_box(
            1000000, 500000, 200, 150, 1920, 1080
        )  # 1.73μs -> 841ns (106% faster)

    def test_stress_test_various_box_sizes(self):
        """Stress test with various box sizes."""
        # Test 200 different box sizes
        img_w, img_h = 2000, 2000
        for size in range(10, 2010, 10):
            box_x, box_y = clamp_box(
                100, 100, size, size, img_w, img_h
            )  # 174μs -> 54.7μs (219% faster)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To test or edit this optimization locally, run: git merge codeflash/optimize-pr1895-2026-01-08T17.52.47

Suggested change
- box_x = 0 if box_w > img_w else max(0, min(box_x, img_w - box_w))
- box_y = 0 if box_h > img_h else max(0, min(box_y, img_h - box_h))
+ if box_w > img_w:
+     box_x = 0
+ elif box_x < 0:
+     box_x = 0
+ elif box_x > img_w - box_w:
+     box_x = img_w - box_w
+ if box_h > img_h:
+     box_y = 0
+ elif box_y < 0:
+     box_y = 0
+ elif box_y > img_h - box_h:
+     box_y = img_h - box_h


Comment on lines +286 to +289
blended = cv2.addWeighted(overlay, alpha, roi, 1 - alpha, 0)

# Write blended result back to image
img[y1_clamped:y2_clamped, x1_clamped:x2_clamped] = blended

⚡️Codeflash found 18% (0.18x) speedup for draw_background_with_alpha in inference/core/workflows/core_steps/visualizations/text_display/utils.py

⏱️ Runtime : 3.03 milliseconds → 2.57 milliseconds (best of 250 runs)

📝 Explanation and details

The optimized code achieves an 18% speedup through a single, impactful change in the draw_background_with_alpha function:

Key Optimization: In-Place Alpha Blending

The critical change is replacing:

blended = cv2.addWeighted(overlay, alpha, roi, 1 - alpha, 0)
img[y1_clamped:y2_clamped, x1_clamped:x2_clamped] = blended

with:

cv2.addWeighted(overlay, alpha, roi, 1 - alpha, 0, dst=roi)
img[y1_clamped:y2_clamped, x1_clamped:x2_clamped] = roi

Why this is faster:

  1. Eliminates temporary array allocation: The original code creates a new blended array to store the result. By using the dst=roi parameter, cv2.addWeighted writes directly into the existing roi array, eliminating one memory allocation.

  2. Reduces memory operations: Line profiler shows the cv2.addWeighted call drops from ~1.04ms to ~0.69ms (about 33% less time), and the subsequent assignment operation drops from ~0.42ms to ~0.29ms (about 31% less time).

  3. Better cache locality: Since roi is a view into the original image array, writing directly to it keeps the data in cache rather than creating a separate result buffer.

Performance Impact Analysis

Based on the function_references, draw_background_with_alpha is called from draw_background, which is likely in the rendering path for text display visualizations. The optimization particularly benefits:

  • Large rectangles (e.g., test_large_full_image_rectangle: 133% faster on 500×500 images) - The memory savings compound with larger regions
  • Repeated operations (e.g., test_large_performance_multiple_calls: 4.5% faster over 50 calls) - Reduced GC pressure accumulates over many draws
  • Alpha blending scenarios where background_opacity < 1.0 - All alpha-blended backgrounds benefit from this optimization

The optimization has minimal impact on small rectangles (often <2% change) but provides substantial gains when drawing larger backgrounds, making it valuable for typical visualization workloads where text overlays with semi-transparent backgrounds are common.

Correctness verification report:

Test: Status
⏪ Replay Tests: 🔘 None Found
⚙️ Existing Unit Tests: 🔘 None Found
🔎 Concolic Coverage Tests: 🔘 None Found
🌀 Generated Regression Tests: 176 Passed
📊 Tests Coverage: 93.9%

Generated regression tests:
import cv2
import numpy as np

# imports
import pytest
from inference.core.workflows.core_steps.visualizations.text_display.utils import (
    draw_background_with_alpha,
)


# Helper function to create a blank image
def blank_img(width, height, color=(0, 0, 0)):
    """Create a blank image of the given color (BGR)."""
    arr = np.zeros((height, width, 3), dtype=np.uint8)
    arr[:, :] = color
    return arr


# Helper function to check if two images are equal
def images_equal(img1, img2):
    return np.array_equal(img1, img2)


# =======================
# BASIC TEST CASES
# =======================


def test_basic_full_alpha_rectangle():
    """Test drawing a solid rectangle with alpha=1 (should fully overwrite region)."""
    img = blank_img(10, 10, color=(0, 0, 0))
    draw_background_with_alpha(
        img, (2, 2), (7, 7), (10, 20, 30), alpha=1.0, border_radius=0
    )  # 15.4μs -> 14.8μs (4.10% faster)
    roi = img[2:7, 2:7]


def test_basic_zero_alpha_rectangle():
    """Test drawing with alpha=0 (should not change the image)."""
    img = blank_img(10, 10, color=(50, 60, 70))
    img_before = img.copy()
    draw_background_with_alpha(
        img, (2, 2), (7, 7), (100, 110, 120), alpha=0.0, border_radius=0
    )  # 14.3μs -> 14.2μs (0.253% faster)


def test_basic_half_alpha_rectangle():
    """Test drawing with alpha=0.5 (should blend colors equally)."""
    img = blank_img(10, 10, color=(100, 100, 100))
    draw_background_with_alpha(
        img, (0, 0), (10, 10), (200, 0, 0), alpha=0.5, border_radius=0
    )  # 13.9μs -> 14.1μs (1.38% slower)
    # Center pixel should be average of (100,100,100) and (200,0,0)
    expected = np.array([150, 50, 50], dtype=np.uint8)


def test_basic_rounded_rectangle():
    """Test that drawing with border_radius>0 does not raise and modifies image."""
    img = blank_img(20, 20, color=(0, 0, 0))
    draw_background_with_alpha(
        img, (5, 5), (15, 15), (0, 255, 0), alpha=1.0, border_radius=4
    )  # 27.1μs -> 27.1μs (0.285% faster)


def test_basic_non_square_rectangle():
    """Test drawing a non-square rectangle."""
    img = blank_img(20, 10, color=(0, 0, 0))
    draw_background_with_alpha(
        img, (2, 1), (18, 8), (123, 222, 111), alpha=1.0, border_radius=0
    )  # 13.8μs -> 13.6μs (1.53% faster)
    roi = img[1:8, 2:18]


# =======================
# EDGE TEST CASES
# =======================


def test_edge_rectangle_outside_image():
    """Rectangle completely outside image should not change the image."""
    img = blank_img(10, 10, color=(10, 20, 30))
    img_before = img.copy()
    draw_background_with_alpha(
        img, (20, 20), (30, 30), (255, 0, 0), alpha=1.0, border_radius=0
    )  # 2.59μs -> 2.65μs (2.38% slower)


def test_edge_rectangle_partially_outside_image():
    """Rectangle partially outside image should be clamped to image bounds."""
    img = blank_img(10, 10, color=(0, 0, 0))
    draw_background_with_alpha(
        img, (-5, -5), (5, 5), (255, 255, 255), alpha=1.0, border_radius=0
    )  # 14.8μs -> 14.6μs (1.23% faster)
    # Only top-left 5x5 should be white
    roi = img[0:5, 0:5]


def test_edge_zero_area_rectangle():
    """Rectangle with zero area should not modify the image."""
    img = blank_img(10, 10, color=(1, 2, 3))
    img_before = img.copy()
    draw_background_with_alpha(
        img, (5, 5), (5, 10), (10, 20, 30), alpha=1.0, border_radius=0
    )  # 2.48μs -> 2.71μs (8.21% slower)
    draw_background_with_alpha(
        img, (5, 5), (10, 5), (10, 20, 30), alpha=1.0, border_radius=0
    )  # 1.59μs -> 1.52μs (4.07% faster)


def test_edge_negative_border_radius():
    """Negative border_radius should be treated as 0 (rectangle)."""
    img = blank_img(10, 10, color=(0, 0, 0))
    draw_background_with_alpha(
        img, (2, 2), (8, 8), (50, 100, 150), alpha=1.0, border_radius=-5
    )  # 14.8μs -> 14.7μs (0.993% faster)
    roi = img[2:8, 2:8]


def test_edge_large_border_radius():
    """border_radius larger than half the rect min side should be clamped."""
    img = blank_img(10, 10, color=(0, 0, 0))
    # border_radius=100, but max possible is 3 for a 7x7 rect
    draw_background_with_alpha(
        img, (2, 2), (9, 9), (100, 200, 50), alpha=1.0, border_radius=100
    )  # 26.2μs -> 26.5μs (0.974% slower)


def test_edge_alpha_out_of_bounds():
    """Alpha < 0 should act as 0, alpha > 1 as 1 (cv2.addWeighted saturates pixel values but does not clamp the weights)."""
    img = blank_img(10, 10, color=(10, 10, 10))
    img_copy = img.copy()
    draw_background_with_alpha(
        img, (0, 0), (10, 10), (200, 0, 0), alpha=-0.5, border_radius=0
    )  # 13.8μs -> 13.4μs (2.99% faster)
    draw_background_with_alpha(
        img, (0, 0), (10, 10), (200, 0, 0), alpha=2.0, border_radius=0
    )  # 6.73μs -> 6.54μs (2.89% faster)


def test_edge_single_pixel_rectangle():
    """Test drawing a 1x1 rectangle."""
    img = blank_img(5, 5, color=(0, 0, 0))
    draw_background_with_alpha(
        img, (2, 2), (3, 3), (255, 100, 50), alpha=1.0, border_radius=0
    )  # 13.1μs -> 13.0μs (0.600% faster)


def test_edge_rectangle_touching_image_border():
    """Test rectangle exactly on the image border."""
    img = blank_img(5, 5, color=(0, 0, 0))
    draw_background_with_alpha(
        img, (0, 0), (5, 5), (11, 22, 33), alpha=1.0, border_radius=0
    )  # 13.3μs -> 13.0μs (2.04% faster)


# =======================
# LARGE SCALE TEST CASES
# =======================


def test_large_full_image_rectangle():
    """Draw a rectangle covering the whole image (500x500)."""
    img = blank_img(500, 500, color=(10, 20, 30))
    draw_background_with_alpha(
        img, (0, 0), (500, 500), (100, 150, 200), alpha=0.7, border_radius=0
    )  # 696μs -> 298μs (133% faster)
    # Check a few fixed sample pixels (corners, edges, center) for correct blending
    expected = (0.7 * np.array([100, 150, 200]) + 0.3 * np.array([10, 20, 30])).astype(
        np.uint8
    )
    for y in [0, 250, 499]:
        for x in [0, 250, 499]:
            pass


def test_large_many_small_rectangles():
    """Draw many small rectangles over a large image."""
    img = blank_img(100, 100, color=(0, 0, 0))
    for i in range(0, 100, 10):
        for j in range(0, 100, 10):
            draw_background_with_alpha(
                img, (i, j), (i + 10, j + 10), (i, j, 255), alpha=1.0, border_radius=3
            )
    # Test that the center of each square is colored and not black
    for i in range(5, 100, 10):
        for j in range(5, 100, 10):
            pass


def test_large_performance_multiple_calls():
    """Test many sequential calls do not crash or slow down."""
    img = blank_img(50, 50, color=(0, 0, 0))
    for i in range(50):
        draw_background_with_alpha(
            img,
            (i, 0),
            (min(i + 10, 50), 50),
            (i * 5 % 256, i * 7 % 256, i * 11 % 256),
            alpha=0.3 + 0.01 * i,
            border_radius=i % 6,
        )  # 734μs -> 702μs (4.54% faster)


def test_large_alpha_gradient():
    """Draw rectangles with increasing alpha to form a gradient."""
    img = blank_img(100, 10, color=(0, 0, 0))
    for i in range(10):
        draw_background_with_alpha(
            img,
            (i * 10, 0),
            ((i + 1) * 10, 10),
            (255, 0, 0),
            alpha=i / 9,
            border_radius=0,
        )  # 66.7μs -> 63.7μs (4.59% faster)
    # Middle should be blended
    mid = img[5, 45]


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To test or edit this optimization locally, run: git merge codeflash/optimize-pr1895-2026-01-08T17.56.22

Suggested change
- blended = cv2.addWeighted(overlay, alpha, roi, 1 - alpha, 0)
- # Write blended result back to image
- img[y1_clamped:y2_clamped, x1_clamped:x2_clamped] = blended
+ cv2.addWeighted(overlay, alpha, roi, 1 - alpha, 0, dst=roi)
+ # Write blended result back to image
+ img[y1_clamped:y2_clamped, x1_clamped:x2_clamped] = roi

@cursor cursor bot left a comment

✅ Bugbot reviewed your changes and found no bugs!

rafel-roboflow and others added 2 commits January 9, 2026 11:35
…y/v1.py

Co-authored-by: Grzegorz Klimaszewski <166530809+grzegorz-roboflow@users.noreply.github.com>
@grzegorz-roboflow grzegorz-roboflow merged commit ac7ff18 into main Jan 9, 2026
51 checks passed
@grzegorz-roboflow grzegorz-roboflow deleted the feature-text-display-workflow branch January 9, 2026 11:07

5 participants