Skip to content

Conversation

@lukasmasuch
Copy link
Collaborator

@lukasmasuch lukasmasuch commented Dec 11, 2025

Describe your changes

Fixes a regression where starting with an empty column in data editor leads to the value being returned as list.

GitHub Issue Link (if applicable)

Testing Plan

  • Added unit and e2e tests.

Contribution License Agreement

By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.

@snyk-io
Copy link
Contributor

snyk-io bot commented Dec 11, 2025

Snyk checks have passed. No issues have been found so far.

Status Scanner Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues
Licenses 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

@lukasmasuch lukasmasuch added security-assessment-completed Security assessment has been completed for PR change:bugfix PR contains bug fix implementation impact:users PR changes affect end users labels Dec 11, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Dec 11, 2025

✅ PR preview is ready!

Name Link
📦 Wheel file https://core-previews.s3-us-west-2.amazonaws.com/pr-13309/streamlit-1.52.1-py3-none-any.whl
📦 @streamlit/component-v2-lib Download from artifacts
🕹️ Preview app pr-13309.streamlit.app (☁️ Deploy here if not accessible)

@lukasmasuch lukasmasuch marked this pull request as ready for review December 11, 2025 10:06
Copilot AI review requested due to automatic review settings December 11, 2025 10:06
@lukasmasuch
Copy link
Collaborator Author

@cursor review

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a regression in the data editor where editing cells in empty (None-only) columns incorrectly wrapped scalar values in lists. For example, entering "42" would return [42] instead of 42.

Key Changes:

  • Modified _parse_value() to distinguish between LIST and EMPTY column data kinds, preserving scalar values for empty columns
  • Added comprehensive unit tests for scalar value handling with empty columns
  • Added E2E test to verify the end-to-end behavior with both cell editing and row addition

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
lib/streamlit/elements/widgets/data_editor.py Splits handling of LIST and EMPTY column kinds to preserve scalar values from the frontend
lib/tests/streamlit/elements/data_editor_test.py Adds parameterized test cases for scalar values and dedicated test for empty column editing
e2e_playwright/st_data_editor_config.py Adds test app with empty column data editor configuration
e2e_playwright/st_data_editor_config_test.py Adds regression test for empty column editing behavior
e2e_playwright/shared/dataframe_utils.py Adds edit_cell_value() helper function for E2E tests

@lukasmasuch lukasmasuch added the ai-review If applied to PR or issue will run AI review workflow label Dec 11, 2025
@github-actions github-actions bot removed the ai-review If applied to PR or issue will run AI review workflow label Dec 11, 2025
@github-actions
Copy link
Contributor

Summary

This PR fixes a regression in the data editor where editing cells in columns that start with None values (empty columns) incorrectly wraps the edited value in a list. For example, entering "42" in an empty column would return [42] instead of 42.

The fix is minimal and targeted: it separates the handling of ColumnDataKind.EMPTY from ColumnDataKind.LIST in the _parse_value function, ensuring that scalar values entered in empty columns remain scalars while list values (from multiselect inputs) remain lists.

Changed Files:

  • lib/streamlit/elements/widgets/data_editor.py - Core bug fix
  • lib/tests/streamlit/elements/data_editor_test.py - Unit tests
  • e2e_playwright/st_data_editor_config.py - E2E test app
  • e2e_playwright/st_data_editor_config_test.py - E2E tests
  • e2e_playwright/shared/dataframe_utils.py - New helper utility

Code Quality

Core Fix (data_editor.py)

The fix is well-implemented:

        if column_data_kind == ColumnDataKind.EMPTY:
            # For empty columns, preserve the value type from the frontend.
            # If it's a list (e.g., from multiselect), return as list.
            # If it's a scalar (e.g., from number input), return as scalar.
            return list(value) if is_list_like(value) else value  # ty: ignore

Strengths:

  • Minimal, surgical change - only modifies the exact behavior needed
  • Good inline comments explaining the rationale
  • Preserves backwards compatibility for list column types
  • Uses existing is_list_like utility consistently

New Helper Utility (dataframe_utils.py)

The edit_cell_value helper function is a nice addition:

def edit_cell_value(page: Page, value: str, *, wait_for_run: bool = True) -> None:
    """Edit the currently open cell by filling a value and pressing Enter.

    This helper function fills a value in the currently open cell editor
    and submits it by pressing Enter. The cell must already be open for editing
    (e.g., via click_on_cell with double_click=True).

    Parameters
    ----------
    page : Page
        The Playwright page.
    value : str
        The value to fill in the cell.
    wait_for_run : bool
        Whether to wait for the app to complete a run after submitting.
        Defaults to True.
    """
    cell_overlay = get_open_cell_overlay(page)
    cell_overlay.locator(".gdg-input").fill(value)
    page.keyboard.press("Enter")
    if wait_for_run:
        wait_for_app_run(page)
  • Follows Numpydoc style
  • Uses keyword-only argument for optional wait_for_run parameter
  • Improves test readability and reduces code duplication

Test Coverage

Unit Tests ✅

The unit tests are comprehensive and well-structured:

  1. Parameterized tests for _parse_value cover multiple scalar types (int, str, float, bool) with ColumnDataKind.EMPTY
  2. Integration test test_apply_cell_edits_empty_columns verifies the fix in context with realistic DataFrame operations
  3. Tests properly verify that scalar values remain scalars AND list values remain lists
        # Scalar values should remain scalars, not be wrapped in lists
        assert df.iat[0, 0] == 42
        assert not isinstance(df.iat[0, 0], list)

        assert df.iat[0, 1] == "hello"
        assert not isinstance(df.iat[0, 1], list)

        # List values should remain lists
        assert df.iat[0, 2] == ["a", "b"]
        assert isinstance(df.iat[0, 2], list)

E2E Tests ⚠️

The E2E test has good coverage but there's a potential issue with the assertion approach:

Concern: The test uses exact_match=True with expect_prefixed_markdown:

    expect_prefixed_markdown(app, "Empty column result:", "42", exact_match=True)

However, the app writes str(empty_col_result) which is a DataFrame string representation:

st.write("Empty column result:", str(empty_col_result))

DataFrame string representations include column headers and index, so str(DataFrame) would produce multi-line output like:

   number_col text_col
0          42     None

Other similar tests in the codebase use exact_match=False for DataFrame outputs (see st_dataframe_interactions_test.py lines 540 and 621).

Recommendation: Change exact_match=True to exact_match=False on lines 339, 346, and 364 to match existing patterns for DataFrame verification, or verify that the current assertions actually pass in CI.

Backwards Compatibility

Fully backwards compatible

  • The fix only changes behavior for ColumnDataKind.EMPTY columns
  • Existing behavior for ColumnDataKind.LIST is preserved
  • No public API changes
  • Users who were affected by the bug (values incorrectly wrapped in lists) will now see correct behavior

Security & Risk

No security concerns identified

  • The change is confined to value parsing/conversion logic
  • No new external inputs or attack surfaces
  • No changes to authentication, authorization, or data transmission

Risk assessment: Low risk. The fix is minimal and well-tested. The only risk is if there are edge cases where users relied on the buggy behavior (values being wrapped in lists), but this is unlikely since the previous behavior was clearly a regression.

Recommendations

  1. E2E test assertions: Verify the E2E tests pass in CI. If they fail, change exact_match=True to exact_match=False on lines 339, 346, and 364 in st_data_editor_config_test.py to align with similar DataFrame tests in the codebase.

  2. Minor: Consider adding a test case for None value with ColumnDataKind.EMPTY in the parameterized unit tests (though this is already covered by the existing (None, ColumnDataKind.*, None) pattern).

Verdict

APPROVED: The core bug fix is correct, minimal, and well-tested with comprehensive unit tests. The E2E tests have a potential assertion issue that should be verified in CI, but the overall PR quality is high and the fix addresses a clear regression.


This is an automated AI review. Please verify the feedback and use your judgment.

@lukasmasuch lukasmasuch merged commit a69f286 into develop Dec 11, 2025
43 checks passed
@lukasmasuch lukasmasuch deleted the fix/start-from-none-data-editor-regression branch December 11, 2025 14:17
lukasmasuch added a commit that referenced this pull request Dec 16, 2025
Fixes a regression where starting with an empty column in data editor
leads to the value being returned as list.

- Closes #13305

- Added unit and e2e tests.

---

**Contribution License Agreement**

By submitting this pull request you agree that all contributions to this
project are made under the Apache 2.0 license.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

change:bugfix PR contains bug fix implementation impact:users PR changes affect end users security-assessment-completed Security assessment has been completed for PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Data Editor Number column returns list when starting from None values

3 participants