
Snow 893080 feat bulk save base model #673

Merged
sfc-gh-mraba merged 1 commit into main from SNOW-893080-feat-bulk-save-base-model
Apr 20, 2026


@sfc-gh-mraba sfc-gh-mraba commented Apr 15, 2026

Collaborator

Please answer these questions before submitting your pull requests. Thanks!

  1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-893080: session.bulk_save_objects does not put all objects in one INSERT #441

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
    • I am adding new logging messages
    • I am adding new credentials
    • I am adding a new dependency
  3. Please describe how your code solves the related issue.

Fixes #441.

session.bulk_save_objects() can emit O(N) INSERT statements when instances of an ORM model populate different subsets of their nullable columns, because SQLAlchemy groups rows by their set of non-None parameter keys and rows with different key sets cannot share one INSERT.
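
To make the grouping effect concrete, here is a minimal pure-Python sketch. It assumes a simplified model in which consecutive rows with the same key set share one INSERT; `count_insert_batches` is a hypothetical helper, not part of SQLAlchemy or this PR.

```python
from itertools import groupby


def count_insert_batches(rows):
    """Count batches when consecutive rows with equal key sets share one INSERT."""
    return sum(1 for _ in groupby(rows, key=lambda row: frozenset(row)))


# Rows as they look when None-valued columns are simply absent.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b", "note": "x"},
    {"id": 3, "name": "c"},
    {"id": 4, "name": "d", "note": "y"},
]

# The same rows with "note" always present (None when not supplied).
uniform = [{"note": None, **row} for row in rows]

print(count_insert_batches(rows))     # 4
print(count_insert_batches(uniform))  # 1
```

With alternating key sets every row becomes its own batch; once every row carries the same keys, a single batch suffices.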

How

Two new components in src/snowflake/sqlalchemy/orm.py:

  • SnowflakeBase / snowflake_declarative_base() — declarative base whose __init__ pre-populates every plain-nullable column with None (or its scalar default) so all instances share the same key set. Primary keys, server_default, callable, and SQL-expression defaults are excluded, mirroring SQLAlchemy's own _insert_cols_as_none logic.
  • SnowflakeSession — Session subclass that passes render_nulls=True to the internal bulk-save call, preventing pre-populated Nones from being stripped before grouping.
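
The pre-population rules listed above can be sketched without SQLAlchemy. This is an illustration only: `ColumnSpec` and `prepopulate` are hypothetical stand-ins, not names from orm.py, and leaving non-nullable columns without a default absent is an assumption.

```python
from dataclasses import dataclass
from typing import Any

_MISSING = object()  # sentinel meaning "no Python-side default configured"


@dataclass
class ColumnSpec:
    """Hypothetical, simplified stand-in for a mapped column."""

    name: str
    nullable: bool = True
    primary_key: bool = False
    server_default: bool = False          # True if the database supplies the value
    sql_expression_default: bool = False  # True if the default is a SQL expression
    default: Any = _MISSING               # scalar value or Python callable


def prepopulate(columns, supplied):
    """Return the attribute dict an instance would start with."""
    unknown = set(supplied) - {col.name for col in columns}
    if unknown:
        raise TypeError(f"invalid keyword arguments: {sorted(unknown)}")

    state = {}
    for col in columns:
        if col.name in supplied:
            state[col.name] = supplied[col.name]  # caller's value always wins
        elif col.primary_key or col.server_default or col.sql_expression_default:
            continue  # excluded: left absent on purpose
        elif callable(col.default):
            continue  # excluded: callable defaults are not evaluated here
        elif col.default is not _MISSING:
            state[col.name] = col.default  # scalar default
        elif col.nullable:
            state[col.name] = None  # plain nullable column
    return state


columns = [
    ColumnSpec("id", nullable=False, primary_key=True),
    ColumnSpec("name", nullable=False),
    ColumnSpec("note"),
    ColumnSpec("status", default="new"),
    ColumnSpec("created_at", server_default=True),
    ColumnSpec("token", default=lambda: "generated"),
]

first = prepopulate(columns, {"name": "a"})
second = prepopulate(columns, {"name": "b", "note": "x", "status": "done"})
print(first)                          # {'name': 'a', 'note': None, 'status': 'new'}
print(first.keys() == second.keys())  # True
```

Both instances end up with the same key set even though the caller supplied different keyword arguments, which is the invariant the real constructor is meant to guarantee.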

Both parts are required. SnowflakeBase is SA 2.x only; snowflake_declarative_base() works on SA 1.4 and 2.x. All three names are exported from snowflake.sqlalchemy.
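
Why both parts are needed can be shown with a small simulation. It assumes a simplified model of parameter collection in which None values are dropped unless `render_nulls` is set; `to_params` and `distinct_key_sets` are hypothetical helpers, not SQLAlchemy functions.

```python
def to_params(state, render_nulls):
    """Build the INSERT parameter dict for one object's state."""
    if render_nulls:
        return dict(state)
    # Without render_nulls, None values are stripped before grouping.
    return {key: value for key, value in state.items() if value is not None}


def distinct_key_sets(states, render_nulls):
    """Number of different parameter-key sets, i.e. a lower bound on batches."""
    return len({frozenset(to_params(state, render_nulls)) for state in states})


# States produced by a plain constructor: unset columns are absent.
sparse = [{"name": "a"}, {"name": "b", "note": "x"}]
# States produced by a pre-populating constructor: unset columns are None.
prepopulated = [{"name": "a", "note": None}, {"name": "b", "note": "x"}]

print(distinct_key_sets(sparse, render_nulls=False))        # 2
print(distinct_key_sets(sparse, render_nulls=True))         # 2
print(distinct_key_sets(prepopulated, render_nulls=False))  # 2
print(distinct_key_sets(prepopulated, render_nulls=True))   # 1
```

Only the combination of pre-populated state and `render_nulls=True` collapses the rows into a single key set.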

Tests

Unit tests in tests/test_unit_orm.py (no database required) cover the column pre-population rules, key-set uniformity, the render_nulls=True call, and the public exports.
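
A database-free check of the render_nulls forwarding could look roughly like the sketch below. It is not the test from tests/test_unit_orm.py: both session classes are hypothetical stand-ins with a deliberately simplified `_bulk_save_mappings` signature, used only to show the mocking pattern.

```python
from unittest import mock


class BaseSessionStandIn:
    """Hypothetical stand-in for a session; the real one would hit the database."""

    def _bulk_save_mappings(self, objects, render_nulls=False):
        raise RuntimeError("would touch the database")

    def bulk_save_objects(self, objects):
        self._bulk_save_mappings(list(objects), render_nulls=False)


class NullRenderingSession(BaseSessionStandIn):
    """Stand-in for a subclass that forces render_nulls=True."""

    def bulk_save_objects(self, objects):
        self._bulk_save_mappings(list(objects), render_nulls=True)


def check_render_nulls_forwarded():
    """Return the render_nulls value the subclass passes to the internal call."""
    session = NullRenderingSession()
    # Replace the internal method with a mock so no database is needed.
    with mock.patch.object(session, "_bulk_save_mappings") as bulk:
        session.bulk_save_objects([object(), object()])
    bulk.assert_called_once()
    return bulk.call_args.kwargs["render_nulls"]


print(check_render_nulls_forwarded())  # True
```

Patching the internal method on the instance keeps the test independent of any connection or engine setup.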

@sfc-gh-mraba sfc-gh-mraba self-assigned this Apr 16, 2026
…atching

Introduce src/snowflake/sqlalchemy/orm.py with three public utilities and
one private helper:

- _snowflake_constructor: custom ORM __init__ that pre-populates every
  plain-nullable mapped column with None (or its scalar default) at
  construction time, so every instance always has the same state_dict
  key set regardless of which kwargs the caller supplied.

  Mirrors SA's mapper._insert_cols_as_none exclusion logic: primary keys,
  server_default columns, callable defaults, SQL-expression defaults, and
  should_evaluate_none columns are intentionally left absent.

- snowflake_declarative_base(): function-based factory (works in both
  SA 1.4 and SA 2.x) that installs _snowflake_constructor via
  declarative_base(constructor=...).

- SnowflakeBase: SA 2.x DeclarativeBase subclass whose __init__ delegates
  to _snowflake_constructor. Guarded behind IS_VERSION_20 so the module
  loads correctly under SA 1.4.

- SnowflakeSession: Session subclass that overrides bulk_save_objects to
  call _bulk_save_mappings with render_nulls=True. Without this flag,
  SA strips None values from the parameter dict before grouping rows,
  so constructor-pre-populated Nones would still be stripped and objects
  with col=None vs col='value' would land in different INSERT batches.

Both parts are required: the base class normalises the param-key set;
the session override prevents those Nones from being stripped.

All three names are exported from snowflake/sqlalchemy/__init__.py.

Unit tests in tests/test_unit_orm.py verify (no DB required):
- plain nullable column pre-populated with None
- primary key NOT pre-populated
- server_default column NOT pre-populated
- scalar-default column pre-populated with the scalar value
- callable-default column NOT pre-populated
- SQL-expression-default column NOT pre-populated
- user-supplied values preserved and override defaults
- invalid kwargs raise TypeError
- all objects produce identical state_dict key sets (core invariant)
- SnowflakeSession calls _bulk_save_mappings with render_nulls=True
- public exports available from snowflake.sqlalchemy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

SNOW-893080: [N] refine: fix __all__ export, add missing tests, apply lint fixes

- Add SnowflakeBase to __all__ (conditionally on SA 2.x) so it is
  exposed via wildcard imports
- Add should_evaluate_none (JSON) column to test model and verify it
  is NOT pre-populated by _snowflake_constructor
- Add test for user-supplied value on server_default column
- Add end-to-end test combining SnowflakeBase + SnowflakeSession to
  verify uniform parameter-key sets and single _bulk_save_mappings call
- Add empty-list edge case test for bulk_save_objects
- Add __all__ membership tests for all three public exports
- Apply pyupgrade/black/isort formatting fixes from pre-commit hooks

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

[D] SNOW-893080: add README docs for SnowflakeBase and SnowflakeSession

Add a new "Bulk Insert Optimization for ORM Models" section to README.md
documenting SnowflakeBase, snowflake_declarative_base(), and SnowflakeSession.
Covers when/why to use these utilities, SA 1.4 vs 2.x differences, the pairing
requirement between the base class and SnowflakeSession, and full code examples
for both SA versions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@sfc-gh-mraba sfc-gh-mraba force-pushed the SNOW-893080-feat-bulk-save-base-model branch from a4f0afe to a7741e5 April 16, 2026 11:12
@sfc-gh-mraba sfc-gh-mraba marked this pull request as ready for review April 16, 2026 13:24
@sfc-gh-mraba sfc-gh-mraba requested a review from a team as a code owner April 16, 2026 13:24
@sfc-gh-mraba sfc-gh-mraba merged commit d5bf05a into main Apr 20, 2026
63 checks passed
@sfc-gh-mraba sfc-gh-mraba deleted the SNOW-893080-feat-bulk-save-base-model branch April 20, 2026 07:04
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 20, 2026


Development

Successfully merging this pull request may close these issues.

SNOW-893080: session.bulk_save_objects does not put all objects in one INSERT

2 participants