Closed
Labels
- mvp: Part of the Minimum Viable Product
- priority:high: Must be in the next sprint
- size:m: Medium, 4 to 8 hours
- status:ready: Refined and ready for sprint selection
- type:feature: New functionality
Description
SQLite stores all values as TEXT when they are inserted via `sqlite3_bind_text`. Numeric comparisons like `WHERE age > 30` then silently return wrong results, because the values are compared as strings: `'30' > '9'` is false in string ordering, so a row with age 9 matches the filter while a row with age 100 does not. Column type inference makes the tool behave correctly for numeric data without the user needing to cast explicitly.
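The ordering problem can be reproduced with a few rows. This is a minimal sketch using Python's standard `sqlite3` module rather than the C API the tool uses; the table names `as_text` and `as_int` are made up for illustration, and the TEXT column declaration is an assumption about how the tool creates its table today.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: one holds the values as TEXT, the other as INTEGER.
conn.execute("CREATE TABLE as_text (age TEXT)")
conn.execute("CREATE TABLE as_int (age INTEGER)")

for age in ["9", "30", "100"]:
    conn.execute("INSERT INTO as_text VALUES (?)", (age,))      # bound as text
    conn.execute("INSERT INTO as_int VALUES (?)", (int(age),))  # bound as integer

query = "SELECT age FROM {} WHERE age > 30 ORDER BY rowid"
text_rows = [row[0] for row in conn.execute(query.format("as_text"))]
int_rows = [row[0] for row in conn.execute(query.format("as_int"))]

print(text_rows)  # ['9']  (string ordering: '9' > '30', but '100' < '30')
print(int_rows)   # [100]  (numeric ordering)
```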
Acceptance Criteria
- Integer columns (all values match `[+-]?[0-9]+`) are bound as `sqlite3_bind_int64`
- Float columns (all values match a floating-point pattern) are bound as `sqlite3_bind_double`
- NULL/empty values do not prevent a column from being inferred as numeric
- Columns with mixed types fall back to TEXT
- `SELECT max(price), min(price) FROM t` returns correct numeric results
- Type inference is performed on the first N rows (default: 100) stored in a memory buffer
- After the buffer is consumed and types are determined, remaining rows are streamed and inserted directly
- A `--no-type-inference` flag allows opting out (pure TEXT mode, no buffering)
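The per-column inference rules above could look roughly like the following. This is a sketch in Python, not the tool's implementation: the function name `infer_column_type` is hypothetical, the floating-point pattern is an assumption (the criteria do not spell it out), and so are two choices the issue leaves open, namely that a column mixing integers and floats becomes REAL and that a column with no non-empty values stays TEXT.

```python
import re

# Integer pattern taken from the acceptance criteria.
INT_RE = re.compile(r"[+-]?[0-9]+")
# Assumed floating-point pattern: optional sign, digits with an optional
# fraction (or a bare fraction), optional exponent. It also accepts integers.
FLOAT_RE = re.compile(r"[+-]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][+-]?[0-9]+)?")


def infer_column_type(values):
    """Return 'INTEGER', 'REAL', or 'TEXT' for one column's sampled values."""
    # NULL/empty values are ignored so they cannot block numeric inference.
    present = [v for v in values if v is not None and v != ""]
    if not present:
        return "TEXT"  # assumption: nothing to infer from, so stay TEXT
    if all(INT_RE.fullmatch(v) for v in present):
        return "INTEGER"
    if all(FLOAT_RE.fullmatch(v) for v in present):
        return "REAL"
    return "TEXT"  # mixed types fall back to TEXT
```

For example, `infer_column_type(["1.5", "", "2"])` yields `"REAL"` under these assumptions, while `infer_column_type(["1", "abc"])` yields `"TEXT"`.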
Notes
- Approach (buffer-first, single-pass):
  - Read the first N rows (default 100) into an in-memory buffer while scanning to infer column types.
  - Once types are determined (or N rows are exhausted), create the table with the inferred types.
  - Insert buffered rows first, then continue reading from stdin and inserting directly; no second pass is needed.
  - This avoids seeking stdin, which is not possible for piped input.
- The "two-pass option" mentioned previously is not viable for stdin (which cannot seek). The buffer-first approach achieves the same result without seeking.
- `--no-type-inference` is useful for large inputs where even buffering 100 rows is undesirable, or when the user knows the schema.
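The buffer-first flow described in these notes can be sketched end to end as follows. It is written in Python for brevity and is only an illustration under stated assumptions: the input is taken to be rectangular CSV with a header row, the names `load`, `infer`, and `convert` are hypothetical, the float pattern is assumed, and a value that arrives after the buffer and does not fit the inferred type is bound as text (the issue does not say what should happen in that case).

```python
import csv
import io
import itertools
import re
import sqlite3

INT_RE = re.compile(r"[+-]?[0-9]+")
FLOAT_RE = re.compile(r"[+-]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][+-]?[0-9]+)?")  # assumed


def infer(values):
    """Infer one column's type from the buffered sample, ignoring empties."""
    present = [v for v in values if v != ""]
    if present and all(INT_RE.fullmatch(v) for v in present):
        return "INTEGER"
    if present and all(FLOAT_RE.fullmatch(v) for v in present):
        return "REAL"
    return "TEXT"


def convert(value, sql_type):
    """Pick the value to bind; TEXT columns are passed through untouched."""
    if sql_type == "TEXT":
        return value
    if value == "":
        return None  # empty cell in a numeric column is bound as NULL
    try:
        return int(value) if sql_type == "INTEGER" else float(value)
    except ValueError:
        return value  # assumption: late mismatches are bound as text


def load(stream, conn, table="t", buffer_rows=100, infer_types=True):
    reader = csv.reader(stream)
    header = next(reader)
    if infer_types:
        # Buffer the first N rows; the stream is never rewound.
        buffered = list(itertools.islice(reader, buffer_rows))
        types = [infer([row[i] for row in buffered]) for i in range(len(header))]
    else:
        # Opt-out mode: pure TEXT, no buffering.
        buffered, types = [], ["TEXT"] * len(header)
    columns = ", ".join(f'"{name}" {t}' for name, t in zip(header, types))
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" * len(header))
    insert = f'INSERT INTO "{table}" VALUES ({placeholders})'
    # Buffered rows go in first, then the rest of the stream follows directly.
    for row in itertools.chain(buffered, reader):
        conn.execute(insert, [convert(v, t) for v, t in zip(row, types)])
    return types
```

A call such as `load(sys.stdin, sqlite3.connect(":memory:"))` would then read piped input in a single pass; passing `infer_types=False` mirrors the behaviour described for `--no-type-inference`.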
Refinement note (Sprint 1): Resolved contradictory notes about two-pass vs streaming. Adopted buffer-first single-pass approach. Updated AC and Notes accordingly.