Skip to content

SQLite scanner: use SQLite-like type affinity for determining types#18

Merged
Mytherin merged 8 commits intoduckdb:mainfrom
Mytherin:main
Nov 12, 2022
Merged

SQLite scanner: use SQLite-like type affinity for determining types#18
Mytherin merged 8 commits intoduckdb:mainfrom
Mytherin:main

Conversation

@Mytherin
Copy link
Contributor

Fixes #12 and #14

This PR reworks the SQLite scanner to use SQLite's column affinity rules, with some minor adaptations. The rules are as follows:

  1. If the declared type contains the string "INT" then it is assigned INTEGER affinity.
  2. If the declared type of the column contains any of the strings "CHAR", "CLOB", or "TEXT" then that column has TEXT affinity. Notice that the type VARCHAR contains the string "CHAR" and is thus assigned TEXT affinity.
  3. If the declared type for a column contains the string "BLOB" or if no type is specified then the column has affinity BLOB.
  4. If the declared type for a column contains any of the strings "REAL", "FLOA", or "DOUB" then the column has REAL affinity.
  5. Otherwise, the affinity is NUMERIC.

The rules are implemented as-is, however, the final catch-all type "numeric" turns out to be quite a strange type.

A column with NUMERIC affinity may contain values using all five storage classes. When text data is inserted into a NUMERIC column, the storage class of the text is converted to INTEGER or REAL (in order of preference) if the text is a well-formed integer or real literal, respectively. If the TEXT value is a well-formed integer literal that is too large to fit in a 64-bit signed integer, it is converted to REAL. For conversions between TEXT and REAL storage classes, only the first 15 significant decimal digits of the number are preserved. If the TEXT value is not a well-formed integer or real literal, then the value is stored as TEXT. For the purposes of this paragraph, hexadecimal integer literals are not considered well-formed and are stored as TEXT. (This is done for historical compatibility with versions of SQLite prior to version 3.8.6 2014-08-15 where hexadecimal integer literals were first introduced into SQLite.) If a floating point value that can be represented exactly as an integer is inserted into a column with NUMERIC affinity, the value is converted into an integer. No attempt is made to convert NULL or BLOB values.

As such, we have some additional fallback options:

  1. Convert to date if the type is exactly "date"
  2. Convert to timestamp if the type contains "time" (e.g. timestamp, datetime, timestamptz)
  3. Convert to double if the type contains "dec" or "num" (decimal, numeric)

Finally, we give up and go back to VARCHAR.

As a result of this new type conversion we can also drop a few conversion loops (smallint, int and decimal) as they are never generated.

@Mytherin Mytherin changed the title SQLite scanner: use SQLite-like type affinity for determing types SQLite scanner: use SQLite-like type affinity for determining types Nov 11, 2022
@Mytherin Mytherin linked an issue Nov 11, 2022 that may be closed by this pull request
@Mytherin Mytherin merged commit 948c839 into duckdb:main Nov 12, 2022
ak2k pushed a commit to ak2k/duckdb-sqlite that referenced this pull request Jun 7, 2025
SQLite scanner: use SQLite-like type affinity for determining types
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SQLite's LONGVARCHAR datatype is not supported SQLite's real(float) datatype is not supported

1 participant