Conversation

@alumni
Collaborator

@alumni alumni commented Sep 20, 2025

Description of change

Add support for vector columns in MySQL and MariaDB

Pull-Request Checklist

  • Code is up-to-date with the master branch
  • This pull request links relevant issues as Fixes #00000
  • There are new or updated unit tests validating the change
  • Documentation has been updated to reflect this change

@pkg-pr-new

pkg-pr-new bot commented Sep 20, 2025

typeorm-sql-js-example

npm i https://pkg.pr.new/typeorm/typeorm@11670

commit: 0faefd9

@alumni alumni force-pushed the feat/mysql-vector-support branch from 9a76fd1 to 00719e7 on September 20, 2025 15:27
@coveralls

coveralls commented Sep 20, 2025

Coverage Status

coverage: 80.794% (+0.03%) from 80.768%
when pulling 0faefd9 on alumni:feat/mysql-vector-support
into dd55218 on typeorm:master.

@alumni alumni force-pushed the feat/mysql-vector-support branch from 997b05e to 0faefd9 on November 26, 2025 00:08
@alumni alumni marked this pull request as ready for review November 26, 2025 00:30
@qodo-free-for-open-source-projects

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Division precision

The conversion from bytes to dimensions divides by 4 without checking that the byte length is a multiple of 4. If CHARACTER_MAXIMUM_LENGTH is not divisible by 4, the result is a fractional dimension count, which could lead to incorrect dimension values being stored or compared.

let length: number =
    dbColumn["CHARACTER_MAXIMUM_LENGTH"]
if (tableColumn.type === "vector") {
    // MySQL and MariaDb store the vector length in bytes, not in number of dimensions.
    length = length / 4
}
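One way to address the divisibility concern is sketched below. The helper name bytesToDimensions and the constant BYTES_PER_COMPONENT are hypothetical and not part of TypeORM; the only fact taken from the PR is that each vector component occupies 4 bytes.

```typescript
// Hypothetical helper, not part of TypeORM: converts the byte length reported
// by the database into a dimension count, rejecting values that cannot
// correspond to a whole number of 32-bit float components.
const BYTES_PER_COMPONENT = 4

function bytesToDimensions(byteLength: number): number {
    if (!Number.isInteger(byteLength) || byteLength < 0) {
        throw new RangeError(`Invalid vector byte length: ${byteLength}`)
    }
    if (byteLength % BYTES_PER_COMPONENT !== 0) {
        throw new RangeError(
            `Vector byte length ${byteLength} is not a multiple of ${BYTES_PER_COMPONENT}`,
        )
    }
    return byteLength / BYTES_PER_COMPONENT
}
```

Whether throwing is the right reaction inside schema introspection is a design choice; rounding down and logging a warning would be an alternative.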

Incomplete transformer

The vectorTransformer handles both Buffer and number[] in the from method but doesn't validate the input type in the to method. If a non-array value is passed, it will fail at runtime without a clear error message.

to: (value: number[]) => {
    const length = value.length
    const arrayBuffer = new ArrayBuffer(length * 4)
    const dataView = new DataView(arrayBuffer)

    for (let index = 0; index < length; index++) {
        dataView.setFloat32(index * 4, value[index], true)
    }

    return Buffer.from(arrayBuffer)
},
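A minimal sketch of the validation this observation asks for follows. The function name encodeVector is hypothetical, and the sketch returns a plain Uint8Array so that it has no Node.js dependency; in the PR's transformer the result would be wrapped with Buffer.from.

```typescript
// Hypothetical validated encoder, shown as a standalone function.
// Layout assumed from the PR excerpt: little-endian 32-bit floats, no prefix.
function encodeVector(value: unknown): Uint8Array {
    if (!Array.isArray(value)) {
        throw new TypeError("Expected a number[] for a vector column")
    }

    const bytes = new Uint8Array(value.length * 4)
    const view = new DataView(bytes.buffer)

    value.forEach((component, index) => {
        if (typeof component !== "number" || Number.isNaN(component)) {
            throw new TypeError(
                `Vector component at index ${index} is not a valid number`,
            )
        }
        view.setFloat32(index * 4, component, true) // true = little-endian
    })

    return bytes
}
```

Rejecting NaN is an assumption made here for illustration; whether the database accepts NaN components is not stated in the thread.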

Type mismatch

The entity property realVector is typed as number[] but the column type is real_vector which may return Buffer depending on driver configuration. This could cause type inconsistencies if the transformer is not applied correctly or if the driver behavior changes.

@Column("real_vector", {
    transformer: vectorTransformer,
})
realVector: number[]
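To keep the property type stable regardless of what the driver returns, the read side can normalize both shapes to number[]. The sketch below assumes the same little-endian float32 layout as the transformer excerpt; decodeVector is a hypothetical name, and Uint8Array stands in for Buffer (a Node.js Buffer is a Uint8Array subclass).

```typescript
// Hypothetical decoder: accepts either raw bytes or an already-decoded array
// and always returns number[].
function decodeVector(value: Uint8Array | number[]): number[] {
    if (Array.isArray(value)) {
        return value // the driver already produced numbers
    }
    if (value.byteLength % 4 !== 0) {
        throw new RangeError(
            `Vector byte length ${value.byteLength} is not a multiple of 4`,
        )
    }

    // Respect byteOffset: the bytes may be a view into a larger buffer.
    const view = new DataView(value.buffer, value.byteOffset, value.byteLength)
    const result: number[] = []
    for (let offset = 0; offset < value.byteLength; offset += 4) {
        result.push(view.getFloat32(offset, true)) // true = little-endian
    }
    return result
}
```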

@qodo-free-for-open-source-projects

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
High-level
Automate vector data transformation within driver

Integrate the ValueTransformer logic for MySQL/MariaDB vector types directly
into the MysqlDriver. This will automate the conversion between number[] and
Buffer, removing the need for manual user implementation.

Examples:

test/functional/database-schema/vectors/mysql/entity/Embedding.ts [13-59]
const vectorTransformer: ValueTransformer = {
    to: (value: number[]) => {
        const length = value.length
        const arrayBuffer = new ArrayBuffer(length * 4)
        const dataView = new DataView(arrayBuffer)

        for (let index = 0; index < length; index++) {
            dataView.setFloat32(index * 4, value[index], true)
        }


 ... (clipped 37 lines)

Solution Walkthrough:

Before:

// In user's entity file: Embedding.ts
const vectorTransformer: ValueTransformer = {
    to: (value: number[]) => {
        // Manual conversion from number[] to Buffer
        const buffer = new ArrayBuffer(value.length * 4);
        // ... fill buffer
        return Buffer.from(buffer);
    },
    from: (value: Buffer | number[]) => {
        // Manual conversion from Buffer to number[]
        // ... logic to read buffer
        return array;
    },
};

@Entity()
export class Embedding {
    @Column("vector", {
        transformer: vectorTransformer,
    })
    vector: number[];
}

After:

// In user's entity file: Embedding.ts
// No manual transformer is needed.

@Entity()
export class Embedding {
    @Column("vector")
    vector: number[]; // The driver handles the transformation automatically.
}

// In src/driver/mysql/MysqlDriver.ts (conceptual)
class MysqlDriver {
    // ...
    // The driver would override methods like `preparePersistentValue`
    // and `prepareHydratedValue` to automatically handle the
    // Buffer <-> number[] conversion for the 'vector' type.
}

Suggestion importance [1-10]: 9

Why: This is a significant design improvement that greatly enhances the developer experience for the new vector feature, making it consistent with how the ORM handles other complex types.

Impact: High
Possible issue
Add buffer length validation

Add buffer length validation in the from function of the vectorTransformer to
prevent potential RangeError exceptions when processing malformed or short
buffers.

test/functional/database-schema/vectors/sap/entity/BufferEmbedding.ts [8-35]

 const vectorTransformer: ValueTransformer = {
     to: (value: number[]) => {
         const length = value.length
         const arrayBuffer = new ArrayBuffer(4 + length * 4)
         const dataView = new DataView(arrayBuffer)
 
         dataView.setUint32(0, length, true)
         for (let index = 0; index < length; index++) {
             dataView.setFloat32(4 + index * 4, value[index], true)
         }
 
         return Buffer.from(arrayBuffer)
     },
     from: (value: Buffer) => {
+        if (value.byteLength < 4) {
+            throw new Error("Invalid fvecs buffer: too short for length prefix.")
+        }
+
         const dataView = new DataView(
             value.buffer,
             value.byteOffset,
             value.byteLength,
         )
         const length = dataView.getUint32(0, true)
+
+        if (value.byteLength !== 4 + length * 4) {
+            throw new Error("Invalid fvecs buffer: byte length does not match encoded length.")
+        }
+
         const array = new Array<number>(length)
         for (let index = 0; index < length; index++) {
             array[index] = dataView.getFloat32(4 + index * 4, true)
         }
 
         return array
     },
 }

Suggestion importance [1-10]: 5

Why: The suggestion correctly identifies missing buffer validation in a test helper function, which could lead to unhandled errors, and the proposed fix is accurate, improving the robustness of the test code.

Impact: Low

@typeorm typeorm deleted a comment from coderabbitai bot Nov 26, 2025
@alumni alumni changed the title from "feat(mysql): add support for vector columns" to "feat(mysql): add support for vector columns on MariaDB and MySQL" Nov 26, 2025
Collaborator

@gioboa gioboa left a comment

LGTM 👏💪

Collaborator

@OSA413 OSA413 left a comment

What about migrations? Will changing a vector dimension be reflected in a newly generated migration?

@alumni
Collaborator Author

alumni commented Nov 27, 2025

It should. It follows the same logic as string length (Driver.withLengthColumnTypes), so the existing tests cover it.

The only adjustment I had to make is that the length the DB reports in characters (bytes) is 4x the number of vector dimensions, because each vector component takes 4 bytes (they are 32-bit floating-point numbers). This is already covered by the tests in the PR.
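The length comparison described in this reply can be sketched roughly as follows. The interface ReportedColumn and the functions reportedLength and lengthChanged are hypothetical names for illustration, not TypeORM's internals; the only detail taken from the thread is the factor of 4 for vector columns.

```typescript
// Hypothetical shapes and names for illustration; not TypeORM's actual API.
interface ReportedColumn {
    type: string
    // Length as reported by the database (for vector columns: bytes).
    characterMaximumLength: number
}

const BYTES_PER_FLOAT32 = 4

// Normalizes the reported length so it is comparable with the declared one.
function reportedLength(column: ReportedColumn): number {
    return column.type === "vector"
        ? column.characterMaximumLength / BYTES_PER_FLOAT32
        : column.characterMaximumLength
}

// True when the declared length differs from what the database reports,
// i.e. when a generated migration would need to alter the column.
function lengthChanged(declaredLength: number, column: ReportedColumn): boolean {
    return reportedLength(column) !== declaredLength
}

const existing: ReportedColumn = { type: "vector", characterMaximumLength: 16 }
const unchanged = lengthChanged(4, existing) // 16 bytes / 4 = 4 dimensions
const changed = lengthChanged(8, existing) // declared 8, database has 4
```

Under these assumptions, a column declared with 4 dimensions matches a reported length of 16 bytes, while redeclaring it with 8 dimensions is detected as a change.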

@alumni alumni merged commit cfb3d6c into typeorm:master Nov 27, 2025
95 of 97 checks passed
ThbltLmr pushed a commit to ThbltLmr/typeorm that referenced this pull request Dec 2, 2025
mgohin pushed a commit to mgohin/typeorm that referenced this pull request Jan 15, 2026
4 participants