Enhance PostgresType to support array types (serialization, not deserialization) #2017

Merged
robfrank merged 4 commits into ArcadeData:main from ExtReMLapin:postgres_driver_arrays_support
Mar 4, 2025

@ExtReMLapin ExtReMLapin commented Feb 27, 2025

Contributor

Fixes an issue that caused arrays to be returned to the client as strings, because array types were not implemented in the Postgres types.

Serialization is implemented, deserialization is NOT implemented.
This is the first code I've written in Java, so I'm all ears and happy to fix things in this PR.

Fixed my NPM install, so pre-commit is fixed as well.
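
For readers unfamiliar with how arrays look on the wire, here is a minimal sketch of rendering a flat list as a PostgreSQL array literal in the text format (`{...}`). It is an illustration only: the function name `to_pg_array_literal` is hypothetical, it is not the Java code from this PR, and it assumes the text format rather than the binary one.

```python
def to_pg_array_literal(values):
    """Render a flat Python list as a PostgreSQL array text literal.

    Hypothetical helper for illustration; not the PR's Java implementation.
    """
    parts = []
    for v in values:
        if v is None:
            parts.append("NULL")
        elif isinstance(v, bool):
            # bool must be checked before int, since bool subclasses int
            parts.append("t" if v else "f")
        elif isinstance(v, (int, float)):
            parts.append(repr(v))
        else:
            # Quote strings and escape backslashes and double quotes
            escaped = str(v).replace("\\", "\\\\").replace('"', '\\"')
            parts.append(f'"{escaped}"')
    return "{" + ",".join(parts) + "}"


print(to_pg_array_literal([0.5, 1.25, -3.0]))
```

A client that knows the column's array type can turn such a literal back into a list, whereas a value sent with a plain text type stays a string.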

@ExtReMLapin

Contributor Author

Not going to lie, I only tested it on a 1024-element array of floats; I didn't have the data to test with ints, bools, etc.

}
} else {
// Default to text array for empty lists
valueType = PostgresType.ARRAY_TEXT;

Contributor Author

Maybe something other than defaulting to a text array would be better for empty lists; not sure.
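
To make the trade-off in this fallback concrete, here is a rough sketch of element-type inference with a text default. Only the `ARRAY_TEXT` fallback for empty lists comes from the snippet above; the other labels, the function name `infer_array_type`, and the idea of inspecting the first non-null element are assumptions for illustration.

```python
def infer_array_type(values):
    """Pick an array type label from the first non-null element.

    Only the ARRAY_TEXT fallback is taken from the PR snippet;
    the other labels are hypothetical placeholders.
    """
    first = next((v for v in values if v is not None), None)
    if first is None:
        # Empty or all-null list: there is nothing to infer from
        return "ARRAY_TEXT"
    if isinstance(first, bool):
        # bool must be checked before int, since bool subclasses int
        return "ARRAY_BOOLEAN"
    if isinstance(first, int):
        return "ARRAY_INTEGER"
    if isinstance(first, float):
        return "ARRAY_DOUBLE"
    return "ARRAY_TEXT"


print(infer_array_type([]))
print(infer_array_type([0.1, 0.2]))
```

With a fallback like this, an empty list in a column that normally holds floats would be reported with a different type than its non-empty rows, which may be the concern behind the comment.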

@lvca lvca requested a review from robfrank February 27, 2025 20:38
@lvca lvca added the enhancement New feature or request label Feb 27, 2025
@lvca lvca added this to the 25.3.1 milestone Feb 27, 2025
@robfrank

Collaborator

Can you give me an example of the data to store in the DB?
Some SQL commands to create the data would help, so I can add relevant tests.

@ExtReMLapin

Contributor Author

It's the data from this thread: #2005

https://limewire.com/d/10245dd2-fd50-4d33-865e-3a3969026047#_n2Dys-Z8KKH032BHBltTagwu_DpoRev8_PKBWQLKiA

Python code:

(Just install the package with pip install psycopg, not the binary package.)

import time

import psycopg

# Connect to ArcadeDB through its Postgres wire protocol endpoint.
with psycopg.connect(user="root", password="rootroot",
                     host="localhost",
                     port="5432",
                     dbname="ORANO_DOC",
                     sslmode="disable",
                     ) as connection:
    connection.autocommit = True
    start = time.time()
    with connection.cursor() as cursor:
        # Return each embedding vector together with the RID of its target.
        cursor.execute("""MATCH {type: EMBEDDING, as: embb}-->{ as: target}
RETURN embb.vector, target.asRID()""")
        results = cursor.fetchall()
        # The first column should now be an array rather than a string.
        print(results[0])

    print("time", time.time() - start)
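
To check on the client side whether a returned value arrived as an array or as a string, a small helper along these lines could be applied to `results[0][0]` from the script above. This is a sketch, and the name `is_float_array` is hypothetical.

```python
def is_float_array(value):
    """Return True if value is a list whose elements are all floats.

    An empty list also counts, since there is no element to contradict it.
    """
    return isinstance(value, list) and all(isinstance(x, float) for x in value)


# A properly typed array column is decoded to a list by the client,
# while the pre-fix behaviour hands back the raw text.
print(is_float_array([0.1, 0.2, 0.3]))
print(is_float_array("[0.1, 0.2, 0.3]"))
```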

@ExtReMLapin

ExtReMLapin commented Feb 28, 2025

Contributor Author

As for the data type, I ran tests with ARRAY_OF_FLOATS.

@ExtReMLapin

Contributor Author

> Not going to lie, I only tested it on a 1024-element array of floats; I didn't have the data to test with ints, bools, etc.

> As for the data type, I ran tests with ARRAY_OF_FLOATS.

Now covered by tests: #2039

@robfrank robfrank left a comment

Collaborator

LGTM

@robfrank robfrank merged commit 444bbc7 into ArcadeData:main Mar 4, 2025