Conversation
```rust
    Vec<T>: RdbmsIntoValueAndType,
{
    fn merge_types(first: AnalysedType, second: AnalysedType) -> AnalysedType {
        if let (AnalysedType::Record(f), AnalysedType::Record(s)) = (first.clone(), second) {
```
what's the need of this merge_types?
Postgres has support for custom (user-defined) types, where types are in general defined by data; this trait is used to create an AnalysedType from basic and user-defined types.
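To illustrate the idea with a self-contained mock (the real `AnalysedType` lives in golem-wasm-ast; the `Type` enum and the exact merging rules below are illustrative assumptions, not the PR's code):

```rust
// Simplified sketch of what a merge_types-style helper can do. The mock Type
// enum stands in for AnalysedType; names and rules here are illustrative.
#[derive(Clone, Debug, PartialEq)]
enum Type {
    Str,
    Record(Vec<(String, Type)>),
}

// Merge two types inferred from different pieces of data: for records, take
// the union of fields, recursively merging fields that appear in both.
fn merge_types(first: Type, second: Type) -> Type {
    match (first, second) {
        (Type::Record(f), Type::Record(s)) => {
            let mut fields = f;
            for (name, typ) in s {
                if let Some(existing) = fields.iter_mut().find(|(n, _)| *n == name) {
                    existing.1 = merge_types(existing.1.clone(), typ);
                } else {
                    fields.push((name, typ));
                }
            }
            Type::Record(fields)
        }
        // For non-record types, keep the first one.
        (first, _) => first,
    }
}

fn main() {
    let a = Type::Record(vec![("id".to_string(), Type::Str)]);
    let b = Type::Record(vec![("name".to_string(), Type::Str)]);
    println!("{:?}", merge_types(a, b));
}
```

This way a schema inferred from one value can be widened by schemas inferred from later values, which is what data-defined user types require.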
vigoo left a comment:
Nice work!
I have added an initial set of comments. (Did not review much the actual services::rdbms module).
golem-test-framework/src/dsl/mod.rs (Outdated)
```rust
    worker_id: impl Into<TargetWorkerId> + Send + Sync,
    function_name: &str,
    params: Vec<ValueAndType>,
) -> crate::Result<Result<TypeAnnotatedValue, Error>>;
```
We have introduced the ValueAndType type as a much more friendly alternative to the protobuf-generated TypeAnnotatedValue and want to use that in code (and only use TAV on the gRPC APIs). This was not fully done in existing code, but new code should use ValueAndType.
(I did not yet reach the tests that require this in the review, but hopefully they could be even better if the result type is ValueAndType.)
ok, thanks, will take a look at that (when I added the first impl. of this API, I think it was not there yet)
I changed the impl.; now there is ValueAndType in the response.
```rust
    Ok(params) => Ok(RdbmsRequest::<MysqlType>::new(pool_key, statement, params)),
    Err(error) => Err(RdbmsError::QueryParameterFailure(error)),
};
durability.persist(self, (), result).await
```
The input parameter should not be () but at least the statement and possibly also the params. This way these are added to the oplog and are visible through oplog query, useful for debugging (you see what query you did, not just that you did a query).
I changed the impl.; now there is Option<RdbmsRequest<T>> in the input (similar to the query and execute functions).
```rust
    statement: String,
    params: Vec<DbValue>,
) -> anyhow::Result<Result<DbResult, Error>> {
    let worker_id = self.state.owned_worker_id.worker_id.clone();
```
Suggested change:

```diff
-let worker_id = self.state.owned_worker_id.worker_id.clone();
+let worker_id = self.state.owned_worker_id.worker_id();
```
```rust
    statement: String,
    params: Vec<DbValue>,
) -> anyhow::Result<Result<u64, Error>> {
    let worker_id = self.state.owned_worker_id.worker_id.clone();
```
Suggested change:

```diff
-let worker_id = self.state.owned_worker_id.worker_id.clone();
+let worker_id = self.state.owned_worker_id.worker_id();
```
```rust
    self,
    "rdbms::mysql::db-connection",
    "query-stream",
    DurableFunctionType::WriteRemoteBatched(Some(begin_oplog_idx)),
```
Marking every query as WriteRemote or WriteRemoteBatched, even if they are not changing the database, is a simplification that has negative side effects.
See the following PR description to understand what this is used for: #682
If even read queries are treated as writes, that interferes with the linked logic, if there are interleaved http requests and database requests.
So it would be nice to be able to detect if a query is "read-only" and use ReadRemote function type in that case.
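For illustration, a deliberately naive heuristic for such read-only detection might look like the following (an assumption sketched for this discussion, not code from the PR; it misclassifies writing CTEs like `WITH ... AS (INSERT ...)` or stored procedure calls, so the database's own statement metadata or a proper SQL parser would be safer):

```rust
// Naive sketch: classify a SQL statement as read-only by its leading keyword.
// A real implementation would need more than keyword matching (CTEs, CALL,
// vendor-specific statements), so treat this purely as an illustration.
fn is_read_only(statement: &str) -> bool {
    let first_word = statement
        .trim_start()
        .split_whitespace()
        .next()
        .unwrap_or("")
        .to_ascii_lowercase();
    matches!(first_word.as_str(), "select" | "show" | "explain" | "describe")
}

fn main() {
    assert!(is_read_only("SELECT * FROM users"));
    assert!(!is_read_only("INSERT INTO users VALUES (1)"));
    println!("ok");
}
```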
Also note that the above logic (of checking concurrent write effects) applies automatically when you use begin_durable_function/end_durable_function with a write-remote function type. This may be the correct thing, but it needs to be a conscious decision. But only marking side effects that actually change the DB as WriteRemote/WriteRemoteBatched definitely helps.
I was not sure what the best solution is in this case.
As you mentioned, queries are in general read-only.
I think that all function invocations relating to a query stream need to be handled, in terms of durability, as one batched operation (similar to a transaction).
If DurableFunctionType::ReadRemote were used in HostDbResultStream::get_next and HostDbResultStream::get_columns (for a stream which is not in a transaction), then, for example, if the first 2 invocations of HostDbResultStream::get_next are processed/persisted and the worker then crashes, I am not sure it would be easy to continue with the next HostDbResultStream::get_next (considering that the first 2 chunks can be taken from state/oplog while the next needs to be read from the DB).
But I am not sure, maybe I do not understand something; please let me know, thank you very much.
I think you are right and maybe we are missing something (in the core). Because yes the whole set of operations (opening the transaction/stream etc) have to be retried on replay if it was interrupted - but on the other hand it is not necessarily having the "write remote" semantics which means that we did something to the outside world that we cannot undo. So for example if it was just a SELECT, or a transaction that never got committed, then in the DB it's like nothing happened.
Anyway I think this should not block this PR, but something I want to think more about.
```rust
self.observe_function_call("rdbms::mysql::db-connection", "begin-transaction");

let begin_oplog_index = self
    .begin_durable_function(&DurableFunctionType::WriteRemoteBatched(None))
```
begin_durable_function must be paired with an end_durable_function on all code paths otherwise it is leaking. I think now it is only ended in case of some errors but not on the happy path. (I may be wrong, lots of lines to review)
Currently end_durable_function is invoked in HostDbTransaction::drop, but I will also add handling to HostDbTransaction::commit and HostDbTransaction::rollback, and will check the other cases.
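One common way to make such begin/end pairing hard to miss on any code path is an RAII-style guard. A hypothetical, self-contained sketch (the real begin_durable_function/end_durable_function signatures differ, and the shared counter below merely stands in for the oplog bookkeeping):

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical guard ensuring every "begin" is paired with exactly one "end"
// on all code paths (commit, rollback, drop). The counter mimics the number
// of currently open durable functions.
struct DurableGuard {
    open_count: Rc<Cell<u32>>,
    finished: bool,
}

impl DurableGuard {
    fn begin(open_count: Rc<Cell<u32>>) -> Self {
        open_count.set(open_count.get() + 1);
        DurableGuard { open_count, finished: false }
    }

    // Explicit happy-path completion (e.g. commit or rollback).
    fn end(mut self) {
        self.finish();
    }

    fn finish(&mut self) {
        if !self.finished {
            self.open_count.set(self.open_count.get() - 1);
            self.finished = true;
        }
    }
}

impl Drop for DurableGuard {
    fn drop(&mut self) {
        // If nobody called `end`, close the durable function here so it
        // never leaks, even on early returns or panics.
        self.finish();
    }
}

fn main() {
    let open = Rc::new(Cell::new(0u32));
    {
        let g = DurableGuard::begin(open.clone());
        g.end(); // happy path
    }
    {
        let _g = DurableGuard::begin(open.clone());
        // dropped without end() -> closed by Drop
    }
    assert_eq!(open.get(), 0);
    println!("ok");
}
```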
```rust
let handle = resource.rep();
self.state
    .open_function_table
    .insert(handle, begin_oplog_index);
```
Similarly to the previous comment - this entry has to be removed on each code path when the transaction/stream is completed.
```rust
    }
}

pub trait RdbmsIntoValueAndType {
```
There should not be any need for this type class. We already have IntoValue where you implement two functions, one that returns the value and another that returns the type. There is also IntoValueAndType which is implemented automatically for T: IntoValue that gives you an into_value_and_type function just like the one here.
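A self-contained mock of the relationship described above (the real traits live in golem's wasm-rpc crate; the simplified Value/AnalysedType enums here are stand-ins, not the actual API):

```rust
// Mock of the IntoValue / IntoValueAndType split: IntoValue has a static
// get_type (no self), and a blanket impl derives IntoValueAndType from it.
#[derive(Debug, PartialEq)]
enum Value {
    U32(u32),
    Str(String),
}

#[derive(Debug, PartialEq)]
enum AnalysedType {
    U32,
    Str,
}

trait IntoValue {
    fn into_value(self) -> Value;
    fn get_type() -> AnalysedType; // static: the type is known without a value
}

trait IntoValueAndType {
    fn into_value_and_type(self) -> (Value, AnalysedType);
}

// Blanket impl: anything with a statically known type gets
// into_value_and_type for free (this is what causes E0119 conflicts
// when trying to also implement IntoValueAndType for Option<T> etc.).
impl<T: IntoValue + Sized> IntoValueAndType for T {
    fn into_value_and_type(self) -> (Value, AnalysedType) {
        (self.into_value(), T::get_type())
    }
}

impl IntoValue for u32 {
    fn into_value(self) -> Value { Value::U32(self) }
    fn get_type() -> AnalysedType { AnalysedType::U32 }
}

impl IntoValue for String {
    fn into_value(self) -> Value { Value::Str(self) }
    fn get_type() -> AnalysedType { AnalysedType::Str }
}

fn main() {
    assert_eq!(42u32.into_value_and_type(), (Value::U32(42), AnalysedType::U32));
    println!("ok");
}
```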
Because Postgres has custom (user-defined) types, which basically means recursion, I needed the possibility to create the value and type together (ValueAndType).
As you also mentioned, there is IntoValueAndType, but as there is impl<T: IntoValue + Sized> IntoValueAndType for T
https://github.com/golemcloud/golem/blob/main/wasm-rpc/src/value_and_type.rs#L92-L96
when I wanted to implement IntoValueAndType directly for some types, in some cases it ended with
https://doc.rust-lang.org/error_codes/E0119.html
so I added RdbmsIntoValueAndType. I am not sure if there is a different solution to the problem, but I will think about that.
The reason why there is a separate IntoValue and IntoValueAndType type class is that IntoValue has a static get_type method (not requiring self). That is useful in many cases and simple to implement for most types. However there are some more dynamic cases where you don't know the type statically. For those, we are implementing IntoValueAndType directly (and not implementing IntoValue) because there you have access to self to produce both the value and type.
I think if you would only implement IntoValueAndType in these cases it should work
(For example, one such type is impl IntoValueAndType for SerializableInvokeRequest.)
Could you give it one more try? It would be nice not to have an extra trait for this.
I tried one more time to figure out how to use IntoValueAndType, but was not able to find a solution.
It is possible to implement IntoValueAndType for a specific type (similar to the mentioned SerializableInvokeRequest), but if it needs to be wrapped in another common type like Option, Vec or Result (for example Option<SerializableInvokeRequest>), it will not work:
```
     |
1026 |     Ok(payload.into_value_and_type())
     |                ^^^^^^^^^^^^^^^^^^^ method cannot be called on `Option<SerializableInvokeRequest>` due to unsatisfied trait bounds
     |
    ::: /Users/coon/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/option.rs:572:1
     |
 572 | pub enum Option<T> {
     | ------------------ doesn't satisfy `_: IntoValueAndType` or `_: IntoValue`
     |
     = note: the following trait bounds were not satisfied:
             `std::option::Option<SerializableInvokeRequest>: IntoValue`
             which is required by `std::option::Option<SerializableInvokeRequest>: IntoValueAndType`
             `&std::option::Option<SerializableInvokeRequest>: IntoValue`
             which is required by `&std::option::Option<SerializableInvokeRequest>: IntoValueAndType`
             `&mut std::option::Option<SerializableInvokeRequest>: IntoValue`
             which is required by `&mut std::option::Option<SerializableInvokeRequest>: IntoValueAndType`
```

If I want to add impl<T: IntoValueAndType> IntoValueAndType for Option<T>, it ends with compilation issues:

```
error[E0119]: conflicting implementations of trait `value_and_type::IntoValueAndType` for type `std::option::Option<_>`
   --> wasm-rpc/src/value_and_type.rs:291:1
    |
 93 | impl<T: IntoValue + Sized> IntoValueAndType for T {
    | ------------------------------------------------- first implementation here
...
291 | impl<T: IntoValueAndType> IntoValueAndType for Option<T> {
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation for `std::option::Option<_>`
For more information about this error, try `rustc --explain E0119`.
```

Based on what I have seen, https://std-dev-guide.rust-lang.org/policy/specialization.html could be a solution for that problem, but this is an unstable feature.
Do you have some different ideas?
Yeah, I see; and it is not possible to implement impl<T: IntoValueAndType> IntoValueAndType for Option<T> because you cannot figure out the type information in the None case. In your trait there is still a get-type method. This makes me realize I may not fully understand the root problem :)
If you can return an AnalysedType, you could just implement IntoValue. So I guess in some cases this base type is not the truth? So for these recursive types you just return something "random" if it's None?
> So for these recursive types you just return something "random" if it's None

It is not a completely random type. The RdbmsIntoValueAndType implementation is the following:
```rust
pub trait RdbmsIntoValueAndType {
    fn into_value_and_type(self) -> ValueAndType;
    fn get_base_type() -> AnalysedType;
}

impl<T: RdbmsIntoValueAndType> RdbmsIntoValueAndType for Option<T> {
    fn into_value_and_type(self) -> ValueAndType {
        match self {
            Some(v) => {
                let v = v.into_value_and_type();
                ValueAndType::new(
                    Value::Option(Some(Box::new(v.value))),
                    analysed_type::option(v.typ),
                )
            }
            None => ValueAndType::new(Value::Option(None), Self::get_base_type()),
        }
    }

    fn get_base_type() -> AnalysedType {
        analysed_type::option(T::get_base_type())
    }
}
```

It is more like a combination of IntoValueAndType and IntoValue.
So in the case of the Postgres DbValue, where there are common and user-defined types, DbValue::get_base_type() returns the "common" schema only, but DbValue::into_value_and_type may have a schema with additional custom user-defined types, if the specific DbValue holds data with a user-defined type.
Ok let's leave it like this for now
```rust
    }
}

#[test]
async fn postgres_create_insert_select_test(
```
Maybe these tests are necessary, but I couldn't yet understand why. Could you please explain, @justcoon?
Those tests are testing conversions and type mappings, query execution and the transaction implementations.
```rust
    $30, $31, $32, $33, $34, $35, $36, $37, $38::tsvector, $39::tsquery,
    $40, $41, $42
);
"#;
```
Were these required because of the type conversions that we have?
Yes, to test conversions and type mappings.
LGTM @justcoon nice one.
fixes: #1016
/claim #1016
rdbms host implementation for postgres and mysql
- `sqlx` rust library
- mysql (docker)
- `invoke_and_await_typed` functions to `WorkerService` (the typed response is possible to transform to json, which then makes test results comparison easier)

TODO

- `query_stream` implementation for transaction (it may require implementing a custom sqlx `Executor`, as the provided implementation is for mut connections, and then there is an error like `error[E0515]: cannot return value referencing local variable transaction`)
- `SerializableError` implementation for rdbms errors (initial impl. uses `SerializableError::Generic`; should `SerializableError::Rdbms { error: crate::services::rdbms::Error }` be added? what is the preferred approach for `SerializableError` when adding new modules?) - update: `SerializableError::Rdbms` was added
- `ValueAndType`, where `AnalysedType` is in general a WIT type definition; this is an issue in case of rdbms::postgres, which has recursive types, and there is a question what can be done in this case
- `IntoValueAndType` can possibly be used instead of `RdbmsIntoValueAndType` (`RdbmsIntoValueAndType` was added because of https://doc.rust-lang.org/error_codes/E0119.html)