Conversation
1.12.0 triggers a bug in either eastwood or the new clojure version that causes a weird exception error during eastwood linting. I bumped it to 1.12 previously to save a few kestrokes in non-essential code in some tests so it's easiest to bump it back down until the bug is fixed.
dpetran
left a comment
There was a problem hiding this comment.
Here's how I've understood the changes here:
Instead of a singular store, a Connection now has a storage Catalog that encapsulates multiple stores. The connection can now read from any of the specified stores, using the address to discern which store in the catalog is the desired one. The connection must pick a specific storage to write to.
We also have a RemoteSystem, but I don't actually see that being used. They are Publications, which can subscribe to a ledger. Subscribed publications are sent new commits via websocket. There's no concept of a remote storage, just a subscribed ledger. They are also JsonArchives, so we can read from them.
Everything is now run through the Connection, appropriately, since it connects everything together.
|
|
||
| ;; ledger operations | ||
|
|
||
| (defn parse-connection-options |
There was a problem hiding this comment.
I'm generally not a fan of putting a function in a different ns than the only place it is used, but I think we might want to tuck this into connection or system. And maybe move promise-wrap into a util ns while we're at it.
There was a problem hiding this comment.
This is still a work in progress, but I think the changes I'm envisioning are big enough to warrant their own pr so I decided to leave this as is for now.
Since the cyclic namespace dependencies have (mostly) been cleaned up, I think we should do two things to clean up the clojure api a bit more: (1) remove the "ledger" as a first class api object and (2) have the fluree.db.connection namespace be the api entry point (appropriately renamed, of course).
Ledger data is already cached on the connection object, and the current ledger object is another, in my opinion redundant, source of state. We should load dbs and transact through the connection object, both localizing the application state to the connection and simplifying the api.
We have a fluree.db.api namespace because the namespace hierarchy was so complex and cyclic. now that most functionality flows through the connection, the code in the connection namespace can serve as the main api with a little more cleanup, and the separate api (as well as transact and query api namespaces) would be made redundant.
| (let [{:keys [method] :as opts*} (parse-connection-options opts) | ||
|
|
||
| config (case method | ||
| :remote (<? (remote-conn/connect opts*)) |
There was a problem hiding this comment.
If we no longer have a :remote method we can remove that from the parse-connection-options function.
| {:status 400, :error :db/unsupported-operation}))))] | ||
| (system/start config))))) | ||
|
|
||
| (defn connect-file |
There was a problem hiding this comment.
Why lose this one and not the others? Just curious.
There was a problem hiding this comment.
We're going to lose all of them eventually. This one just wasn't used by any of the tests.
| (:require [fluree.db.connection :as connection] | ||
| [fluree.db.cache :as cache] | ||
| [fluree.db.storage :as storage] | ||
| [fluree.db.remote-system :as remote] |
There was a problem hiding this comment.
It is used, but only by code in fluree/server as of now.
|
|
||
| (defrecord Ledger [id address alias did state cache primary-publisher | ||
| secondary-publishers commit-storage index-storage reasoner]) | ||
| (defrecord Ledger [conn id address alias did state cache commit-storage |
There was a problem hiding this comment.
Should this be commit-catalog and index-catalog?
| (go-try | ||
| (loop [nses nameservices] | ||
| (when-let [nameservice (first nses)] | ||
| (or (<? (nameservice/lookup nameservice ledger-address)) |
There was a problem hiding this comment.
Do we have any way to ensure that this address is the most up-to-date? Say we have two nameservices, one is offline for quite a while but then comes back on right as we load, will we fork at that point? And will we have a mechanism for "rebasing" if we need to later?
There was a problem hiding this comment.
After discussing, this will create a fork. Also, if the only nameservice with a ledger is read only and the user transacts to it, a local fork will be created.
This patch removes the remote connection and replaces it with the
fluree.db.remote-system/RemoteSystemcomponent that is integrated in the generalized connection objects. Remote system objects implement thefluree.db.storage/JsonArchiveprotocol, so they behave like read only storage for both commits and index nodes. They also implement thefluree.db.nameservice/iNameServiceprotocol to allow for commit lookups and thefluree.db.nameservice/Publicationprotocols to allow for subscriptions.General connections can have multiple name services, storage components, publications, and publishers, and the remote systems just sit along side all of those components and behave the same way as they do.
Because there can now be multiple storage components, this patch also adds the
fluree.db.storage/Catalogrecord to integrate storage objects and route reads or writes to the correct storage components based on the address of the data being written or read.To differentiate between addresses with the same storage method on different systems (e.g.
fluree:file://...on the local and remote system), I've introduced an optional storage identifier associated with storage components. These are not necessary for a fully local system with only a single storage component per storage method, but it's required for multiple components with the same method or for remote systems. The identifier has a namespace and a name separated by a '/' and can't contain a ':', and they work the same way as maven coordinates. The namespace should be something the user controls, like a domain, and the name part should be unique within the namespace. When storage identifiers exist on the storage component, addresses for that component will contain the identifier. This allows those addresses to be usable as part of a broader network of interconnected systems.Besides these changes, I have done some more work to disentangle namespace dependencies making a lot of functionality go through connections, and start in the connection namespace.
This patch builds on #876, #858, #882, and #888, in that order, so please review those first if you haven't already.