Conversation
…nto refactor/general-connection
|
|
||
| (derive :fluree.storage/file :fluree/content-storage) | ||
| (derive :fluree.storage/file :fluree/byte-storage) | ||
| (derive :fluree.storage/file :fluree/json-archive) |
There was a problem hiding this comment.
What's the semantic distinction between these types of storage? content-storage and byte-storage I get, but what's a json-archive?
There was a problem hiding this comment.
json-archive corresponds to the fluree.db.storage/JsonArchive protocol. It's a read-only protocol with a single method for retrieving json data. It's the other side of the content-addressed store protocol, which only has a method for writing data. most of the storage implementations implement both protocols, but the remote storage implementation only implements JsonArchive because you can't modify data in a remote store; you can only read it.
| (-index-file-delete [conn file-address] "Deletes an index item from storage")) | ||
| (-txn-read [conn txn-key] "Reads a transaction from storage")) | ||
|
|
||
| (comment |
There was a problem hiding this comment.
Is this comment still accurate? My understanding is that the Connection ledger state is a map of ledger-alias->channel[Ledger].
dpetran
left a comment
There was a problem hiding this comment.
Ok, this looks good to me.
Here's how I understand what's different:
There is now only one Connection (!!!). The connection has these storage keys:
:store, for implementingiConnection, reading and writing commits and txns:index-store, for reading and writing index nodes:primary-nsand:aux-nses, which use a mutable :byte-storage store.
I'm a fan of all the different protocols for storage, but I'm not sure I understand why :index-store is separate from :store, since I think they are the same in practice (I think, for instance, that both keys point to the same store instance).
A Connection is still a single-store-for-all-ledgers record, you couldn't have one ledger using a MemoryStore and another using a FileStore in the same Connection, I think.
| [parallelism cache-max-mb defaults] | ||
| (-> (base-config parallelism cache-max-mb defaults) | ||
| (assoc :fluree.storage/memory {}) | ||
| (assoc-in [:fluree.nameservice/storage-backed :address-prefix] "fluree:memory://"))) |
There was a problem hiding this comment.
This is a personal bugaboo, so feel free to not take it seriously, but I would really love to see the double-slash // removed. Tim Berners-Lee calls the decision to add the double slash his chief regret, and I'd really love to not perpetuate it : )
There was a problem hiding this comment.
The intention here has always been that this allows us to distinguish a location. it is probably misused a bit here with 'memory' as doesn't really make sense in this situation.
When a // exists, we know a remote location is being specified (e.g. IP/dns address, but it could be a P2P network address or otherwise).
I've looked into multi-addr as an alternative, but it has more complexity than I think we'll ever need and it is something new people have to learn. For that reason I have preferred the // as, like it or not, it is an established convention.
An index store might be backed by the same underlying storage found under the But, besides all that, I would like to further decouple index storage from commit storage because these are two different types of data that the system uses in different ways, so there shouldn't be a constraint that they use the same storage medium. I could easily imagine a use case where a user would like to store indexes on the local filesystem, but commits on ipfs, for example.
This should also change eventually. I think a user should be able to configure storage on a per-ledger basis. They should even be able to configure multiple storage media on a single ledger, but I'm changing enough right now so I think this specific change should wait until later. |
This patch remove the specific connection types in favor of a generalized connection record with a configuration system to provide specific features the old connections had. The configuration and system is managed with integrant, and I think we should add validation to connection configs in a later pull request, but I left that out here.
This patch also removes many methods from the connection protocols, but more work needs to be done to disentangle and remove circular dependencies from the code that operates on connections before we can remove the connection protocols all together, and that work will come in a later pull request.
This patch also adds an index-store to back
FlakeDBrecords that combines storage, serialization, and caching so thatFlakeDBrecords will eventually no longer need access to connections to function. More work has to be done to disentangle dbs from connections which will come in a later pull request.These changes are based on #858, so please review that first if you haven't already.