Conversation
|
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
|
Hi @H4ad ! Could you resolve the conflicts? Thanks |
allevo
left a comment
There was a problem hiding this comment.
This PR seems ok. I left some comments. In particular, related to DocumentID: why do we need to support both, number or string?
| delete store.docs[internalId] | ||
| store.count-- | ||
|
|
||
| return true |
There was a problem hiding this comment.
we should remove also from sharedInternalDocumentStore, right?
There was a problem hiding this comment.
If we just remove it from sharedInternalDocumentStore.idToInternalId, it's fast enough.
If we try to remove it from sharedInternalDocumentStore.internalIdToId, it will be slower as sorter.
What we can do is perform a cleanup of internalIdToId on serialize.
What do you prefer?
There was a problem hiding this comment.
Just to make sure, so no need to remove the id from the sharedInternalDocumentStore, right?
There was a problem hiding this comment.
No sorry, we should implement it.
For now, we don't care about the performance during the remove.
There was a problem hiding this comment.
For now, we don't care about the performance during the remove.
Agreed. I'd dedicate a separate PR to this
| @@ -0,0 +1,70 @@ | |||
| import { Orama } from '../types.js'; | |||
|
|
|||
| export type DocumentID = string | number | |||
There was a problem hiding this comment.
My idea was to introduce less breaking change as possible (I also don't know what is public API and what is internal API).
Someone that changes the implementation of sorter, index, etc... will not need to modify their code in order to accept this change.
But if you want to go full breaking change mode, I can use InternalID in every code (but still return the original ID on search).
There was a problem hiding this comment.
I'm ok to introduce a little breaking change with this. Regarding the naming. Is this a documented or IndexId?
There was a problem hiding this comment.
This will bump Orama to v1.1.0
There was a problem hiding this comment.
Ok, so I will change all the references for DocumentID to InternalID, and I will only use DocumentID on getByID and when we return the documents from search.
Regarding the naming. Is this a documented or IndexId?
I didn't understand your question but DocumentID is just an alias to reference the ID of the Document that was generated or was passed by the user, we you see this type, is referring to these two cases.
But I think we should add some documentation about it on Orama Docs, just to be clear about how we store IDs to be more efficient.
There was a problem hiding this comment.
Ok, understood you point. fine for me
allevo
left a comment
There was a problem hiding this comment.
Again, amazing work!
LGTM
micheleriva
left a comment
There was a problem hiding this comment.
Terrific job @H4ad. As always!
LGTM
/claim #426
This is my attempt to closes #426, in this implementation, I added a new component called
internalDocumentIDStorewhich is responsible for keeping the internal IDs and also has a list to reverse those IDs.About the performance, to compare this change with the old behavior, here are the current stats for the old behavior:
And the new behavior is:
database-size.mjs
This is what looks like the current serialization:
{ "internalIdStore":{ "internalIdToId":[ "4dca7125-6c6f-461c-9cb0-b0dba12119bc" ] }, "index":{ "indexes":{ "name":{ "word":"", "subWord":"", "children":{ "b":{ "word":"ba2c536", "subWord":"ba2c536", "children":{ }, "docs":[ 1 ], "end":true } }, "docs":[ ], "end":false } }, "searchableProperties":[ "name" ], "searchablePropertiesWithTypes":{ "name":"string" }, "frequencies":{ "name":{ "1":{ "ba2c536":1 } } }, "tokenOccurrencies":{ "name":{ "ba2c536":1 } }, "avgFieldLength":{ "name":1 }, "fieldLengths":{ "name":{ "1":1 } } }, "docs":{ "docs":{ "1":{ "id":"4dca7125-6c6f-461c-9cb0-b0dba12119bc", "name":"ba2c536" } }, "count":1 }, "sorting":{ "sortableProperties":[ "name" ], "sortablePropertiesWithTypes":{ "name":"string" }, "sorts":{ "name":{ "docs":{ "1":0 }, "orderedDocs":[ [ 1, "ba2c536" ] ], "type":"string" } }, "enabled":true, "isSorted":true } }