Skip to content

Support for indexes on fields of embedded properties type like EMBEDDED/EMBEDDEDLIST/EMBEDDEDMAP/EMBEDDEDSET#7093

Closed
theleapofcode wants to merge 8 commits intoorientechnologies:developfrom
theleapofcode:develop
Closed

Support for indexes on fields of embedded properties type like EMBEDDED/EMBEDDEDLIST/EMBEDDEDMAP/EMBEDDEDSET#7093
theleapofcode wants to merge 8 commits intoorientechnologies:developfrom
theleapofcode:develop

Conversation

@theleapofcode
Copy link

I am using OrientDB as a document store and need to query nested fields of a document among millions of records. Hence i needed indexing support on fields inside EMBEDDED object, EMBEDDEDLIST/SET/MAP. I saw the above requirement is labelled with 3.0 milestone - #5201 .

I know this feature is in your roadmap but i need it asap. I am waiting on this feature for a customer deployment and it is blocked because of performance issues without indexing.

I have added support for this in my commit. All tests are running fine. Please review it and merge it and make it available in next release asap. I have come a long way in using OrientDB in my project and now i am blocked on a customer deployment due to this.

Please take a look at my UT in orientdb-core - "com.orientechnologies.orient.core.sql.EmbeddedIndexingTest". These are all the scenarios i wanted and with my changes/commit, they are working fine.

…ike EMBEDDED/EMBEDDEDLIST/EMBEDDEDSET/EMBEDDEDMAP
Merging with orienttechnologies/develop before pull request
@theleapofcode
Copy link
Author

Any updates on this PR? I am blocked on this issue for a customer deliverable and getting this merged would be of great help.

@lvca
Copy link
Member

lvca commented Jan 27, 2017

The team is getting more time to analyze this because we're under release of v3.0M1. Please give us some more days.

@tglman
Copy link
Member

tglman commented Jan 27, 2017

hi @theleapofcode,

This is pretty impressive but anyway it miss some points that is actually the complicate part of the index management, the handling of index changes on update, I got your code and tried it out, it work and is pretty nice.

Anyway changing this snippet:

	Map<String, ODocument> emap1 = new HashMap<String, ODocument>();
	emap1.put("personal", mobile3);
	emap1.put("official", mobile4);
	person1.field("mobilesmap", emap1);
	person1.save();

To:

        Map<String, ODocument> emap1 = new HashMap<String, ODocument>();
	emap1.put("official", mobile4);
	person1.field("mobilesmap", emap1);
	person1.save();

	emap1 = person1.field("mobilesmap");
	emap1.put("personal", mobile3);
	person1.save();

a test failed.

So is so far good, but we cannot include this till the updated is supported, in any case as soon as we start to work on this we can start from this pull request.

if you will to work also in include the support of update we can give you any help/info you need.

I'm aware that this is quite important for you so we will try to push a bit on this as well.

Regards

@theleapofcode
Copy link
Author

Hi @tglman ,

Thanks for the review and feedback. I have pushed a fix for properly handling indexes on embedded fields upon updating of records. Also added UTs to check update scenarios for all cases. And the scenario you mentioned in your review is also working. Please review my PR and merge it asap.

Regards,
Dileep

@theleapofcode
Copy link
Author

Added a commit for support of lucene index on embedded fields just like others along with UTs. Please review the code and merge asap.

@smolinari
Copy link
Contributor

Nice work! If a merge can happen, it would be fantastic, as this is a much needed addition to ODB.

After this, all we then need is automatic class and property creation upon insert and allowing indexing without schema and we are at a true No-SQL solution like Mongo. 😄

Scott

@lvca
Copy link
Member

lvca commented Jan 31, 2017

@tglman Please take care of this PR. Thanks @theleapofcode for this.

@tglman
Copy link
Member

tglman commented Feb 2, 2017

Hi,

I'm sorry to stop you again, anyway this time I'll try to list all the requirements needed for declare the embedded index supported:

I annotate the cases this pull request support with an (Ok,No,Maybe)

  • Support of indexing of a plain nested property "a.b.c" (Ok)
  • Support of indexing of nested collection "a.b.c" where b is a collection (Ok)
  • Support of indexing of deep nesting "a.b.c.d.e...." (No)
  • Support of indexing of nested collection of collection "a.b.c" where all fields "a","b","c" are embedded collections without document structures in the middle (Maybe).
  • Support of update of a single or multiple element of a nested structure (Only Some cases)
  • Support remove of index nested filed (Maybe)
  • Support remove of index nested filed with a mix of nested collections and document (No)

This are not requirement for OrientDB but from your example you want them inside

  • Support of index of filtered maps (Ok).
  • Support of index of filtered collections (is not described in this PR, but for completeness is a requirement) (No)

Anyway you are not too far, all you need to do is support in a better way the events on the nested operations, inside orient we have a tracking structure that let us know what value is actually changed of the document and as well keep track of deleted values for clean up the index, you already manipulated the code that use it : https://github.com/orientechnologies/orientdb/pull/7093/files#diff-221d36d3f64505bf02bf2315fc8570eeR216 but there is not changes that show the support of nesting.
In the specific there is a: https://github.com/orientechnologies/orientdb/blob/develop/core/src/main/java/com/orientechnologies/orient/core/record/impl/ONestedMultiValueChangeEvent.java that keep the changes of a nested collection of a nested collection, and it can have inside eventually a document that has is own tracking that should get out and checked as well.

Another point i see you do most of the tests just on SQL, that is a bit limited use case because it not yet support some way to update like: ((Map<String,ODocument>) doc.field("map")).get("some").remove("nested_field"), this with the current code is not going to trigger the index because of the missing management for nested events.

So for consider this pull request done and mergiable should support the cases listed before, with arbitrary nesting in both SQL and Java api.

I'm aware you really need this change, but is farly important for us to merge only complete features.
Feel free to ask any question or if you need reach us for help is something is not clear, your work is definitely valuable, we just need to be fineshed up.

Regards

@lvca
Copy link
Member

lvca commented Feb 3, 2017

@tglman thanks for the detailed analysis. @theleapofcode this PR is something OrientDB users really wanted and they were waiting for it since long time ago (see @smolinari). Please let us know if you have time to work on it so we can put in the upcoming 3.0M1 next week. Thanks so much.

@theleapofcode
Copy link
Author

Thanks @tglman for the detailed analysis.

@lvca I would love to contribute this feature as I and lot of others are waiting for this. But its very hard for me to find time this week as i am neck deep into my project work. Even for this PR, i could somehow squeeze a day worth of time and contribute. But as @tglman pointed, there are lot more analysis and cases i need to add to make it complete and frankly i wont be able to complete it this week.

For my immediate requirement in my project, my fork works fine and i can live with it till its officially added. Also whatever i have done seems to be in the right direction and if you guys can prioritize this feature and build upon my PR and include it in next release, it would be great for me and lot of others.

Let me see if i can squeeze few hours here and there to work on this but i cannot commit dedicated time. Will keep you updated if i find time to work on this. Hope you can understand. Looking forward to release of this feature asap.

Regards,
Dileep

@lvca
Copy link
Member

lvca commented Feb 15, 2017

Hi guys, any news on this? I see users demand this feature. Look at: https://groups.google.com/d/msg/orient-database/wVUXCuzdVeY/quQ9SI8zBQAJ

@theleapofcode
Copy link
Author

Hi @lvca,

I have not yet worked on this. And to be frank will not be able to work on this anytime soon.. Busy with lot of work on my project and other stuff. If you guys can prioritize this requirement and add it in your upcoming release, it would be great. Will keep you updated in case i do work on this if time permits.

Regards,
Dileep

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

8 participants