Skip to content

Write column OID to the source#14636

Closed
BaurzhanSakhariev wants to merge 22 commits intodrop-columnfrom
b/write-oid2
Closed

Write column OID to the source#14636
BaurzhanSakhariev wants to merge 22 commits intodrop-columnfrom
b/write-oid2

Conversation

@BaurzhanSakhariev
Copy link
Contributor

@BaurzhanSakhariev BaurzhanSakhariev commented Aug 31, 2023

Next step after #14635

TODO:

  • (reads) Change LuceneReferenceResolver/Lucene reading logic to access data by oid instead of column name
  • (reads) Need to update query building to use oid instead of column name
  • Primary key lookup? (check PKLookupOperation)
  • (indexing) Change mapper implementations to use oid as field name
  • (indexing) Change source generation to use oid as field name
  • optional Change _raw lookup to rewrite oid to name and don’t' include dropped columns
  • optional select _raw must hide dropped columns

This is a pre-requisite for using column OID-s instead of names in the _source.

Phase 1: Collect new columns and do a schema update. Don't create a Lucene document with source at this point.
Phase 2: Index. Target references still might have unassigned OID since this is only a preparation step.
Base automatically changed from b/oid-in-src2 to drop-column September 4, 2023 17:24
@BaurzhanSakhariev BaurzhanSakhariev force-pushed the b/write-oid2 branch 3 times, most recently from 2f3d562 to 0622fcf Compare September 5, 2023 12:36

@Nullable
public final ValueIndexer<? super T> valueIndexer(RelationName table,
public final ValueIndexer<? super T> valueIndexer(DocTableInfo table,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole second commit was needed only to make ObjectIndexer aware of table.versionCreated. As you noted in the prev comment, we need to parse sourceKeyWriter instead.

I mean, I didn't want to inject version in the indexer like

if (valueIndexer instance of ObjectIndexer objectIndexer) { only ObjectIndexer cares about Version
    objectIndexer.someNewMethodToInjectVersion(version)
}

as I thought it's not nice - but it looks shorter. I will drop second commit, and do direct injection instead. Will pass sourceKeyWriter as suggested above.

Copy link
Contributor Author

@BaurzhanSakhariev BaurzhanSakhariev Sep 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok, I recalled why I didn't want to do "direct injection" - in case of nested indexers, need to propagate it
for Object/Array/Dynamic indexers. I will think how I can propagate that shortly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added soureKeyWriter to the indexValue and then it gets propagated to all nested (or newly created by DynamicIndexer) ObjectIndexers. Thanks for the suggestion!

I rebased the PR

@BaurzhanSakhariev BaurzhanSakhariev force-pushed the b/write-oid2 branch 2 times, most recently from cef00d6 to cd5ee19 Compare September 5, 2023 14:49
Comment on lines 534 to 544
addedColumnsByIdent.put(ref.column(), ref);
// Also add to columnsByIdent,so that getRef supplier doesn't get stale.
columnsByIdent.put(ref.column(), ref);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed for the nested ObjectIndexer-s, after phase 1 table can get stale.
We update explicitly only targets via Indexer.updateTargets.

Before it was not important since we used column name anyway, now we are using getRef.apply on each ObjectIndexer.indexValue since we need to get OID now.

@BaurzhanSakhariev BaurzhanSakhariev force-pushed the b/write-oid2 branch 2 times, most recently from b37e849 to 6f94952 Compare September 5, 2023 15:02
Comment on lines 371 to +376
Symbol[] returnValues) {
Iterator<Reference> refsIterator = table.iterator();
while (refsIterator.hasNext()) {
Reference reference = refsIterator.next();
columnsByIdent.put(reference.column(), reference);
}
Copy link
Contributor Author

@BaurzhanSakhariev BaurzhanSakhariev Sep 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not using table.columns() since we need all columns, including nested ones.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants