Skip to content

Allow to specify highlighter parameters on a per field level basis#356

Closed
lukas-vlcek wants to merge 1 commit intoelastic:masterfrom
lukas-vlcek:73200c9d1ee5fd288182417138428dbfd8b70c80
Closed

Allow to specify highlighter parameters on a per field level basis#356
lukas-vlcek wants to merge 1 commit intoelastic:masterfrom
lukas-vlcek:73200c9d1ee5fd288182417138428dbfd8b70c80

Conversation

@lukas-vlcek
Copy link
Copy Markdown
Contributor

Now it is possible to override global highlighter settings at the field level. On top of this it is also possible to specify "fragment_type" : "content" which will return highlighted content of the field (no fragments).

@kimchy
Copy link
Copy Markdown
Member

kimchy commented Sep 5, 2010

Few changes that I plan to push:

  • The global setting are not used. Also, you can't rely on the order the elements arrive in the json (so the globals are set before the fields).
  • You can't rely on hashcode to provide uniqueness when constructing the map key.
  • I am going to simplify things a bit in how thins are parsed (no need for global element).
  • Also, single fragment builder will be used when setting number of fragments to 0.

@kimchy
Copy link
Copy Markdown
Member

kimchy commented Sep 5, 2010

Allow to specify highlighter parameters on a per field level basis, closed by cc1eac1.

@lukas-vlcek
Copy link
Copy Markdown
Contributor Author

Cool, just the javadoc needs to be updated according to the changes: http://github.com/elasticsearch/elasticsearch/blob/cc1eac147a5278b46dfa18481dbbd2ed2461aa19/modules/elasticsearch/src/main/java/org/elasticsearch/search/highlight/HighlighterParseElement.java#L44
Replace "fragment_type" : "content" with "number_of_fragments" : 0

@lukas-vlcek
Copy link
Copy Markdown
Contributor Author

There is one major issue. If numberOfFragments is 0 then the Highlighter does not produce any fragments. Dirty workaround is to add the following line of code after this line: http://github.com/elasticsearch/elasticsearch/blob/cc1eac147a5278b46dfa18481dbbd2ed2461aa19/modules/elasticsearch/src/main/java/org/elasticsearch/search/highlight/HighlightPhase.java#L100

field.numberOfFragments(1);

@lukas-vlcek
Copy link
Copy Markdown
Contributor Author

I think it needs to be set to any number greater then 0. I did not check the Lucene source code but after I did the above change then the highlighter correctly yielded whole field content with highlighted text.

@kimchy
Copy link
Copy Markdown
Member

kimchy commented Sep 6, 2010

applied.

alpar-t added a commit to alpar-t/elasticsearch that referenced this pull request May 9, 2018
- Apply workaround for: GradleUp/shadow#336
- bump plugin to 2.0.4

Changes between 2.0.2 and 2.0.4 of the plugin:
```
477db40 12 days ago john.engelman@target.com Release 2.0.4
3e3da37 3 weeks ago john.engelman@target.com Remove internal Gradle API and annotation internal getters on shadow jar.
31e2380 3 weeks ago john.engelman@target.com Close input streams. Closes elastic#364
f712cc8 3 weeks ago john.engelman@target.com Upgrade ASM to 6.1.1 to address perf issues. Closes elastic#374
2f94b2b 3 weeks ago john.engelman@target.com next version
23bbf3d 7 weeks ago john.r.engelman@gmail.com Add some gradle versions. Update changelog for 2.0.3
7435c74 7 weeks ago john.r.engelman@gmail.com Merge pull request elastic#367 from ttsiebzehntt/366-java10
325c002 7 weeks ago info@martinsadowski.de Update ASM to 6.1
94550e5 3 months ago john.r.engelman@gmail.com Merge pull request elastic#356 from sgnewson/update-file-to-files
66b691e 4 months ago john.r.engelman@gmail.com Merge pull request elastic#358 from 3flex/patch-1
14761b1 4 months ago 3flex@users.noreply.github.com fix markdown for User Guide URL in issue template
a3f6984 4 months ago newson@synopsys.com update inputs.file to inputs.files, to remove warning
```

closes elastic#30389
alpar-t added a commit that referenced this pull request May 10, 2018
* Solve Gradle deprecation warnings around shadowJar

- Apply workaround for: GradleUp/shadow#336
- bump plugin to 2.0.4

Changes between 2.0.2 and 2.0.4 of the plugin:
```
477db40 12 days ago john.engelman@target.com Release 2.0.4
3e3da37 3 weeks ago john.engelman@target.com Remove internal Gradle API and annotation internal getters on shadow jar.
31e2380 3 weeks ago john.engelman@target.com Close input streams. Closes #364
f712cc8 3 weeks ago john.engelman@target.com Upgrade ASM to 6.1.1 to address perf issues. Closes #374
2f94b2b 3 weeks ago john.engelman@target.com next version
23bbf3d 7 weeks ago john.r.engelman@gmail.com Add some gradle versions. Update changelog for 2.0.3
7435c74 7 weeks ago john.r.engelman@gmail.com Merge pull request #367 from ttsiebzehntt/366-java10
325c002 7 weeks ago info@martinsadowski.de Update ASM to 6.1
94550e5 3 months ago john.r.engelman@gmail.com Merge pull request #356 from sgnewson/update-file-to-files
66b691e 4 months ago john.r.engelman@gmail.com Merge pull request #358 from 3flex/patch-1
14761b1 4 months ago 3flex@users.noreply.github.com fix markdown for User Guide URL in issue template
a3f6984 4 months ago newson@synopsys.com update inputs.file to inputs.files, to remove warning
```

closes #30389

* Improove comment as suggested
ClaudioMFreitas pushed a commit to ClaudioMFreitas/elasticsearch-1 that referenced this pull request Nov 12, 2019
Moving to https for instructions
costin pushed a commit that referenced this pull request Dec 6, 2022
🤖 ESQL: Merge upstream
breskeby pushed a commit to breskeby/elasticsearch that referenced this pull request Feb 11, 2026
This change provides an implementation for deleting local Lucene files on 
indexing shards . This way the disk space required for indexing 
remains as small as possible, allowing in the future to fit "one more shard" 
on demand.

The implementation is based on 3 changes:
- IndexDirectory can be notified when a commit is successfully uploaded to 
the object store. This is required to mark local files as "uploaded". Once 
marked as uploaded commit files can be accessed through the shared 
cache (similarly to what is done on search shards) and can be deleted from 
disk if nothing is actively reading from them.
- IndexDirectory can track the Lucene files created for new segments and 
maintains some RefCounted object for every file (see LocalFileRef), which 
can be checked to know if the file can be read locally from disk or if it must 
be read from the shared cache.
- ReopeningIndexInput can read from a regular IndexInput and checks 
before every operation if the corresponding LocalFileRef object allows the 
operation. If not, it means that the file is/will be deleted from disk, and 
ReopeningIndexInput reopens a new IndexInput from the shared cache 
and restores the position/state. Otherwise it reads/clones/slices from the 
original IndexInput.

Local Lucene files are marked as uploaded (and become candidates for 
deletion) as soon as the commit is uploaded to the object store.

The impact of this change on indexing is OK for append-only use cases but 
is very impacting for update use cases. We will work on improving this in 
the future.
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants