Recently, we moved from version 0.19.11 to the latest stable release, 0.90.0, and found some very strange behavior that looks like an issue.
If we specify a custom index_analyzer and search_analyzer (or just analyzer) at the root level of the item mapping, the index_analyzer is applied but the search_analyzer is not. Also, if we update the existing mapping and set search_analyzer explicitly at the field level, it still has no effect and ES uses the standard analyzer.
To reproduce:
Create a new index and define our custom analyzer de_stem:
curl -XPUT 'http://localhost:9200/issue/' -d '{"index": {"number_of_shards": 1,"analysis": {"filter": {"de_snowball": {"type": "snowball","language": "German"}},"analyzer": {"de_stem": {"type": "custom","tokenizer": "standard","filter": ["lowercase", "de_snowball"]}}},"number_of_replicas": 0}}'
Put a mapping that specifies index_analyzer and search_analyzer:
curl -XPUT 'http://localhost:9200/issue/item/_mapping' -d '{"item": {"index_analyzer" : "de_stem","search_analyzer" : "de_stem","properties": {"content": {"dynamic": false,"properties": {"body": {"type": "string"}}}}}}'
Check the search_analyzer for the field content.body with the Analyze API:
curl -XGET 'localhost:9200/issue/_analyze?pretty=true&field=content.body' -d 'Apple'
Actual result:
{
"tokens" : [ {
"token" : "apple",
"start_offset" : 0,
"end_offset" : 5,
"type" : "<ALPHANUM>",
"position" : 1
} ]
}
Expected result:
{
"tokens" : [ {
"token" : "appl",
"start_offset" : 0,
"end_offset" : 5,
"type" : "<ALPHANUM>",
"position" : 1
} ]
}
The expected result can still be obtained, but only by specifying the analyzer explicitly:
curl -XGET 'localhost:9200/issue/_analyze?pretty=true&field=content.body&analyzer=de_stem' -d 'Apple'
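To make this comparison repeatable, here is a minimal shell sketch. The helper names (extract_tokens, tokens_for_field, tokens_for_analyzer) and the ES_URL variable are made up for illustration; the sketch assumes the same index and host as the commands above and a grep that supports -o.

```shell
# Assumption: Elasticsearch is reachable at the same address used above.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print one token per line from an Analyze API response read on stdin.
extract_tokens() {
  grep -o '"token" *: *"[^"]*"' | sed 's/^"token" *: *"//; s/"$//'
}

# Analyze TEXT with whatever analyzer the index resolves for FIELD.
# usage: tokens_for_field FIELD TEXT
tokens_for_field() {
  curl -s -XGET "$ES_URL/issue/_analyze?field=$1" -d "$2" | extract_tokens
}

# Analyze TEXT with an explicitly named analyzer.
# usage: tokens_for_analyzer ANALYZER TEXT
tokens_for_analyzer() {
  curl -s -XGET "$ES_URL/issue/_analyze?analyzer=$1" -d "$2" | extract_tokens
}
```

If the mapping were applied as intended, `tokens_for_field content.body Apple` and `tokens_for_analyzer de_stem Apple` should print the same token; in the behavior reported here they differ.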
Index analyzer is set correctly
As shown above, the search_analyzer does not seem to have been set, but the index_analyzer works as expected.
Let's index a document:
curl -XPUT 'http://localhost:9200/issue/item/1' -d '{"content" : {"body": "10 Things We Hate About Apple"}}'
If index_analyzer was correctly set to de_stem, the word Apple should be indexed as appl rather than apple (which is what the standard analyzer produces).
Let's search for appl first:
curl -XGET 'http://localhost:9200/issue/_search?search_type=count&pretty=true' -d '{"query":{"query_string":{"fields":["content.body"],"query":"appl"}}}'
It works! We get back 1 result:
"hits" : {
"total" : 1,
"max_score" : 0.0,
"hits" : [ ]
}
For the word apple the search finds nothing, which is consistent with the search_analyzer being standard while the index_analyzer is de_stem (the search term stays apple, but the indexed term is appl):
curl -XGET 'http://localhost:9200/issue/_search?search_type=count&pretty=true' -d '{"query":{"query_string":{"fields":["content.body"],"query":"apple"}}}'
"hits" : {
"total" : 0,
"max_score" : 0.0,
"hits" : [ ]
}
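The two searches above can be wrapped in a small helper so the counts for appl and apple are easy to compare. This is only a sketch: extract_total and count_hits are illustrative names, and the term is pasted into the JSON body without escaping, so it only suits simple words like the ones used here.

```shell
# Print the hits.total value from a search response read on stdin.
# Newlines are removed first so pretty-printed responses work too, and the
# pattern is tied to "hits" so the _shards total is not picked up.
extract_total() {
  tr -d '\n' | grep -o '"hits" *: *{ *"total" *: *[0-9][0-9]*' | sed 's/.*: *//'
}

# Count documents whose content.body matches TERM via query_string.
# usage: count_hits TERM   (TERM must not contain quotes or backslashes)
count_hits() {
  curl -s -XGET "${ES_URL:-http://localhost:9200}/issue/_search?search_type=count" \
    -d "{\"query\":{\"query_string\":{\"fields\":[\"content.body\"],\"query\":\"$1\"}}}" \
    | extract_total
}
```

With the document indexed above, the reported behavior corresponds to `count_hits appl` printing 1 and `count_hits apple` printing 0.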
Specifying the search analyzer with the Put Mapping API doesn't help
Next, I tried updating the mapping to specify the search analyzer explicitly for the content.body field on the existing index created above:
curl -XPUT 'http://localhost:9200/issue/item/_mapping' -d '{"item": {"properties": {"content": {"dynamic": false,"properties": {"body": {"type": "string", "search_analyzer": "de_stem"}}}}}}'
The response is ok, but all the problems described above remain. So it seems the search_analyzer for the field content.body is still standard.
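One way to narrow this down might be to read back the mapping that ES actually stored and check whether the field-level setting is present. The sketch below uses the Get Mapping endpoint for the item type. The helper names are illustrative, and how 0.90.0 renders the stored mapping is an assumption: for example, it may collapse identical index and search analyzers into a single analyzer entry, which this check would not catch.

```shell
# Fetch the mapping Elasticsearch currently holds for the item type.
show_mapping() {
  curl -s -XGET "${ES_URL:-http://localhost:9200}/issue/item/_mapping?pretty=true"
}

# Succeed if the mapping JSON on stdin contains a search_analyzer set to NAME.
# usage: show_mapping | has_search_analyzer de_stem
has_search_analyzer() {
  grep -q "\"search_analyzer\" *: *\"$1\""
}
```

For example, `show_mapping | has_search_analyzer de_stem && echo stored || echo missing` would indicate whether the setting was dropped when the mapping was stored or is stored but ignored at search time.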