[DOCS] Reformat apostrophe token filter docs #48076

jrodewig merged 5 commits into elastic:master from jrodewig:reformat.apos-token-filter
Conversation
Pinging @elastic/es-search (:Search/Analysis)

Pinging @elastic/es-docs (>docs)
The `apostrophe` token filter produces the following tokens:

[source,text]
---------------------------
[ Istanbul, veya, Istanbul ]
---------------------------

/////////////////////
[source,console-result]
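To show how those tokens are produced, a request along these lines (a sketch in the same console style the docs use; the sample text is an assumption chosen to match the expected tokens) would exercise the filter through the _analyze API:

```
GET /_analyze
{
  "tokenizer" : "standard",
  "filter"    : ["apostrophe"],
  "text"      : "Istanbul'a veya Istanbul'dan"
}
```

The `apostrophe` filter strips the apostrophe and everything after it from each token, so `Istanbul'a` and `Istanbul'dan` both become `Istanbul`, giving `[ Istanbul, veya, Istanbul ]`.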
I would love any feedback here.
The token example is based on the one from the simple analyzer:
https://www.elastic.co/guide/en/elasticsearch/reference/master/analysis-simple-analyzer.html#_example_7
From my understanding, this token filter was built to be used with Turkish. For clarity, it could be worth mentioning that explicitly, perhaps linking to the Turkish analyzer (which makes use of this filter).
That's a great idea. I've added this information to the description at the top with 04a33a6.
jtibshirani left a comment:
The overall format seems like a nice improvement to me. It's great to give an example of how the text gets analyzed, and I think using the _analyze API will help more users become aware of this helpful endpoint.
One thought about the documentation format -- we could consider linking to the Lucene documentation when it exists, as it often contains more detailed information or paper references. This also gives more clarity around where the implementation lives, so users know where to go to dig into code or file a bug.
"settings" : {
  "analysis" : {
    "analyzer" : {
      "default" : {
Small comment: maybe we don't want to always call the analyzer `default`. We could use a more specific name like `apostrophe` or `standard_apostrophe`.
I changed the analyzer name to `standard_apostrophe` with 04a33a6.
To add to my comment: when experimenting with token filters, it seems more likely that the user will want to create a custom analyzer as opposed to overriding the default one. Maybe we could make a few small tweaks to clarify this, including using a specific analyzer name other than `default` and linking to the custom analyzer docs.
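A create index request along these lines illustrates that suggestion (a sketch; the index name `apostrophe_example` is illustrative, and the analyzer name mirrors the `standard_apostrophe` name adopted in 04a33a6) by defining a custom analyzer rather than overriding `default`:

```
PUT /apostrophe_example
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "standard_apostrophe" : {
          "tokenizer" : "standard",
          "filter"    : ["apostrophe"]
        }
      }
    }
  }
}
```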
Oops, I had a race condition with my comment. This looks good to me, we could also link to the custom analyzer docs if you think it's helpful.
Thanks for your feedback @jtibshirani. I made a few changes based on comments with 04a33a6. This includes linking to Lucene's docs, which is a good idea!
==== Add to an analyzer

The following <<indices-create-index,create index API>> request adds the
apostrophe token filter to an analyzer.
One last small suggestion -- instead of 'adding to an analyzer' it could be clearer/more precise to say 'uses the token filter to configure a new analyzer'.
Thanks for the clarification! Reworded with a847eeb.
jtibshirani left a comment:
This looks good to me.
Reformats the `apostrophe` token filter docs. I hope to re-use this format for other token filter docs. All feedback is welcome!