|
1 | 1 | # Tag Fields |
2 | 2 |
|
3 | | -RediSearch 0.91 adds a new kind of field - the Tag field. They are similar to full-text fields but use simpler tokenization and encoding in the index. The values in these fields cannot be accessed by general field-less search and can be used only with a special syntax. |
| 3 | +Tag fields are similar to full-text fields but use simpler tokenization and encoding in the index. The values in these fields cannot be accessed by general field-less search and can be used only with a special syntax. |
4 | 4 |
|
5 | 5 | The main differences between tag and full-text fields are: |
6 | 6 |
|
7 | 7 | 1. We do not perform stemming on tag indexes. |
8 | 8 |
|
9 | | -2. The tokenization is simpler: The user can determine a separator (defaults to a comma) for multiple tags, and we only do whitespace trimming at the end of tags. Thus, tags can contain spaces, punctuation marks, accents, etc. The only two transformations we perform are lower-casing (for latin languages only as of now), and whitespace trimming. |
| 9 | +2. The tokenization is simpler: The user can determine a separator (defaults to a comma) for multiple tags, and we only do whitespace trimming at the end of tags. Thus, tags can contain spaces, punctuation marks, accents, etc. |
10 | 10 |
|
11 | | -3. Tags cannot be found from a general full-text search. If a document has a field called "tags" with the values "foo" and "bar", searching for foo or bar without a special tag modifier (see below) will not return this document. |
| 11 | +3. The only two transformations we perform are lower-casing (for Latin languages only as of now) and whitespace trimming. Lower-casing can be disabled by passing CASESENSITIVE. |
12 | 12 |
|
13 | | -4. The index is much simpler and more compressed: We do not store frequencies, offset vectors of field flags. The index contains only document IDs encoded as deltas. This means that an entry in a tag index is usually one or two bytes long. This makes them very memory efficient and fast. |
| 13 | +4. Tags cannot be found from a general full-text search. If a document has a field called "tags" with the values "foo" and "bar", searching for foo or bar without a special tag modifier (see below) will not return this document. |
14 | 14 |
|
15 | | -5. An unlimited number of tag fields can be created per index, as long as the overall number of fields is under 1024. |
| 15 | +5. The index is much simpler and more compressed: we do not store frequencies, offset vectors, or field flags. The index contains only document IDs encoded as deltas, so an entry in a tag index is usually one or two bytes long. This makes tag indexes very memory efficient and fast. |
| 16 | + |
| 17 | +6. An unlimited number of tag fields can be created per index, as long as the overall number of fields is under 1024. |
16 | 18 |
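
The tokenization rules listed above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not RediSearch's actual implementation: the function name `tokenize_tags` is hypothetical, skipping empty tags is an assumption, and Python's `str.lower()` lower-cases more than Latin letters, which is a simplification.

```python
def tokenize_tags(value, separator=",", case_sensitive=False):
    """Split a raw field value into tags (illustrative sketch only)."""
    tags = []
    for raw in value.split(separator):
        # Only surrounding whitespace is trimmed; inner spaces and
        # punctuation stay part of the tag. No stemming is applied.
        tag = raw.strip()
        if not tag:
            continue  # assumption: empty tags are skipped
        # Lower-case unless the field was declared CASESENSITIVE.
        tags.append(tag if case_sensitive else tag.lower())
    return tags


print(tokenize_tags("Hello World, foo-bar ,BAZ"))
# ['hello world', 'foo-bar', 'baz']
print(tokenize_tags("Hello;World", separator=";", case_sensitive=True))
# ['Hello', 'World']
```

Note that "Hello World" stays a single tag: spaces inside a tag are preserved, and only the separator splits values.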
|
17 | 19 | ## Creating a tag field |
18 | 20 |
|
19 | 21 | Tag fields can be added to the schema in FT.CREATE with the following syntax: |
20 | 22 |
|
21 | 23 | ``` |
22 | | -FT.CREATE ... SCHEMA ... {field_name} TAG [SEPARATOR {sep}] |
| 24 | +FT.CREATE ... SCHEMA ... {field_name} TAG [SEPARATOR {sep}] [CASESENSITIVE] |
23 | 25 | ``` |
24 | 26 |
|
25 | 27 | SEPARATOR defaults to a comma (`,`), and can be any printable ASCII character. For example: |
26 | 28 |
|
| 29 | +Specify CASESENSITIVE to keep the original letter case of the tags. |
| 30 | + |
27 | 31 | ``` |
28 | 32 | FT.CREATE idx ON HASH PREFIX 1 test: SCHEMA tags TAG SEPARATOR ";" |
29 | 33 | ``` |
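
As a rough sketch of how the syntax above fits together, the following Python helper assembles the schema arguments for one tag field. The helper name `tag_field_args` is hypothetical and not part of any client library; it only builds the argument list and does not talk to a server. The single-character check reflects the "any printable ASCII character" rule stated above.

```python
def tag_field_args(name, separator=",", case_sensitive=False):
    """Build the schema arguments for one TAG field (hypothetical helper)."""
    # The separator is described as a single printable ASCII character.
    if len(separator) != 1 or not separator.isascii() or not separator.isprintable():
        raise ValueError("separator must be one printable ASCII character")
    args = [name, "TAG", "SEPARATOR", separator]
    if case_sensitive:
        args.append("CASESENSITIVE")
    return args


command = ["FT.CREATE", "idx", "ON", "HASH", "PREFIX", "1", "test:",
           "SCHEMA", *tag_field_args("tags", ";", case_sensitive=True)]
print(command)
```

Passing the command as a list of arguments, rather than as one string, avoids having to quote the separator.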
|