Rewrote the multi-match docs by clintongormley · Pull Request #2 · s1monw/elasticsearch

clintongormley · 2014-02-04T15:16:02Z

No description provided.

This change adds a new "filter_path" parameter that can be used to filter and reduce the responses returned by the REST API of elasticsearch.

For example, returning only the shards that failed to be optimized:

```
curl -XPOST 'localhost:9200/beer/_optimize?filter_path=_shards.failed'
{"_shards":{"failed":0}}
```

It supports multiple filters (separated by a comma):

```
curl -XGET 'localhost:9200/_mapping?pretty&filter_path=*.mappings.*.properties.name,*.mappings.*.properties.title'
```

It also supports the YAML response format. Here it returns only the `_id` field of a newly indexed document:

```
curl -XPOST 'localhost:9200/library/book?filter_path=_id' -d '---hello:\n world: 1\n'
---
_id: "AU0j64-b-stVfkvus5-A"
```

It also supports wildcards. Here it returns only the host name of every node in the cluster:

```
curl -XGET 'http://localhost:9200/_nodes/stats?filter_path=nodes.*.host*'
{"nodes":{"lvJHed8uQQu4brS-SXKsNA":{"host":"portable"}}}
```

And "**" can be used to include sub fields without knowing the exact path. Here it returns only the Lucene version of every segment:

```
curl 'http://localhost:9200/_segments?pretty&filter_path=indices.**.version'
{
  "indices" : {
    "beer" : {
      "shards" : {
        "0" : [ {
          "segments" : {
            "_0" : {
              "version" : "5.2.0"
            },
            "_1" : {
              "version" : "5.2.0"
            }
          }
        } ]
      }
    }
  }
}
```

Note that elasticsearch sometimes returns the raw value of a field directly, like the _source field. If you want to filter _source fields, you should consider combining the already existing _source parameter (see Get API for more details) with the filter_path parameter like this:

```
curl -XGET 'localhost:9200/_search?pretty&filter_path=hits.hits._source&_source=title'
{
  "hits" : {
    "hits" : [ {
      "_source":{"title":"Book #2"}
    }, {
      "_source":{"title":"Book #1"}
    }, {
      "_source":{"title":"Book #3"}
    } ]
  }
}
```
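The wildcard rules described above can be illustrated with a small matcher over dotted paths. This is only a sketch, not the implementation from the change: the class `FilterPathSketch`, its methods, and the idea of flattening a response field into a dotted path are all hypothetical, and it assumes that `*` matches within a single path segment while `**` matches one or more whole segments.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: translates a filter_path expression into a regex
// and tests it against a dotted field path. Not the elasticsearch code.
final class FilterPathSketch {

    // Assumption: "*" stays inside one segment, "**" spans one or more segments.
    static Pattern compile(String filter) {
        StringBuilder regex = new StringBuilder();
        String[] segments = filter.split("\\.");
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                regex.append("\\.");
            }
            if (segments[i].equals("**")) {
                regex.append(".+");
            } else {
                // Quote the literal parts and turn each "*" into "any characters except a dot".
                String[] parts = segments[i].split("\\*", -1);
                for (int j = 0; j < parts.length; j++) {
                    if (j > 0) {
                        regex.append("[^.]*");
                    }
                    regex.append(Pattern.quote(parts[j]));
                }
            }
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matches(String filter, String path) {
        return compile(filter).matcher(path).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("_shards.failed", "_shards.failed"));
        System.out.println(matches("nodes.*.host*", "nodes.lvJHed8uQQu4brS-SXKsNA.host"));
        System.out.println(matches("indices.**.version", "indices.beer.shards.0.segments._0.version"));
        System.out.println(matches("_shards.failed", "_shards.total"));
    }
}
```

A comma-separated `filter_path` value could then be handled by splitting on commas and keeping a field if any of the resulting filters matches.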

Due to refactoring in 0.21.x we have to update this plugin. Closes #2.

The GCE discovery can also filter machines to include in the cluster based on tags using the `discovery.gce.tags` setting. For example, setting `discovery.gce.tags` to `dev` will only include instances that have a tag set to `dev`. When several tags are set, an instance must have all of those tags to be included.

One practical use for tag filtering is when a GCE cluster contains many nodes that are not running elasticsearch. In this case (particularly with high ping_timeout values) there is a risk that a new node's discovery phase will end before it has found the cluster (which will result in it declaring itself master of a new cluster with the same name - highly undesirable). Adding a tag on elasticsearch GCE nodes and then filtering by that tag will resolve this issue.

Add your tag when building the new instance:

```sh
gcutil --project=es-cloud addinstance myesnode1 --service_account_scope=compute-rw --persistent_boot_disk \
       --tags=elasticsearch,dev
```

Then, define it in `elasticsearch.yml`:

```yaml
cloud:
  gce:
    project_id: es-cloud
    zone: europe-west1-a
discovery:
  type: gce
  gce:
    tags: elasticsearch, dev
```

Closes #2.
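The "all tags must be set" rule can be pictured as a subset check. The following is a minimal sketch under that reading; the class `GceTagFilterSketch` and the method `isCandidate` are hypothetical names for illustration and are not part of the plugin.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of tag filtering: an instance is kept only if it
// carries every tag listed in discovery.gce.tags.
final class GceTagFilterSketch {

    static boolean isCandidate(Set<String> instanceTags, List<String> requiredTags) {
        return instanceTags.containsAll(requiredTags);
    }

    public static void main(String[] args) {
        List<String> required = Arrays.asList("elasticsearch", "dev");
        Set<String> esNode = new HashSet<>(Arrays.asList("elasticsearch", "dev", "europe"));
        Set<String> otherNode = new HashSet<>(Arrays.asList("dev"));

        System.out.println(isCandidate(esNode, required));    // carries both tags
        System.out.println(isCandidate(otherNode, required)); // misses "elasticsearch"
    }
}
```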

Move tests to elasticsearch test framework. In addition to this, we want to refactor some package names to prepare for the upcoming snapshot/restore feature (see #2). Closes #3.

elasticsearch 1.0 will provide a new feature named `Snapshot & Restore`. We want to add support for [Azure Storage](http://www.windowsazure.com/en-us/documentation/services/storage/).

To enable Azure repositories, you first have to set your azure storage settings:

```yaml
cloud:
    azure:
        storage_account: your_azure_storage_account
        storage_key: your_azure_storage_key
```

The Azure repository supports the following settings:

* `container`: Container name. Defaults to `elasticsearch-snapshots`
* `base_path`: Specifies the path within the container to the repository data. Defaults to empty (root directory).
* `concurrent_streams`: Throttles the number of streams (per node) performing the snapshot operation. Defaults to `5`.
* `chunk_size`: Big files can be broken down into chunks during snapshotting if needed. The chunk size can be specified in bytes or by using size value notation, e.g. `1g`, `10m`, `5k`. Defaults to `64m` (64m max)
* `compress`: When set to `true` metadata files are stored in compressed format. This setting doesn't affect index files that are already compressed by default. Defaults to `false`.

Some examples, using scripts:

```sh
$ curl -XPUT 'http://localhost:9200/_snapshot/my_backup1' -d '{
    "type": "azure"
}'

$ curl -XPUT 'http://localhost:9200/_snapshot/my_backup2' -d '{
    "type": "azure",
    "settings": {
        "container": "backup_container",
        "base_path": "backups",
        "concurrent_streams": 2,
        "chunk_size": "32m",
        "compress": true
    }
}'
```

Example using Java:

```java
client.admin().cluster().preparePutRepository("my_backup3")
        .setType("azure").setSettings(ImmutableSettings.settingsBuilder()
                .put(AzureStorageService.Fields.CONTAINER, "backup_container")
                .put(AzureStorageService.Fields.CHUNK_SIZE, new ByteSizeValue(32, ByteSizeUnit.MB))
        ).get();
```

Closes #2.

...as it is in the main elasticsearch pom.xml. This is useful for people who want to use slf4j/logback instead of log4j. Closes #2.

Closes #2.

It sounds like Jython 2.5.3 is leaking some threads. Jython 2.5.4.rc1 has the same issue. Jython 2.7-b3 fixes it.

Typical error when running tests:

```
ERROR 0.00s J2 | PythonScriptEngineTests (suite) <<<
> Throwable #1: com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.elasticsearch.script.python.PythonScriptEngineTests:
> 1) Thread[id=12, name=org.python.google.common.base.internal.Finalizer, state=WAITING, group=TGRP-PythonScriptEngineTests]
> at java.lang.Object.wait(Native Method)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
> at org.python.google.common.base.internal.Finalizer.run(Finalizer.java:127)
> at __randomizedtesting.SeedInfo.seed([7A5ECFD8D0474383]:0)
> Throwable #2: com.carrotsearch.randomizedtesting.ThreadLeakError: There are still zombie threads that couldn't be terminated:
> 1) Thread[id=12, name=org.python.google.common.base.internal.Finalizer, state=WAITING, group=TGRP-PythonScriptEngineTests]
> at java.lang.Object.wait(Native Method)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
> at org.python.google.common.base.internal.Finalizer.run(Finalizer.java:127)
> at __randomizedtesting.SeedInfo.seed([7A5ECFD8D0474383]:0)
```

Closes elastic#22.

…point into lucene (elastic#25827)

When a replica processes out of order operations, it can drop some due to version comparisons. In the past that would have resulted in a VersionConflictException being thrown and the operation was totally ignored. With the seq# push, we started storing these operations in the translog (but not indexing them into lucene) in order to have complete op histories to facilitate ops based recoveries. This in turn had the undesired effect that deleted docs may be resurrected during recovery in some extreme edge situations (see a complete explanation below). This PR contains a simple fix, which is also an optimization for the recovery process: incoming operations that have a seq# lower than the current local checkpoint (i.e., have already been processed) should not be indexed into lucene. Note that sometimes we can also skip storing them in the translog, but this is not required for the fix and is more complicated.

This is the equivalent of elastic#25592

## More details on resurrected ops

Consider two operations:

- Index d1, seq no 1
- Delete d1, seq no 3

On a replica they come out of order:

- Translog gen 1 contains:
  - delete (seqNo 3)
- Translog gen 2 contains:
  - index (seqNo 1) (wasn't indexed into lucene, but put into the translog)
  - another operation (seqNo 10)
- Translog gen 3
  - another op (seqNo 9)
- Engine commits with:
  - local checkpoint 9
  - refers to gen 2

If this replica becomes a primary:

- Local recovery will replay translog gen 2 and up, causing index #1 to be re-indexed.
- Even if recovery starts at gen 3, the translog retention policy will cause file based recovery to replay the entire translog. If it happens to start at gen 2 (but not 1), we will run into the same problem.

#### Some context - out of order delivery involving deletes:

In normal operation, this relies on the gc_deletes setting. We assume that the setting represents an upper bound on the time between the index and the delete operation. The index operation will be detected as stale based on the tombstone map in the LiveVersionMap.

Recovery presents a challenge as it can replay an old index operation that was in the translog and override a delete operation that was done when the engine was opened (and is not part of the replayed snapshot). To deal with this situation, we disable GC deletes (i.e. retain all deletes) for the duration of recoveries. This means that the delete operation will be remembered and the index operation ignored.

Both of the above scenarios (local recovery + peer recovery) create a situation where the delete operation is never replayed. It is thus "lost", as lucene doesn't remember it happened and our LiveVersionMap is not populated with it.

#### Solution:

Note that both local and peer recovery represent a scenario where we replay translog ops on top of an existing lucene index, potentially with ongoing indexing. Therefore we can treat them the same.

The local checkpoint in Lucene represents a marker indicating that all operations below it were performed on the index. This is the only form of "memory" that we have that relates to deletes. If we can achieve the following:

1) All ops below the local checkpoint are not indexed to lucene.
2) All ops above the local checkpoint are.

It will mean that all variants are covered: (i# == index op seq#, d# == delete op seq#, lc == local checkpoint in commit)

1) i# < d# <= lc - document is already deleted in lucene and stays that way.
2) i# <= lc < d# - delete is replayed on index - document is deleted
3) lc < i# < d# - index is replayed and then delete - document is deleted.

More formally - we want to make sure that for all ops o1 and o2 that were performed on the primary, if o2 is processed on a shard before o1, o1 will be dropped. We have the following scenarios:

1) If both o1 and o2 are not included in the replayed snapshot and are above it (i.e., have a higher seq#), they fall under the gc deletes assumption.
2) If o1 is part of the replayed snapshot but o2 is above it:
   - if o2 arrives first, o1 must arrive due to the recovery and potentially via replication as well. Since gc deletes is disabled we are guaranteed to know of o2's existence.
3) If both o2 and o1 are part of the replayed snapshot:
   - we fall under the same scenarios as #2 - disabling GC deletes ensures we know of o2 if it arrives first.
4) If o1 falls before the snapshot and o2 is either part of the snapshot or higher:
   - Since the snapshot is guaranteed to contain all ops that are not part of lucene and are above the lc in the commit used, this means that o1 is part of lucene and o1 < local checkpoint. This means it won't be processed and we're not in the scenario we're discussing.
5) If o2 falls before the snapshot but o1 is part of it:
   - by the same reasoning above, o2 is < local checkpoint. Since o1 < o2, we also get o1 < local checkpoint and this will be dropped.

#### Implementation:

For local recovery, we can filter the ops we read off the translog and avoid replaying them. For peer recovery this is tricky as we do want to send the operations in order to have some history on the target shard. Filtering operations on the engine level (i.e., not indexing to lucene if op seq# <= lc) would work for both.
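The engine-level rule (do not index into lucene if op seq# <= lc) can be sketched against the resurrection example above. This is an illustration only, not the engine code: the class `SeqNoReplaySketch`, the `Op` type, and the use of a set of doc ids to stand in for the lucene index are hypothetical, and the sketch ignores the per-document version checks that still apply to ops above the checkpoint.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: replays translog ops on top of a committed "index"
// (a set of live doc ids), optionally skipping ops at or below the checkpoint.
final class SeqNoReplaySketch {

    /** A replayed translog operation; the shape is hypothetical and kept minimal. */
    static final class Op {
        final boolean isDelete;
        final String docId;
        final long seqNo;

        Op(boolean isDelete, String docId, long seqNo) {
            this.isDelete = isDelete;
            this.docId = docId;
            this.seqNo = seqNo;
        }

        static Op index(String docId, long seqNo) {
            return new Op(false, docId, seqNo);
        }

        static Op delete(String docId, long seqNo) {
            return new Op(true, docId, seqNo);
        }
    }

    // The rule from the text: ops with seq# <= local checkpoint were already processed.
    static boolean shouldIndexIntoLucene(long seqNo, long localCheckpoint) {
        return seqNo > localCheckpoint;
    }

    static Set<String> replay(Set<String> committedDocs, long localCheckpoint,
                              List<Op> ops, boolean filterBySeqNo) {
        Set<String> docs = new HashSet<>(committedDocs);
        for (Op op : ops) {
            if (filterBySeqNo && !shouldIndexIntoLucene(op.seqNo, localCheckpoint)) {
                continue; // already reflected in the commit, skip it
            }
            if (op.isDelete) {
                docs.remove(op.docId);
            } else {
                docs.add(op.docId);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        // The commit has local checkpoint 9; d1 was deleted (seqNo 3) before the commit.
        Set<String> committed = new HashSet<>();
        // Translog gen 2 and up: the stale index of d1 (seqNo 1) and a later op, modelled
        // here as an index of a made-up doc d2 (seqNo 10).
        List<Op> replayed = Arrays.asList(Op.index("d1", 1), Op.index("d2", 10));

        System.out.println(replay(committed, 9, replayed, false)); // d1 is resurrected
        System.out.println(replay(committed, 9, replayed, true));  // d1 stays deleted
    }
}
```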

In elastic#28350, we fixed an endless flushing loop which may happen on replicas by tightening the relation between the flush action and the periodically flush condition.

1. The periodically flush condition is enabled only if it is disabled after a flush.
2. If the periodically flush condition is enabled then a flush will actually happen regardless of Lucene state.

(1) and (2) guarantee that a flushing loop will be terminated. Sadly, condition (1) can be violated in edge cases as we used two different algorithms to evaluate the current and future uncommitted translog size.

- We use method `uncommittedSizeInBytes` to calculate the current uncommitted size. It is the sum of translogs whose generation is at least the minGen (determined by a given seqno). We pick a continuous range of translogs starting from the minGen to evaluate the current uncommitted size.
- We use method `sizeOfGensAboveSeqNoInBytes` to calculate the future uncommitted size. It is the sum of translogs whose maxSeqNo is at least the given seqNo. Here we don't pick a range but select translogs one by one.

Suppose we have 3 translogs `gen1={#1,#2}, gen2={}, gen3={#3}` and `seqno=#1`: `uncommittedSizeInBytes` is the sum of gen1, gen2, and gen3, while `sizeOfGensAboveSeqNoInBytes` is the sum of gen1 and gen3. Gen2 is excluded because its maxSeqNo is still -1.

This commit removes both the `sizeOfGensAboveSeqNoInBytes` and `uncommittedSizeInBytes` methods, then enforces an engine to use only the `sizeInBytesByMinGen` method to evaluate the periodically flush condition.

Closes elastic#29097
Relates elastic#28350
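The disagreement between the two size calculations can be reproduced with the three-generation example from the message. This is a sketch under stated assumptions rather than the translog code: the class `TranslogSizeSketch`, the `Gen` type, the way minGen is derived from a seqno, and the byte sizes (100, 50, 100) are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch contrasting a contiguous-range sum with a per-generation pick.
final class TranslogSizeSketch {

    /** One translog generation; maxSeqNo is -1 while the generation holds no ops. */
    static final class Gen {
        final long generation;
        final long maxSeqNo;
        final long sizeInBytes;

        Gen(long generation, long maxSeqNo, long sizeInBytes) {
            this.generation = generation;
            this.maxSeqNo = maxSeqNo;
            this.sizeInBytes = sizeInBytes;
        }
    }

    // Assumption: minGen is the lowest generation that still holds an op at or above seqNo.
    static long minGenForSeqNo(List<Gen> gens, long seqNo) {
        long minGen = Long.MAX_VALUE;
        for (Gen gen : gens) {
            if (gen.maxSeqNo >= seqNo) {
                minGen = Math.min(minGen, gen.generation);
            }
        }
        return minGen;
    }

    // Range-based: every generation from minGen onwards counts, empty ones included.
    static long sizeFromMinGen(List<Gen> gens, long minGen) {
        long total = 0;
        for (Gen gen : gens) {
            if (gen.generation >= minGen) {
                total += gen.sizeInBytes;
            }
        }
        return total;
    }

    // Pick-based: only generations whose maxSeqNo is at least seqNo count.
    static long sizeOfGensAboveSeqNo(List<Gen> gens, long seqNo) {
        long total = 0;
        for (Gen gen : gens) {
            if (gen.maxSeqNo >= seqNo) {
                total += gen.sizeInBytes;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // gen1={#1,#2}, gen2={}, gen3={#3}; the sizes are invented for the example.
        List<Gen> gens = Arrays.asList(new Gen(1, 2, 100), new Gen(2, -1, 50), new Gen(3, 3, 100));
        long seqNo = 1;

        System.out.println(sizeFromMinGen(gens, minGenForSeqNo(gens, seqNo))); // gen1 + gen2 + gen3
        System.out.println(sizeOfGensAboveSeqNo(gens, seqNo));                 // gen1 + gen3
    }
}
```

With these made-up sizes the two results differ by the size of gen2, which is the kind of mismatch that lets the flush condition stay enabled after a flush.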

Rewrote the multi-match docs

fcae40d

s1monw merged this pull request into s1monw:issues/44 Feb 4, 2014

clintongormley deleted the issues/44 branch June 6, 2014 16:02

s1monw pushed a commit that referenced this pull request Jun 5, 2015

Move to Elasticsearch 0.21.0.Beta1

d080a75

Due to refactoring in 0.21.x we have to update this plugin. Closes #2.

s1monw pushed a commit that referenced this pull request Jun 5, 2015

Move to Elasticsearch 0.21.0.Beta1

52ca251

Due to refactoring in 0.21.x we have to update this plugin. Closes #2.

s1monw pushed a commit that referenced this pull request Jun 5, 2015

Move tests to elasticsearch test framework

316a141

Move tests to elasticsearch test framework. In addition to this, we want to refactor some package names to prepare for the upcoming snapshot/restore feature (see #2). Closes #3.

s1monw pushed a commit that referenced this pull request Jun 10, 2015

Make log4j an optional dependency

aa68294

...as it is in the main elasticsearch pom.xml. This is useful for people who want to use slf4j/logback instead of log4j. Closes #2.

s1monw pushed a commit that referenced this pull request Jun 10, 2015

Add documentation

97b63f2

Closes #2.

s1monw merged 1 commit into s1monw:issues/44 from clintongormley:issues/44
