cloned from elastic/elasticsearch-hadoop#997
Elasticsearch 2.4.4 + repository-hdfs
Snapshots fail if the HDFS repo is not available when Elasticsearch starts, even though the repo was brought up while the Elasticsearch service is running.
Repro:
- Stopped ES
- Stopped Hadoop HDFS
- Started ES while HDFS is down
- Started HDFS
- Attempted to GET /_snapshot/repo and then attempted to PUT /_snapshot/repo
The API command fails with:
{
"error": {
"root_cause": [
{
"type": "repository_missing_exception",
"reason": "[may2017] missing"
}
],
"type": "repository_missing_exception",
"reason": "[may2017] missing"
},
"status": 404
}
Full cluster log is available upon request.
[2017-05-16 19:28:14,286][WARN ][repositories ] [cops-4] failed to create repository [hdfs][may2017]
org.elasticsearch.common.inject.CreationException: Guice creation errors:
1) Error injecting constructor, java.net.ConnectException: Call From node-4.dev.domain.com/10.237.50.1 to node-2.dev.domain.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.elasticsearch.repositories.hdfs.HdfsRepository.<init>(Unknown Source)
while locating org.elasticsearch.repositories.hdfs.HdfsRepository
while locating org.elasticsearch.repositories.Repository
1 error
at org.elasticsearch.common.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:360)
at org.elasticsearch.common.inject.InjectorBuilder.injectDynamically(InjectorBuilder.java:178)
at org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:110)
at org.elasticsearch.common.inject.InjectorImpl.createChildInjector(InjectorImpl.java:154)
at org.elasticsearch.common.inject.ModulesBuilder.createChildInjector(ModulesBuilder.java:55)
at org.elasticsearch.repositories.RepositoriesService.createRepositoryHolder(RepositoriesService.java:404)
at org.elasticsearch.repositories.RepositoriesService.clusterChanged(RepositoriesService.java:299)
at org.elasticsearch.cluster.service.InternalClusterService.runTasksForExecutor(InternalClusterService.java:622)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:784)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Call From node-4.dev.domain.com/10.237.50.1 to node-2.dev.domain.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1480)
at org.apache.hadoop.ipc.Client.call(Client.java:1407)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy22.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy23.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
at org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1450)
at org.elasticsearch.repositories.hdfs.HdfsRepository.doGetFileSystem(HdfsRepository.java:125)
at org.elasticsearch.repositories.hdfs.HdfsRepository.access$000(HdfsRepository.java:57)
at org.elasticsearch.repositories.hdfs.HdfsRepository$2.run(HdfsRepository.java:102)
at org.elasticsearch.repositories.hdfs.HdfsRepository$2.run(HdfsRepository.java:99)
at java.security.AccessController.doPrivileged(Native Method)
at org.elasticsearch.repositories.hdfs.HdfsRepository.getFileSystem(HdfsRepository.java:99)
at org.elasticsearch.repositories.hdfs.SecurityUtils.execute(SecurityUtils.java:34)
at org.elasticsearch.hadoop.hdfs.blobstore.HdfsBlobStore.mkdirs(HdfsBlobStore.java:58)
at org.elasticsearch.hadoop.hdfs.blobstore.HdfsBlobStore.<init>(HdfsBlobStore.java:54)
at org.elasticsearch.repositories.hdfs.HdfsRepository.<init>(HdfsRepository.java:90)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.elasticsearch.common.inject.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:50)
at org.elasticsearch.common.inject.ConstructorInjector.construct(ConstructorInjector.java:86)
at org.elasticsearch.common.inject.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:104)
at org.elasticsearch.common.inject.FactoryProxy.get(FactoryProxy.java:54)
at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:47)
at org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:886)
at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:43)
at org.elasticsearch.common.inject.Scopes$1$1.get(Scopes.java:59)
at org.elasticsearch.common.inject.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:46)
at org.elasticsearch.common.inject.InjectorBuilder$1.call(InjectorBuilder.java:201)
at org.elasticsearch.common.inject.InjectorBuilder$1.call(InjectorBuilder.java:193)
at org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:879)
at org.elasticsearch.common.inject.InjectorBuilder.loadEagerSingletons(InjectorBuilder.java:193)
at org.elasticsearch.common.inject.InjectorBuilder.injectDynamically(InjectorBuilder.java:175)
... 12 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529)
at org.apache.hadoop.ipc.Client.call(Client.java:1446)
... 57 more
Is there any way to try establishing a connection to the repo after Elasticsearch has started?
@imotov @jbaiera
cloned from elastic/elasticsearch-hadoop#997
Elasticsearch 2.4.4 + repository-hdfs
Snapshots fail if the HDFS repo is not available when Elasticsearch starts, even though the repo was brought up while the Elasticsearch service is running.
Repro:
The API command fails with:
Full cluster log is available upon request.
Is there any way to try establishing a connection to the repo after Elasticsearch has started?
@imotov @jbaiera