DNS lookup AUTO settings doesn't fallback to V4 when a single CNAME entry is returned #2634
Description
Title: DNS lookup AUTO settings doesn't fallback to V4 when a single CNAME entry is returned
Description:
I am just getting started with envoy, so I might be wrong here and missing something.
When a cluster has a host address whose IPv6 lookup returns only a single CNAME entry, Envoy does not fall back to an IPv4 lookup. From reading dns_impl.cc, it looks like Envoy assumes that whenever getHostByName returns a success code (ARES_SUCCESS), an IP address was returned. This is not necessarily the case: a DNS response may contain only a CNAME entry with no IP. For example, this happens to me with s3.amazonaws.com. Setting V4_ONLY solves the problem. Here is the dig output for IPv6 and IPv4:
$ dig s3.amazonaws.com AAAA
; <<>> DiG 9.10.3-P4-Ubuntu <<>> s3.amazonaws.com AAAA
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34567
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; MBZ: 0005 , udp: 4096
;; QUESTION SECTION:
;s3.amazonaws.com. IN AAAA
;; ANSWER SECTION:
s3.amazonaws.com. 5 IN CNAME s3-1.amazonaws.com.
;; AUTHORITY SECTION:
s3-1.amazonaws.com. 5 IN SOA ns-1726.awsdns-23.co.uk. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86
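The fallback behavior this report asks for can be sketched roughly as follows. This is only an illustration of the idea, not Envoy's actual code: `Family`, `LookupResult`, `LookupFn`, and `resolveAuto` are hypothetical names, and the resolver is abstracted behind a callable so the logic can be read without c-ares. The point is that a "successful" IPv6 lookup with an empty address list (as in the CNAME-only answer above) is treated the same as a failed one.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Envoy's types.
enum class Family { V4, V6 };

struct LookupResult {
  bool success;                        // resolver reported success (cf. ARES_SUCCESS)
  std::vector<std::string> addresses;  // may be empty for a CNAME-only answer
};

using LookupFn = std::function<LookupResult(const std::string&, Family)>;

// AUTO-style resolution: try IPv6 first, then fall back to IPv4 when the IPv6
// lookup either fails or succeeds without yielding any address.
std::vector<std::string> resolveAuto(const std::string& name, const LookupFn& lookup) {
  LookupResult v6 = lookup(name, Family::V6);
  if (v6.success && !v6.addresses.empty()) {
    return v6.addresses;
  }
  LookupResult v4 = lookup(name, Family::V4);
  if (v4.success) {
    return v4.addresses;
  }
  return {};  // neither family produced a usable address
}
```

With a lookup function that reports success but no addresses for IPv6, `resolveAuto` would return whatever the IPv4 lookup yields instead of an empty host list.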
Repro steps:
Use the following envoy.yaml file:
admin:
  access_log_path: "admin_access.log"
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        config:
          codec_type: auto
          idle_timeout: 300s
          access_log:
          - name: envoy.file_access_log
            config:
              path: "egress_http.log"
          stat_prefix: egress_http
          http_protocol_options:
            allow_absolute_url: true
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: service_s3
          http_filters:
          - name: envoy.router
            config: {}
  clusters:
  - name: service_s3
    type: logical_dns
    lb_policy: round_robin
    connect_timeout: 1s
    http_protocol_options: {}
    # dns_lookup_family: V4_ONLY
    hosts:
    - socket_address:
        address: s3.amazonaws.com
        port_value: 443
    tls_context:
      sni: "s3.amazonaws.com"
Run Envoy, then use curl to send a simple request to S3 through the proxy:
curl -v --proxy http://localhost:10000 http://s3.amazonaws.com/my-test-bucket22/
* Trying ::1...
* connect to ::1 port 10000 failed: Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 10000 (#0)
> GET http://s3.amazonaws.com/my-test-bucket22/ HTTP/1.1
> Host: s3.amazonaws.com
> User-Agent: curl/7.47.0
> Accept: */*
> Proxy-Connection: Keep-Alive
>
< HTTP/1.1 503 Service Unavailable
< content-length: 19
< content-type: text/plain
< date: Fri, 16 Feb 2018 15:41:24 GMT
< server: envoy
<
* Connection #0 to host localhost left intact
no healthy upstream
Workaround: uncomment the line dns_lookup_family: V4_ONLY in the YAML file. The same curl request then succeeds:
curl -v --proxy http://localhost:10000 http://s3.amazonaws.com/my-test-bucket22/
* Trying ::1...
* connect to ::1 port 10000 failed: Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 10000 (#0)
> GET http://s3.amazonaws.com/my-test-bucket22/ HTTP/1.1
> Host: s3.amazonaws.com
> User-Agent: curl/7.47.0
> Accept: */*
> Proxy-Connection: Keep-Alive
>
< HTTP/1.1 200 OK
< x-amz-id-2: fnwlOvun/jf7YAeI7eMlYm50XlhMoeXzeYu3tfCXY1SbZGvFChRrD+zo1vDFS3s0eoLqhyl9a64=
< x-amz-request-id: 302CF4548DD2CF9A
< date: Fri, 16 Feb 2018 15:46:58 GMT
< x-amz-bucket-region: us-east-1
< content-type: application/xml
< server: envoy
< x-envoy-upstream-service-time: 184
< transfer-encoding: chunked
<
<?xml version="1.0" encoding="UTF-8"?>
* Connection #0 to host localhost left intact
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>my-test-bucket22</Name><Prefix></Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><IsTruncated>false</IsTruncated><Contents><Key>test.txt</Key><LastModified>2018-02-16T15:23:08.000Z</LastModified><ETag>"0b26e313ed4a7ca6904b0e9369e5b957"</ETag><Size>19</Size><StorageClass>STANDARD</StorageClass></Contents></ListBucketResult>
Would be happy to submit a PR to fix this. Just want to be sure I am not missing something basic before moving forward.