Support multiple IPs in nginx module #4322

@sepal


Currently the nginx module can only extract a single IP address. If you run nginx behind a proxy, however, you may want to write the X-Forwarded-For header to the logs, which results in lines containing multiple IPs. Here is an example from our log file (with randomized client IPs).

68.75.44.178, 172.68.146.54, 127.0.0.1 - - [15/May/2017:12:16:27 +0200] "GET /sites/default/files/styles/company_profile_cover_crop/public/1500x500_1_10.jpg?itok=RUgim2UQ&sc=297009042628d7de3f0eb50e807d29e4 HTTP/1.1" 200 92763 "https://www.startus.cc/company/finleap" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
221.247.242.171, 162.158.166.51, 127.0.0.1 - - [15/May/2017:12:16:27 +0200] "GET /sites/default/files/styles/company_profile_logo/public/company_logos/aaeaaqaaaaaaaawvaaaajdk3n2vkzme0lte0zjctngy3ms1inmm4lta4ntnhzwqymzvmoq.png?itok=H2B05xX0 HTTP/1.1" 200 9296 "https://www.startus.cc/company/finleap" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
192.228.32.190, 108.162.246.21, 127.0.0.1 - - [15/May/2017:12:16:27 +0200] "GET /jobs/24237/it-back-end HTTP/1.1" 301 5 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
137.56.184.63, 162.158.165.50, 127.0.0.1 - - [15/May/2017:12:16:27 +0200] "GET /sites/default/files/styles/company_profile_cover/public/1500x500_1_10.jpg?itok=1cNqdGYK HTTP/1.1" 200 102268 "https://www.startus.cc/company/finleap" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
92.222.165.172, 162.158.167.202, 127.0.0.1 - - [15/May/2017:12:16:27 +0200] "POST /jstats.php HTTP/1.0" 200 13 "https://www.startus.cc/company/finleap" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"

In our case the first IP is the actual client IP, the second is Cloudflare's, and the localhost address comes from Varnish, which runs on the same host as nginx.
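To make the ordering concrete, here is a minimal Python sketch (not part of Filebeat) that splits one such X-Forwarded-For value into its hops. The function name `split_forwarded_for` and the variable names are made up for illustration, and the three-hop layout is specific to our Cloudflare, Varnish, nginx setup.

```python
def split_forwarded_for(value):
    """Split a comma-separated X-Forwarded-For value into a list of hops."""
    # Each proxy appends the address it received the request from,
    # so the leftmost entry is the original client.
    return [part.strip() for part in value.split(",") if part.strip()]


# Value taken from the first example log line above.
hops = split_forwarded_for("68.75.44.178, 172.68.146.54, 127.0.0.1")

# Assumption: exactly three hops, as in our setup (client, Cloudflare, Varnish).
client_ip, cloudflare_ip, varnish_ip = hops
print(client_ip)
```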

Filebeat only picks up the localhost address from Varnish, i.e. 127.0.0.1, which is not useful. It would be nice to store all IPs in Elasticsearch, although I guess that raises some questions about how to geocode them.
As a quick hack I tried to create a custom pattern that extracts the first IP instead of the last, but didn't succeed. I would really appreciate it if someone could point me in the right direction.
My idea was to match the first IPORHOST together with the comma that follows it, but I can't figure out how to exclude the comma with a non-capturing group; at least on https://grokconstructor.appspot.com it doesn't seem to work:

FIRSTIPORHOST (%{IPORHOST}(?:,))
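What I am after can be sketched with a plain Python regex (this is not grok syntax, and the names `IPV4`, `FORWARDED_PREFIX` and `first_forwarded_ip` are made up for illustration): only the first address sits in a named group, while the remaining ", address" pairs are matched by a non-capturing group and thrown away. For brevity the sketch only handles IPv4 addresses, not hostnames or IPv6.

```python
import re
from typing import Optional

# Simplified IPv4 matcher; does not validate octet ranges.
IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"

# Capture the first address; consume, but do not capture, any further
# ", <address>" pairs before the space that ends the field.
FORWARDED_PREFIX = re.compile(rf"^(?P<client_ip>{IPV4})(?:,\s*{IPV4})*\s")


def first_forwarded_ip(line: str) -> Optional[str]:
    """Return the first IP of the leading X-Forwarded-For field, or None."""
    match = FORWARDED_PREFIX.match(line)
    return match.group("client_ip") if match else None


line = '192.228.32.190, 108.162.246.21, 127.0.0.1 - - [15/May/2017:12:16:27 +0200] "GET /jobs/24237/it-back-end HTTP/1.1" 301 5'
print(first_forwarded_ip(line))
```

The point of the sketch is that the name is attached to the inner address only, so the comma never ends up in the extracted value; whether the same structure carries over to a grok pattern is an assumption I have not verified.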

Here are my specs:
Filebeat 5.4 (running in Docker)
OS: Debian 8

Steps to Reproduce:

  • Enable Cloudflare in front of a site served by nginx; Cloudflare will start sending the X-Forwarded-For header.
  • Change the nginx.conf log_format to log the $http_x_forwarded_for variable. Here is our config:
log_format  proxy '$http_x_forwarded_for - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent"';
