Nginx has become one of the most popular web servers today, powering over 30% of all websites on the internet. As a high-performance server, Nginx produces detailed access logs containing valuable information about all client requests received by your websites and applications. Parsing and analyzing these access logs on a regular basis is key to monitoring the health of your web servers, identifying issues proactively and improving performance.
In this comprehensive guide, we will understand what Nginx access logs contain, why parsing them is important and look at effective tools and techniques to parse, analyze and generate insights from access logs.
Understanding Nginx Access Logs
Nginx access logs record all requests handled by your Nginx web server, saved to a log file in real-time. By default, the access logs are found at /var/log/nginx/access.log. Below is an example log entry in the default combined format:
127.0.0.1 - - [28/Feb/2023:11:15:38 +0530] "GET /index.html HTTP/1.1" 200 247 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
Let's break down what each field here means:
- 127.0.0.1 – The client IP address
- - – The RFC 1413 identity of the client (rarely supplied, so usually "-")
- - – The user ID from HTTP authentication ("-" when the request is unauthenticated)
- [28/Feb/2023:11:15:38 +0530] – Date, time & timezone of request
- "GET /index.html HTTP/1.1" – The request line from client
- 200 – The HTTP status code returned to client
- 247 – Size of response in bytes sent to client
- "-" – Referrer request header
- "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0" – User agent request header
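As a quick sketch, these positional fields can be pulled out with awk. Using the example line above, the whitespace-split field numbers follow the breakdown (this assumes the default combined format; quoted fields containing spaces span multiple awk fields):

```shell
# The example log line from above
line='127.0.0.1 - - [28/Feb/2023:11:15:38 +0530] "GET /index.html HTTP/1.1" 200 247 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"'

# Whitespace-split fields: $1 = client IP, $9 = status code, $10 = bytes sent
echo "$line" | awk '{print $1, $9, $10}'
# 127.0.0.1 200 247
```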
Other Common Log Formats
Nginx supports additional log formats like:
- JSON – structured entries (via the escape=json parameter) for ingestion into log management tools
- Upstream timing – captures variables like $upstream_response_time when proxying
- ELK-oriented – custom variables shaped for ingestion into the Elasticsearch/Logstash/Kibana stack
- gRPC – metadata from gRPC services proxied through Nginx
Choose log formats wisely based on your downstream analysis needs.
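For example, a JSON format can be declared with the escape=json parameter of log_format (a sketch; the field names chosen here are illustrative, not mandated by Nginx):

```nginx
log_format json_access escape=json
  '{'
    '"remote_addr":"$remote_addr",'
    '"time_local":"$time_local",'
    '"request":"$request",'
    '"status":"$status",'
    '"body_bytes_sent":"$body_bytes_sent",'
    '"http_referer":"$http_referer",'
    '"http_user_agent":"$http_user_agent"'
  '}';

access_log /var/log/nginx/access.json json_access;
```

The escape=json parameter makes Nginx escape characters that would otherwise break the JSON structure of each entry.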
Log File Management
Due to the verbose nature of access logs, the log files can grow very quickly depending on traffic. Best practices include:
- Rotate log files (daily or hourly chunks) to prevent huge files
- Compress rotated logs using gzip to save storage space
- Retain only the last 30-90 days of access logs based on your use case; archive older logs
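These practices are usually automated with logrotate. Below is a sketch modeled on the configuration shipped by Debian's nginx package (the pid file path may differ on your distro):

```
/var/log/nginx/*.log {
    daily
    rotate 30
    compress
    delaycompress
    missingok
    notifempty
    sharedscripts
    postrotate
        # USR1 tells the Nginx master process to reopen its log files
        [ -f /run/nginx.pid ] && kill -USR1 "$(cat /run/nginx.pid)"
    endscript
}
```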
Why Parse and Analyze Access Logs?
Instead of simply logging this valuable data, parsing and analyzing access logs helps unlock many benefits like:
- Monitor overall traffic – Volume trends, top pages, browsers etc.
- Application performance – Errors, response times, latency issues etc.
- User behavior analysis – Most visited pages, usage trends etc.
- Security auditing – Anomalies, malicious requests etc.
- SEO optimization – Crawl stats, referrers data for better rankings
- Compliance – Records mandated by regulatory standards
Sample Nginx Configuration for Access Logs
Key parts of Nginx configuration related to enabling access logs:
http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log /var/log/nginx/access.log main;

    server {
        # Web server config
    }
}
This demonstrates the log_format syntax and the access_log directive that activates logging. Customizations, such as appending $request_time to measure how long each request took, are possible.
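For instance, appending $request_time (seconds spent serving the request, with millisecond resolution) might look like this sketch:

```nginx
log_format timed '$remote_addr - $remote_user [$time_local] "$request" '
                 '$status $body_bytes_sent "$http_referer" '
                 '"$http_user_agent" $request_time';

access_log /var/log/nginx/access.log timed;
```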
Without centralized log analysis, it becomes extremely difficult to track these metrics across multiple servers. Technologies like the ELK stack have emerged just to collect, aggregate and analyze logs at scale.

Popular open source stack for logging – Elasticsearch, Logstash, Kibana
Streaming Log Analysis vs Batch Processing
Two popular models have emerged for ingesting and analyzing access logs:
Streaming
- Logs are consumed in real-time as they are written
- Enables live traffic monitoring, alerts for issues
- Requires log forwarders like Logstash or Beats
Batch
- Logs are parsed/ingested as batches on schedule
- Works well for historical analysis at lower frequency
- Tools run cron jobs to process accumulated logs
Based on use cases, a blend of streaming and batch pipelines may be required.
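The batch model can be as simple as a nightly cron entry regenerating a report from the previous day's rotated log (paths and schedule here are illustrative):

```
# m h dom mon dow  command
30 0 * * * goaccess /var/log/nginx/access.log.1 -o /var/www/html/report.html --log-format=COMBINED
```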
Parsing Access Logs with Shell Commands
Now that we understand why access log analysis matters, let's look at a few common techniques to parse access logs using Linux shell commands for simple analysis tasks:
1. Count requests per client IP address
awk '{print $1}' access.log | sort | uniq -c | sort -rn
2. Count requests per minute
awk '{print substr($4, 2, 17)}' access.log | sort | uniq -c
3. Find 404 errors
awk '$9 == 404' access.log
4. Top 10 referrers
awk '{print $11}' access.log | sort | uniq -c | sort -rn | head
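Each of these one-liners can be sanity-checked against a tiny hand-made log. For example, tallying status codes (field $9 in the combined format) over two hypothetical entries:

```shell
# Create a two-line sample log (hypothetical entries)
printf '%s\n' \
  '10.0.0.1 - - [28/Feb/2023:11:15:38 +0530] "GET / HTTP/1.1" 200 247 "-" "curl/7.81.0"' \
  '10.0.0.2 - - [28/Feb/2023:11:16:02 +0530] "GET /missing HTTP/1.1" 404 153 "-" "curl/7.81.0"' \
  > sample.log

# Requests per HTTP status code, most frequent first
awk '{print $9}' sample.log | sort | uniq -c | sort -rn
```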
However, while these basic log parsing snippets help illustrate Nginx log analysis, they only scratch the surface of getting maximum value from your access logs. For advanced analysis, we need a specialized tool.
Introducing GoAccess – Open-source Log Analyzer

GoAccess is arguably the most popular open-source, terminal-based log analyzer and interactive viewer for Nginx, Apache and other web servers. It can parse either live traffic or access logs in formats like Nginx, Apache, Amazon S3, Elastic Load Balancing etc.
Let's go through the key capabilities of GoAccess:
- Real-time analysis – Great for detecting immediate threats or issues
- Static and dynamic sites – Works with both static and dynamic websites
- Visual reports – Terminal dashboard, JSON, HTML reports
- Nginx, Apache logs – Parses logs from most popular web servers
- Geography mapping – Identify visitor hot spots across globe
- Media types – Breakdown by HTML, CSS, JS, images etc.
- Crawler statistics – Bot traffic and SEO data
- Hundreds of metrics! – Time served, traffic sources, response codes and more!
In addition, GoAccess has good support for GeoIP location data, custom log formats, report filtering, session analysis and more. The HTML reports allow easy sharing of data with non-technical stakeholders too.
Let‘s compare some features of popular open source log analyzers:
| Tool | Nginx Support | Real-time Analysis | Custom Dashboards | Reports |
|---|---|---|---|---|
| GoAccess | Yes | Yes | No | Multiple formats |
| AWStats | Yes | No | Yes | HTML, PDF, CSV |
| Webalizer | Yes | No | No | HTML |
Installing GoAccess on Ubuntu / Debian
As a prerequisite, GoAccess needs Nginx set up on the server with access logs enabled. To install GoAccess on Ubuntu 22.04 / Debian:
sudo apt update
sudo apt install goaccess
To install on other Linux distros like CentOS / RHEL (the goaccess package is typically provided via the EPEL repository):
sudo yum install goaccess
For features like GeoIP lookup, additional compilation flags are needed:
./configure --enable-geoip=mmdb ...
Generating GoAccess Reports for Nginx Logs
With GoAccess installed, let's dive into parsing a sample Nginx access log:
Note: We are using a publicly available example log containing one month of requests to a website, perfect for demonstrating GoAccess's capabilities.
Step 1: Launch Interactive Terminal Dashboard
Launch GoAccess on the log file:
goaccess /var/log/nginx/access.log
This brings up the interactive terminal dashboard with live parsing in progress:

The default view shows overall metrics like:
- Requests: Total requests
- Valid Requests: Success requests
- Failed Requests: Client / Server errors
- Unique Visitors: Total unique IPs
- Unique Files: Total unique URLs / pages accessed
- Static Files: JS, CSS, Images requests
- Log Size: Size of log processed
Use arrow keys to scroll vertically and horizontally to all metrics. Press Enter on any section for more details, q to quit.
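If GoAccess doesn't yet know your log format, it will first prompt you to pick one interactively; you can skip the prompt by naming the format up front (COMBINED matches the default Nginx format shown earlier):

```
goaccess /var/log/nginx/access.log --log-format=COMBINED
```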
Step 2: Generate HTML Report
For a more permanent report that's easier to share or publish, we can generate a standalone HTML report (the output format is inferred from the .html extension):
goaccess /var/log/nginx/access.log -o /var/www/html/report.html --log-format=COMBINED
The color-coded report has been written to the path we specified above. Access it at http://your-server-ip/report.html. Here are some key sections:

The main dashboard with traffic summary, top visitors, requests etc. Drill-down further for details:
View of top URLs, HTTP status codes returned, download times etc. Helps identify slow pages.
Analyze visitors by hostnames / IPs, user agents like browsers, operating systems.

Geographic distribution of visitors across countries. Requires GeoIP module.
There are many more helpful views around traffic sources, 404 errors, crawlers, static vs dynamic content etc. that technical and non-technical teams can benefit from.
Custom Reports in GoAccess
The data views in GoAccess can also be customized significantly through configuration tweaks. Some examples:
- Add or remove entire modules like GeoIP, hosts etc.
- Set custom names for metrics, e.g. renaming "Visitors" to "Total Users"
- Exclude specific IP addresses or status codes
- Filter by date ranges or request thresholds
- Additional styling like colors, padding etc.
This enables more focused reports around security, performance etc. Dashboards tailored to business needs.
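As a sketch, a few such tweaks expressed in a GoAccess configuration file (the option names below are real GoAccess options; the values are illustrative):

```
# /etc/goaccess/goaccess.conf (illustrative values)

# Parse the default Nginx combined format without prompting
log-format COMBINED

# Drop an entire module from the report
ignore-panel REFERRERS

# Hide internal traffic
exclude-ip 192.168.0.1

# Title shown on the HTML report
html-report-title My Site Traffic
```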
Analyzing Nginx Logs at Scale with ELK Stack
While GoAccess works very well for single-server log analysis, for large-scale log processing across thousands of servers, pipelines like the ELK stack (Elasticsearch + Logstash + Kibana) are used.
Some helpful diagrams explaining the flow:

Nginx logs ingested via Beats / Logstash event processing pipeline into ElasticSearch datastore, with Kibana analytics and visualizations on top
Benefits include:
- Centralised logging across 100s of hosts
- Scalable datastore not bounded by single node
- Custom indexing and enrichments during ingestion
- Enterprise-grade access controls, security etc.
- Correlate across multiple data sources
Of course, operating at scale brings its own complexities.
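As an illustration of the ingestion side, a minimal Filebeat configuration shipping Nginx access logs toward Logstash might look like this sketch (the hostname is a placeholder):

```yaml
# filebeat.yml (sketch): ship Nginx access logs to Logstash
filebeat.inputs:
  - type: filestream
    paths:
      - /var/log/nginx/access.log

output.logstash:
  hosts: ["logstash.example.com:5044"]
```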
Additional Tips for Effective Analysis
To further enhance the reports generated from access logs, keep these tips in mind:
- When installing tools like GoAccess, enable GeoIP for visitor mapping
- Generate baselines of metrics during initial monitoring weeks
- Pay attention to suspicious 404 errors and error spikes
- Set up alerts around sudden traffic changes or latency thresholds
- Compare stats across modules – Geo vs hosts vs browsers etc.
- Export JSON / CSV data to feed into external tools
- Customize main config for filtering reports, adding metrics etc.
- Obscure IP addresses before sharing samples publicly
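For the last tip, one quick (and deliberately lossy) way to mask the final octet of IPv4 addresses before publishing a sample (the log line below is hypothetical):

```shell
# Replace the last octet of the leading client IP with 0
printf '%s\n' '203.0.113.7 - - [28/Feb/2023:11:15:38 +0530] "GET / HTTP/1.1" 200 247 "-" "curl/7.81.0"' \
  | sed -E 's/^([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\.[0-9]{1,3}/\1.0/'
```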
Conclusion
They say data is power! For Nginx servers, comprehensive access logs, when effectively parsed and visualized, unlock hidden trends and insights that may otherwise go unnoticed in today's complex web properties. By adopting tools like GoAccess and the methodologies outlined here, you can get the most out of your Nginx access logs!


