This is going to be a quick “grocery list” for getting an Apache -> Squid -> Tomcat configuration going, one that can cache multiple webapps at the same time.
The Common Case – Apache & Tomcat
Commonly, people run an Apache -> Tomcat configuration to serve web applications. Sometimes, however, you would like to add a bit of simple caching in front of a webapp. Sometimes it can really speed things up!
Assuming you have Tomcat configured and serving a webapp on http://localhost:8080/webapp, and a vhost in Apache that looks like:
<VirtualHost *:80>
    ServerName www.webapp.com
    LogLevel info
    ErrorLog /var/log/apache2/www.webapp.com-error.log
    CustomLog /var/log/apache2/www.webapp.com-access.log combined
    ProxyPreserveHost On
    ProxyPass /webapp http://localhost:8080/webapp
    ProxyPassReverse /webapp http://localhost:8080/webapp
    RewriteEngine On
    RewriteOptions inherit
    RewriteLog /var/log/apache2/www.webapp.com-rewrite.log
    RewriteLogLevel 0
</VirtualHost>
Simple! Just forward all /webapp requests to http://localhost:8080/webapp
Squid In The Middle
A simple Squid configuration for our purposes would look like:
# some boilerplate configuration for squid
acl manager proto cache_object
acl localhost src 127.0.0.1/32
acl to_localhost dst 127.0.0.0/8 0.0.0.0/32
acl localnet src 10.0.0.0/8
acl localnet src 172.16.0.0/12
acl localnet src 192.168.0.0/16
acl Safe_ports port 80
acl Safe_ports port 443
acl Safe_ports port 8080-8100 # webapps
acl purge method PURGE
acl CONNECT method CONNECT
http_access allow manager localhost
http_access deny manager
http_access allow purge localhost
http_access deny purge
http_access deny !Safe_ports
http_access allow localhost
http_access allow localnet
http_access deny all
icp_access allow localnet
icp_access deny all
http_port 3128
hierarchy_stoplist cgi-bin ?
access_log /var/log/squid/access.log squid
hosts_file /etc/hosts
coredump_dir /var/spool/squid3
# adjust your cache size!
cache_dir ufs /var/cache/squid 20480 16 256
cache_mem 5120 MB
#################################
# interesting part starts here! #
#################################
# adjust this to your liking
maximum_object_size 200 KB
# log the full URL including its query terms (by default Squid strips them before logging).
# Squid already caches the following two URLs as distinct objects; with this turned off
# you can also tell them apart in access.log when checking for hits and misses
# http://localhost:8080/webapp/a?param=1
# http://localhost:8080/webapp/a?param=2
strip_query_terms off
# just for some better logging (only takes effect if the access_log line above names "combined" instead of "squid")
logformat combined %>a %ui %un [%tl] "%rm %ru HTTP/%rv" %>Hs %<st "%{Referer}>h" "%{User-Agent}>h" %Ss:%Sh
# refresh_pattern will need tuning, but if you decide to cache a webapp, you must make sure it actually gets cached!
# many webapps do not like being cached, so you can play with options such as override-expire, ignore-reload
# and ignore-no-cache. the following directive is meant to cache any page of the webapp for 1 hour (60 minutes),
# even if the webapp asks otherwise. adjust the regexp(s) below to suit your own needs!!
refresh_pattern http://localhost:8080/webapp/.* 60 100% 60 override-expire ignore-reload ignore-no-cache
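Before relying on the refresh_pattern regexp above, it can help to check which URLs it actually covers. The following is a minimal sketch that approximates Squid's matching with Python's `re` module. The two engines are not identical, but for a simple pattern like this one they should agree. The sample URLs, including `otherapp`, are made up for illustration.

```python
import re

# Pattern copied from the refresh_pattern line above.
# Note: the dots are unescaped, so they match any character.
PATTERN = re.compile(r"http://localhost:8080/webapp/.*")

# Hypothetical sample URLs, for illustration only.
SAMPLES = [
    "http://localhost:8080/webapp/a?param=1",
    "http://localhost:8080/webapp/a?param=2",
    "http://localhost:8080/otherapp/index.html",
]


def covered(url):
    """Return True if the pattern matches anywhere in the URL."""
    return PATTERN.search(url) is not None


if __name__ == "__main__":
    for url in SAMPLES:
        label = "match   " if covered(url) else "no match"
        print(label, url)
```

URLs reported as "no match" would fall through to whatever other refresh_pattern lines (or defaults) your Squid configuration has.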
Now we need to tell Apache to use the above Squid configuration. Luckily it’s pretty simple; the only line you need is:
# basically every request going to http://localhost:8080/webapp, pass via squid
ProxyRemote http://localhost:8080/webapp http://localhost:3128
And the whole vhost again:
<VirtualHost *:80>
    ServerName www.webapp.com
    LogLevel info
    ErrorLog /var/log/apache2/www.webapp.com-error.log
    CustomLog /var/log/apache2/www.webapp.com-access.log combined
    ProxyPreserveHost On
    ProxyRemote http://localhost:8080/webapp http://localhost:3128
    ProxyPass /webapp http://localhost:8080/webapp
    ProxyPassReverse /webapp http://localhost:8080/webapp
    RewriteEngine On
    RewriteOptions inherit
    RewriteLog /var/log/apache2/www.webapp.com-rewrite.log
    RewriteLogLevel 0
</VirtualHost>
That’s it. Now watch /var/log/squid/access.log and look for TCP_MEM_HIT and TCP_HIT. If you’re still getting TCP_MISS and the like, you’ll have to adjust your refresh_pattern in the Squid configuration.
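If eyeballing the log gets tedious, a small script can tally the result codes for you. This is only a sketch: it assumes each log line contains one Squid result code starting with `TCP_`, and it treats any code containing `HIT` as a cache hit. The sample lines are invented for illustration and are not real log output.

```python
import re
from collections import Counter

# First token that looks like a Squid result code, e.g. TCP_MEM_HIT.
RESULT_CODE = re.compile(r"\b(TCP_[A-Z_]+)")


def count_result_codes(lines):
    """Count Squid result codes found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = RESULT_CODE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def hit_ratio(counts):
    """Fraction of counted requests whose result code contains 'HIT'."""
    total = sum(counts.values())
    hits = sum(n for code, n in counts.items() if "HIT" in code)
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Invented sample lines, for illustration only.
    sample = [
        "1.000 5 127.0.0.1 TCP_MISS/200 512 GET http://localhost:8080/webapp/a?param=1 - DIRECT/127.0.0.1 text/html",
        "2.000 1 127.0.0.1 TCP_MEM_HIT/200 512 GET http://localhost:8080/webapp/a?param=1 - NONE/- text/html",
    ]
    counts = count_result_codes(sample)
    print(dict(counts))
    print("hit ratio:", hit_ratio(counts))
```

To run it against the real log, pass an open file instead of the sample list, for example `count_result_codes(open("/var/log/squid/access.log"))`.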
Multiple Webapps?
Multiple webapps are not a problem. For each one you want cached, add the ProxyRemote line that passes it through Squid and a matching refresh_pattern in the Squid configuration.
Don’t want a webapp to be cached? Just bypass Squid by leaving out its ProxyRemote line!
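With more than a couple of webapps, the per-webapp lines get repetitive, so here is a rough sketch that prints them from a list. The webapp names `shop` and `admin` are hypothetical, and the script assumes every webapp lives on the same Tomcat at http://localhost:8080 and uses the same 60-minute refresh_pattern as above. It only prints text; pasting it into your vhost and squid.conf is still up to you.

```python
TOMCAT = "http://localhost:8080"
SQUID = "http://localhost:3128"

# Hypothetical webapps: name -> True if it should go through Squid.
WEBAPPS = {"webapp": True, "shop": True, "admin": False}


def apache_lines(webapps):
    """Build the per-webapp proxy directives for the Apache vhost."""
    lines = []
    for name, cached in webapps.items():
        backend = f"{TOMCAT}/{name}"
        if cached:
            # Only cached webapps are routed via Squid.
            lines.append(f"ProxyRemote {backend} {SQUID}")
        lines.append(f"ProxyPass /{name} {backend}")
        lines.append(f"ProxyPassReverse /{name} {backend}")
    return lines


def squid_lines(webapps, minutes=60):
    """Build one refresh_pattern line per cached webapp."""
    options = "override-expire ignore-reload ignore-no-cache"
    return [
        f"refresh_pattern {TOMCAT}/{name}/.* {minutes} 100% {minutes} {options}"
        for name, cached in webapps.items()
        if cached
    ]


if __name__ == "__main__":
    print("# Apache vhost")
    print("\n".join(apache_lines(WEBAPPS)))
    print("# squid.conf")
    print("\n".join(squid_lines(WEBAPPS)))
```

The uncached `admin` webapp gets ProxyPass and ProxyPassReverse but no ProxyRemote, which is exactly the "bypass" case.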