Varnish

The flowchart’s really helpful in visualizing how Varnish handles a
request before asking the backend.
default.vcl comes with commented out subroutines which are worth
a study.
- I commented the hell out of them in this repo.

Test code with

varnishd -C -f /etc/varnish/default.vcl

Pre-Flight

# Install the repo
rpm -ivh http://repo.varnish-cache.org/redhat/varnish-3.0/el5/noarch/varnish-release/varnish-release-3.0-1.noarch.rpm

# Install Varnish
yum -y install varnish

# Make copy of config
cp /etc/varnish/default.vcl{,.original}

Can now start a varnish daemon with varnishd and specify the start params like the documentation asks you to do.
But will use init daemons and sysconfig options on a RHEL box, which is neater.

Edit /etc/sysconfig/varnish. Changed some default options:

# Listen on all addresses, on the HTTP port
VARNISH_LISTEN_ADDRESS=
VARNISH_LISTEN_PORT=80

# Use a 250MB cache
VARNISH_STORAGE_SIZE=250M

# Use malloc
VARNISH_STORAGE="malloc,${VARNISH_STORAGE_SIZE}"

Both “file” and “malloc” use disk and memory, but in different ways.

This is the plan:

World ---> Varnish ---> Nginx ---> Application Server
           0.0.0.0:80
                        127.0.0.1:8080
                                   127.0.0.1:9090

In this case, the Nginx server is considered a “backend server” from which Varnish can request and then get data. I defined a simple one in /etc/varnish/default.vcl

# Define a simple Nginx backend
backend nginx_server {
        .host = "localhost";
        .port = "8080";
}

Restarted the varnish and nginx services, made sure they were listening on the right ports.

[root@example conf.d]# netstat -tunlp | grep 80
tcp  0   0 0.0.0.0:80        0.0.0.0:*     LISTEN   27090/varnishd
tcp  0   0 127.0.0.1:8080    0.0.0.0:*     LISTEN   26861/nginx

The site should now work exactly as before. A few notes:

There’s no caching taking place. Yet. That’s where the VCL
DSL
comes in.
Since I only have one server, I’m not interested in Varnish’s
load-balancing capabilities.

Methods

vcl_recv

The entry point. I played around with the default subroutine by simply adding this to default.vcl and restarting Varnish:

sub vcl_recv {
    # Check for a standard HTTP verb. If none used, bark.
    if (req.request != "GET") {
            error 400 "I don't understand what you want me to do.";
    }
}

After that, a simple curl -X POST nikhil.io yielded:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
  <head>
    *<title>400 I don't understand what you want me to do.</title>*
  </head>
  <body>
    <h1>Error 400 I don't understand what you want me to do.</h1>
    <p>I don't understand what you want me to do.</p>
    <h3>Guru Meditation:</h3>
    <p>XID: 2033234310</p>
    <hr>
    <p>Varnish cache server</p>
  </body>
</html>

Nice.

vcl_hash

Varnish uses a key-value memory map. This subroutine defines how the key is generated. By default, it uses (URL or IP) + hostname.

vcl_pipe and vcl_pass

Called when return(pipe) or return(pass) are… returned from vcl_recv.

In pipe mode is the simplest “Varnish-bypass” mode, where it short-circuits the connection between the client and the backend. No caching, no logging. Varnish does nothing while client speaks to backend (still through Varnish.)

In pass mode, Varnish can look at (and manipulate) request and response data

vcl_fetch

A full restart wipes the cache: neither “file” nor “malloc”
are persistent.
You can see how Varnish translates the VCL into C code by:

varnishd -C -f /etc/varnish/default.vcl
Here’s a useful list of
actions.