Gzip Compression in nxweb

02.11.2014

Gzip compression is configured as a filter, so it can be attached to any handler that produces a response, whether the content is static or dynamic.

Parameters of gzip filter

compression – integer from 0 (no compression) to 9 (maximum compression) specifying the compression level. Higher levels generally produce smaller output but take more CPU time per response.

cache_dir – directory (can be absolute or relative to nxweb's workdir) to store cached data.
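
To get a feel for what the compression level changes, the following minimal sketch compresses the same payload at levels 0, 4 and 9 using Python's standard gzip module. It does not call nxweb; the sample payload is made up for illustration, and the assumption is only that nxweb's levels correspond to the usual zlib levels 0 to 9.

```python
import gzip

# Repetitive markup, similar to what a text handler might emit (made-up sample).
sample = b'<li class="item">nxweb gzip filter</li>\n' * 500

# Compressed size at the lowest, a middle, and the highest level.
sizes = {
    level: len(gzip.compress(sample, compresslevel=level))
    for level in (0, 4, 9)
}

for level, size in sizes.items():
    print(f"compression={level}: {size} bytes (original: {len(sample)} bytes)")
```

Level 0 only wraps the data in the gzip container, so its output is slightly larger than the input. Timing the same loop on representative content is a simple way to pick a level for a given server.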

Gzipping static content

Static content comes from the sendfile handler. A sample nxweb_config.json fragment:

{
  "prefix": "/static",
  "handler": "sendfile",
  // "vhost": ".some.host.com", // match only at this host
  // "secure_only": true, // match under https connection only
  // "insecure_only": true, // match under http (not https) connection only
  "dir": "www", // aka document root
  "memcache": true, // cache small files in memory
  "charset": "utf-8", // charset for text files
  "index_file": "index.htm", // directory index
  "filters":[
    // ...
    {"type": "gzip", "compression": 4, "cache_dir": "cache/gzip"}
  ]
}

Uploaded static files are gzipped automatically on first access and served from the cache afterwards. When an original file is modified, its cached copy is invalidated.

Gzip filter algorithm:

  1. Check whether the client accepts gzip encoding by analyzing the corresponding HTTP request header (Accept-Encoding).
  2. Determine whether the content is gzippable. For static files this is determined by the MIME type mapped to the file's extension. Default MIME type mappings are defined in the mime.c file (you can add or redefine mappings via the nxweb_add_mime_type() function on server startup). Generally all text formats (html, js, css, xml, svg, etc.) are marked as gzippable. Responses smaller than 100 bytes are never gzipped.
  3. Look up the gzipped file in the cache and verify it against the original file's timestamp.
  4. If there is a valid gzipped file in the cache, serve it. Otherwise gzip the original, serve the result and store it in the cache.
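
The four steps above can be sketched in Python as follows. This is an illustration of the flow, not nxweb's C implementation: the extension table, the cache file layout (`cached_path`) and all function names are assumptions made for this example, and the Accept-Encoding parsing is simplified (it ignores the `*` wildcard).

```python
import gzip
import os
from typing import Tuple

# Assumed extension table; nxweb's real mappings live in mime.c.
MIME_BY_EXT = {
    ".html": ("text/html", True),
    ".js": ("application/javascript", True),
    ".css": ("text/css", True),
    ".svg": ("image/svg+xml", True),
    ".png": ("image/png", False),
}
MIN_SIZE = 100  # responses smaller than this are never gzipped


def accepts_gzip(accept_encoding: str) -> bool:
    """Step 1: is gzip listed in Accept-Encoding with a non-zero q-value?"""
    for item in accept_encoding.split(","):
        name, _, params = item.strip().partition(";")
        if name.strip().lower() != "gzip":
            continue
        q = params.strip().lower()
        if q.startswith("q="):
            try:
                return float(q[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def is_gzippable(path: str) -> bool:
    """Step 2: look up the extension and apply the 100-byte size floor."""
    ext = os.path.splitext(path)[1].lower()
    _, gzippable = MIME_BY_EXT.get(ext, ("application/octet-stream", False))
    return gzippable and os.path.getsize(path) >= MIN_SIZE


def cached_path(path: str, cache_dir: str) -> str:
    """Hypothetical cache layout: <cache_dir>/<file name>.gz"""
    return os.path.join(cache_dir, os.path.basename(path) + ".gz")


def serve(path: str, accept_encoding: str, cache_dir: str,
          level: int = 4) -> Tuple[bytes, bool]:
    """Return (body, gzipped) for a static file."""
    if not (accepts_gzip(accept_encoding) and is_gzippable(path)):
        with open(path, "rb") as f:
            return f.read(), False
    gz_path = cached_path(path, cache_dir)
    # Step 3: the cached copy is valid only if it is not older than the original.
    if os.path.exists(gz_path) and os.path.getmtime(gz_path) >= os.path.getmtime(path):
        with open(gz_path, "rb") as f:
            return f.read(), True
    # Step 4: gzip the original, store the result in the cache, and serve it.
    with open(path, "rb") as f:
        body = gzip.compress(f.read(), compresslevel=level)
    os.makedirs(cache_dir, exist_ok=True)
    with open(gz_path, "wb") as f:
        f.write(body)
    return body, True
```

For example, `serve("www/index.htm", "gzip, deflate", "cache/gzip")` would compress the file on the first call and read `cache/gzip/index.htm.gz` on later calls until the original changes.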

Gzipping dynamic content

Dynamic content comes from the http_proxy handler, the python handler, or a custom C handler. A sample nxweb_config.json fragment for the python handler:

{
  "prefix": null, "handler": "python",
  "dir": "cache/upload_temp", // temp dir for large uploads
  "size": 50000000 /* 50 MB */, // max upload size
  "filters": [
    // {"type": "file_cache", "cache_dir": "cache/python"},
    // {"type": "templates"},
    {"type": "gzip", "compression": 4, "cache_dir": "cache/gzip"}
  ]
}

The algorithm is generally the same as for static content.

The gzip filter determines whether the content is gzippable by reading the response's content_type and looking up the corresponding MIME type.
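
As a rough illustration of that lookup, the sketch below reduces a Content-Type value to its bare MIME type and checks it against a set of gzippable types. The set contents and the function name are assumptions for this example; whether nxweb strips parameters such as the charset in exactly this way is not stated here.

```python
# Assumed subset of gzippable MIME types, for illustration only.
GZIPPABLE_MIME_TYPES = {
    "text/html", "text/css", "text/xml",
    "application/javascript", "image/svg+xml",
}


def is_gzippable_content_type(content_type: str) -> bool:
    """Drop parameters such as '; charset=utf-8' and look up the bare MIME type."""
    mime = content_type.split(";", 1)[0].strip().lower()
    return mime in GZIPPABLE_MIME_TYPES


print(is_gzippable_content_type("text/html; charset=utf-8"))
print(is_gzippable_content_type("image/png"))
```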

The filter also decides whether the content is cacheable. All of the following conditions must be met:

  • the handler has defined the cache key in its on_generate_cache_key callback (both the http_proxy and python handlers do)
  • the status of the response is 200
  • the response carries no no-cache or private cache-control directive
  • at least one of max-age, expires or last-modified is set in the response

If the content is gzippable but not cacheable, it is gzipped on every request and the result is not stored.
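
The cacheability conditions can be expressed as a single predicate. The following Python sketch is an assumption-laden illustration, not nxweb code: `is_cacheable` and its parameters are hypothetical names, `has_cache_key` stands in for "the handler defined a cache key", and the header parsing is deliberately simple.

```python
from typing import Mapping


def is_cacheable(has_cache_key: bool, status: int,
                 headers: Mapping[str, str]) -> bool:
    """Return True only if all cacheability conditions hold."""
    h = {name.lower(): value for name, value in headers.items()}
    if not has_cache_key or status != 200:
        return False
    # Collect Cache-Control directive names, e.g. "max-age=60" -> "max-age".
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in h.get("cache-control", "").split(",")
        if part.strip()
    }
    if "no-cache" in directives or "private" in directives:
        return False
    # At least one freshness or validation hint must be present.
    return "max-age" in directives or "expires" in h or "last-modified" in h


response_headers = {"Content-Type": "text/html", "Cache-Control": "max-age=60"}
if is_cacheable(True, 200, response_headers):
    print("gzip once, store in cache_dir")
else:
    print("gzip on every request, do not store")
```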
