Understanding Webserver Caching: HTTP Caching from Cache-Control to Revalidation

1 July 2026 · jproxx


The fastest request is the one that never travels the network. In its purest form the browser itself sees to that: it keeps an already-downloaded response in its own cache, and if the user calls the same resource again while the stored copy is still valid, the browser serves it straight from local storage — literally no request leaves the device. That is the heart of HTTP caching. Caches further out — proxy, reverse proxy or CDN — then merely shorten the trip to the content rather than avoiding it entirely. Either way, caching saves latency, bandwidth, server load — and real money. The HTTP standard describes a cache in RFC 9111 as “a local store of response messages and the subsystem that controls storage, retrieval, and deletion of messages in it.” That sounds simple — the subtleties are in the details, and some of them are routinely misunderstood. This post arranges the building blocks the way you actually need them in production.

Where caches live

Caches exist at every layer of the connection between browser and origin server, and they fall into two fundamental classes. A private cache belongs to a single user — typically the browser cache. Because no one else sees its contents, it may hold personalized responses tailored to that user. A shared cache, by contrast, serves many users and sits as an intermediary in the path: as a forward proxy in corporate or provider networks, as a reverse proxy directly in front of the application (such as Varnish, nginx or Apache with mod_cache), or as a globally distributed CDN node close to the user.

This distinction is not a formality; it changes behaviour. A shared cache must not store a response marked private, and responses to requests carrying an Authorization header are subject to additional restrictions. Several directives, too, address shared caches only. Overlook this and, in the harmless case, you get a configuration that does nothing — and in the bad case, a CDN serving one user’s personalized page to everyone else.
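The storage rules for shared caches can be sketched as a small check. This is a deliberately simplified reading of RFC 9111, not a complete implementation: the function name `shared_cache_may_store` is made up for this post, and real caches also look at the request method, the status code and whether any freshness information exists at all.

```python
def shared_cache_may_store(request_headers: dict, response_cache_control: str) -> bool:
    """Simplified storage check for a shared cache (hypothetical helper)."""
    # Keep only the directive names, lowercased: "max-age=60" -> "max-age".
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in response_cache_control.split(",")
        if part.strip()
    }
    # no-store forbids storage everywhere; private forbids it in shared caches.
    if "no-store" in directives or "private" in directives:
        return False
    # Responses to Authorization-bearing requests need an explicit permission.
    has_authorization = any(name.lower() == "authorization" for name in request_headers)
    if has_authorization:
        return bool(directives & {"public", "s-maxage", "must-revalidate"})
    return True


print(shared_cache_may_store({"Authorization": "Bearer x"}, "max-age=60"))
print(shared_cache_may_store({"Authorization": "Bearer x"}, "public, max-age=60"))
```

The first call is refused and the second is allowed, which is exactly the case where a missing `public` makes a CDN configuration silently do nothing.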

Distinct from the HTTP caching described here is server-side caching inside the application — a page cache, an object cache (often backed by Redis), or PHP’s opcode cache. Those layers relieve the origin before an HTTP response even exists; the rules below take effect afterwards, on the response that has already been produced.

Fresh or stale: the freshness model

At the core of HTTP caching is an expiry model. Every stored response has a freshness lifetime; as long as it has not been exceeded, the response is fresh and may be served without asking the origin. After that it is stale and generally has to be checked first.

The most reliable way to set the lifetime is Cache-Control: max-age=<seconds>. A common misconception lurks right here: max-age counts not from the moment the cache received the response, but from the moment the origin generated it. The Age header that shared caches add reports how old a stored response already is: the estimated time since the response was generated or last successfully validated at the origin. The older Expires header names an absolute expiry time rather than a duration and is considered outdated today; when both are present, max-age wins.
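A small calculation makes the interplay of max-age and Age concrete. The sketch below assumes a simplified age model (the Age value on arrival plus the time the response has spent in the local cache); RFC 9111 defines a more careful calculation that also corrects for clock skew and response delay. The function names are illustrative only.

```python
def remaining_freshness(max_age: int, age_on_arrival: int, seconds_in_cache: int) -> int:
    """Seconds of freshness left; zero or negative means the response is stale."""
    # Simplified current age: what upstream caches reported plus local residence time.
    current_age = age_on_arrival + seconds_in_cache
    return max_age - current_age


def is_fresh(max_age: int, age_on_arrival: int, seconds_in_cache: int) -> bool:
    return remaining_freshness(max_age, age_on_arrival, seconds_in_cache) > 0


# A response with max-age=600 that arrives with Age: 540 has 60 seconds left,
# not 600, because the clock started at the origin.
print(remaining_freshness(600, 540, 0))
print(is_fresh(600, 540, 90))
```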

If no explicit information is given, the response is not automatically non-storable: HTTP is designed to cache as much as possible, and caches may then estimate a heuristic freshness. A common figure is around ten percent of the interval between the Last-Modified date and the response’s generation date — a file last changed a year ago would count as fresh for roughly 36 days. This is precisely why leaving caching to chance is risky: without deliberate headers, the heuristic decides, not the operator.
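The ten-percent rule of thumb is easy to reproduce. The following sketch assumes the heuristic described above; the fraction is a parameter because caches are free to choose their own value, and `heuristic_freshness` is a hypothetical helper, not part of any library.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def heuristic_freshness(date_header: str, last_modified_header: str,
                        fraction: float = 0.1) -> timedelta:
    """Estimate a freshness lifetime from the Date and Last-Modified headers."""
    generated = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    # Guard against a Last-Modified value that lies after the Date value.
    interval = max(generated - last_modified, timedelta(0))
    return interval * fraction


lifetime = heuristic_freshness(
    "Wed, 01 Jul 2026 12:00:00 GMT",  # Date of the response
    "Tue, 01 Jul 2025 12:00:00 GMT",  # Last-Modified, one year earlier
)
print(lifetime.days)  # whole days of heuristic freshness
```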

Setting Cache-Control correctly

Cache-Control is the central lever. The most important response directives:

Directive	Meaning
`no-store`	The only true opt-out from caching: no cache — private or shared — may store the response or use it for another request.
`no-cache`	Does not mean “do not store”: the response may be stored, but must be revalidated with the origin before each reuse. For an actual opt-out you need `no-store`.
`private`	Storage only in a private cache; a shared cache must discard the response. Not a confidentiality guarantee — the payload stays readable by anyone with access to the private cache.
`public`	Allows storage in shared caches too and, in particular, lifts the otherwise applicable ban on storing responses to `Authorization`-bearing requests.
`s-maxage`	A separate lifetime for shared caches only; overrides `max-age` and `Expires` there (private caches ignore it) and, per RFC 9111, carries the semantics of `proxy-revalidate`.
`must-revalidate`	A stale response must be successfully validated before reuse; if the origin is unreachable, the cache must generate an error response (504 recommended) rather than serve the stale response.
`immutable`	Signals that the response is guaranteed not to change during its freshness — the cache should then not trigger a check even on a manual reload.
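How a cache reads these directives can be sketched in a few lines. The parser below is intentionally naive (it splits on commas and would mishandle quoted arguments that contain commas), and both function names are assumptions made for this example.

```python
from typing import Optional


def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, argument = part.partition("=")
        directives[name.strip().lower()] = argument.strip().strip('"') if argument else None
    return directives


def cache_policy(value: str, shared: bool) -> dict:
    """Summarize what a private or shared cache may do with the response."""
    d = parse_cache_control(value)
    lifetime: Optional[int] = None
    if shared and d.get("s-maxage") is not None:
        lifetime = int(d["s-maxage"])  # s-maxage overrides max-age in shared caches
    elif d.get("max-age") is not None:
        lifetime = int(d["max-age"])
    return {
        "storable": "no-store" not in d and not (shared and "private" in d),
        "lifetime": lifetime,
        "always_revalidate": "no-cache" in d,  # stored, but checked before each reuse
    }


header = "public, max-age=600, s-maxage=3600"
print(cache_policy(header, shared=True)["lifetime"])
print(cache_policy(header, shared=False)["lifetime"])
```

The same header yields 3600 seconds for a CDN and 600 seconds for the browser, which is the usual reason to set both values.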

Revalidation: ETag and Last-Modified

When a response is stale, it does not necessarily have to be transferred anew. Through a conditional request the cache can ask the origin whether anything has changed at all. Two validators serve this purpose, both provided by the origin with the original response:

The ETag is an opaque identifier for a specific version of a resource. The cache later sends it back in If-None-Match; if it is still current, the server answers with 304 Not Modified — with no body, so the data is not transferred again. An ETag can be a strong validator (byte-for-byte equality) or, marked with the W/ prefix, a weak one (semantically equivalent).
Last-Modified names the last change time; the cache asks with it via If-Modified-Since. Last-Modified is treated as a weak validator by default: its one-second resolution and the possibility of several changes within the same second make it less precise than a strong ETag. Exact comparisons (such as range requests) therefore call for a strong validator, in practice a strong ETag.
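The ETag side of this exchange can be sketched from the origin's point of view. The example assumes the weak comparison that HTTP prescribes for If-None-Match, where the W/ prefix is ignored; the comma split is a simplification, since entity tags may themselves contain commas, and the function names are hypothetical.

```python
from typing import Optional


def _opaque_tag(etag: str) -> str:
    """Drop the weak marker so that W/"x" and "x" compare as equal."""
    etag = etag.strip()
    return etag[2:] if etag.startswith("W/") else etag


def revalidation_status(if_none_match: Optional[str], current_etag: str) -> int:
    """Return 304 if the client's copy is still current, otherwise 200."""
    if if_none_match is None:
        return 200  # unconditional request: send the full response
    if if_none_match.strip() == "*":
        return 304  # "*" matches any current representation
    candidates = [tag for tag in if_none_match.split(",") if tag.strip()]
    if any(_opaque_tag(tag) == _opaque_tag(current_etag) for tag in candidates):
        return 304  # no body is sent; the cache reuses its stored copy
    return 200


print(revalidation_status('"a1b2c3"', '"a1b2c3"'))
print(revalidation_status('"stale-tag"', '"a1b2c3"'))
```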

A well-known pitfall concerns ETags behind a load balancer: Apache used to include the file’s inode number in the ETag (the default up to httpd 2.3.14), so the same file on different backend servers received different ETags and revalidation came to nothing. Since httpd 2.3.15 the inode has been removed from the default; anyone who has explicitly enabled it should set FileETag MTime Size.
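Why the inode breaks revalidation is easy to see with a toy ETag builder. The format below is only loosely modelled on file-attribute ETags and is not Apache's exact output; `file_etag` and the sample values are assumptions for illustration.

```python
from typing import Optional


def file_etag(mtime: int, size: int, inode: Optional[int] = None) -> str:
    """Build a quoted ETag from file attributes (illustrative format only)."""
    parts = [size, mtime]
    if inode is not None:
        parts.insert(0, inode)  # the inode is specific to one filesystem
    return '"' + "-".join(format(p, "x") for p in parts) + '"'


# Same file content deployed to two backends: size and mtime agree, inodes do not.
backend_a = file_etag(mtime=1700000000, size=2048, inode=111)
backend_b = file_etag(mtime=1700000000, size=2048, inode=222)
print(backend_a == backend_b)  # the tags differ, so If-None-Match never matches

print(file_etag(1700000000, 2048) == file_etag(1700000000, 2048))
```

This only works if deployments preserve modification times across backends; a content hash would be an alternative that avoids that dependency.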

The Vary header: cache keys with care

A cache maps a stored response to a request primarily via the request method and the URL. When that is not enough, because the server delivers different versions depending on request headers, the Vary header names the decisive headers. This makes sense for Vary: Accept-Encoding (compressed and uncompressed versions) or Vary: Accept-Language.

It becomes dangerous when highly variable headers end up in the key. Vary: Cookie or Vary: User-Agent shatter the cache into countless variants, so the hit rate drops towards zero and the cache loses its purpose. Vary: *, finally, signals that the response depends on factors outside the request headers; a stored response with Vary: * never matches a later request, which makes it effectively non-reusable.
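The effect of Vary on the cache key can be sketched as follows. Real caches typically look up the URL first and then compare the headers named in Vary against the stored request; folding everything into one tuple is a simplification, and `cache_key` is a hypothetical helper.

```python
from typing import Optional


def cache_key(method: str, url: str, request_headers: dict, vary: str) -> Optional[tuple]:
    """Build a lookup key from method, URL and the request headers named in Vary."""
    if vary.strip() == "*":
        return None  # never matches a later request
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    selected = tuple((name, lowered.get(name, "")) for name in names)
    return (method.upper(), url, selected)


url = "https://example.com/app.css"
gzip = cache_key("GET", url, {"Accept-Encoding": "gzip", "User-Agent": "A"}, "Accept-Encoding")
also_gzip = cache_key("GET", url, {"Accept-Encoding": "gzip", "User-Agent": "B"}, "Accept-Encoding")
by_agent = cache_key("GET", url, {"Accept-Encoding": "gzip", "User-Agent": "B"}, "User-Agent")

print(gzip == also_gzip)  # User-Agent is not part of the key here
print(gzip == by_agent)   # with Vary: User-Agent every browser string gets its own entry
```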

Modern directives for speed and resilience

Two extensions from RFC 5861 improve perceived speed and robustness:

stale-while-revalidate=<seconds> lets the cache serve a just-expired response immediately and perform the check in the background. The user does not wait for the origin; the next request receives the refreshed version. RFC 5861 defines it as a response directive only; caches and browsers that do not support it simply ignore it and apply the normal max-age rules, so nothing breaks.
stale-if-error=<seconds> allows a stale response to be served in an error case, when the origin answers with 500, 502, 503 or 504, or the error is generated locally. A page thus stays reachable while the backend is struggling.
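Both directives open a window after expiry, and the decision a cache makes can be sketched as a small function. This is a minimal model under stated assumptions: ages are given in seconds, `origin_failed` stands for a failed revalidation attempt (a 500, 502, 503 or 504, or a local error), and the returned labels are invented for this post.

```python
def decide(age: int, max_age: int, swr: int = 0, sie: int = 0,
           origin_failed: bool = False) -> str:
    """Decide how to answer a request from a stored response."""
    if age < max_age:
        return "serve-fresh"
    staleness = age - max_age  # seconds since the response expired
    if origin_failed:
        # stale-if-error: tolerate a stale copy while the backend is struggling.
        return "serve-stale-on-error" if staleness <= sie else "error"
    if staleness <= swr:
        # stale-while-revalidate: answer at once, refresh in the background.
        return "serve-stale-revalidate-in-background"
    return "revalidate-first"


# Cache-Control: max-age=600, stale-while-revalidate=60, stale-if-error=86400
print(decide(age=620, max_age=600, swr=60, sie=86400))
print(decide(age=700, max_age=600, swr=60, sie=86400))
print(decide(age=700, max_age=600, swr=60, sie=86400, origin_failed=True))
```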

A practical strategy

From all of this follows a proven pattern that keys on the kind of content.

Fingerprinted static files (CSS, JavaScript and images whose filename contains a hash of the content, such as app.9f3c1a.js) can be cached for the maximum time without concern, because any change alters the filename, so clients automatically request a new URL:

Cache-Control: public, max-age=31536000, immutable

HTML documents, by contrast, change under a constant address. Here it works well to let the response be stored but check it before every use — combined with an ETag, in the unchanged case this costs only a slim 304 response instead of the full page:

Cache-Control: no-cache
ETag: "a1b2c3"

Protected, personalized responses never belong in a shared cache. For strictly confidential data, no-store is the right choice:

Cache-Control: no-store

For user-specific content that the browser may still cache, private is enough:

Cache-Control: private
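The pattern can be condensed into a single lookup. The sketch below is one possible arrangement, not a universal rule: the fingerprint regex (at least six hex characters before a known extension), the extension list and the two boolean flags are assumptions chosen for this example.

```python
import re

# Matches names such as app.9f3c1a.js: a hex hash between two dots before the extension.
FINGERPRINT = re.compile(r"\.[0-9a-f]{6,}\.(?:css|js|png|jpg|svg|woff2)$")


def cache_control_for(path: str, personalized: bool = False,
                      confidential: bool = False) -> str:
    """Pick a Cache-Control value based on the kind of content."""
    if confidential:
        return "no-store"  # never stored anywhere
    if personalized:
        return "private"   # browser cache only
    if FINGERPRINT.search(path):
        return "public, max-age=31536000, immutable"  # new content gets a new URL
    return "no-cache"      # stored, but revalidated before every reuse


for path in ("/static/app.9f3c1a.js", "/index.html"):
    print(path, "->", cache_control_for(path))
```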

Two principles round this off. First, the old wisdom holds that cache invalidation is one of the hard problems in computing; the fingerprint in the filename sidesteps it elegantly for static files, because new content simply gets a new address. Second, security: private only prevents storage in shared caches and replaces neither encryption nor access control; genuinely sensitive responses should carry no-store.

How we handle this

Correctly set cache headers are unspectacular, but they make a noticeable difference to load time and server load. In our managed hosting we configure these layers — reverse proxy, header rules, and the separation of long-lived assets from always-fresh HTML — as part of the setup, so that sites stay fast without ever serving stale content.

Sources: RFC 9111 — HTTP Caching · RFC 9110 — HTTP Semantics · RFC 5861 — stale-while-revalidate / stale-if-error · MDN — HTTP Caching · MDN — Cache-Control · web.dev — HTTP caching
