Common max-age Pitfalls When Working with Drupal's Page Cache

Learn how to display time-based content on your site without throwing Drupal’s page cache out the window.

If you build sites with Drupal, you’ve probably heard at some point that Drupal 8’s caching system is great. So you probably think that it shouldn’t be a big deal to have a time-based piece of content in a Drupal page that's visited by millions of visitors, and still have a reasonable caching system behind it, right? Let’s figure it out in this article.

During a recent project, we encountered the need to display some information retrieved from an external API in a Drupal page. Our Drupal site just had to fetch the most recent info from the remote system and render this data on the page.

For some more context, the site I'm referring to is for a Georgia state government agency that has several customer center offices. When users visit an office page, this little block is intended to tell them 1) the current wait time at that specific location, and 2) the current wait times at nearby offices.

Snapshot of a wait times block from a location page on dds.georgia.gov

As you can imagine, this information is very dynamic, and we must ensure that what we display on the Drupal site is always the most up-to-date information we receive from the API. Well, maybe a 1 min difference would be acceptable since that's the smallest interval we display on the site.

The moment we start talking about time-based content dependencies, probably the first thing that comes to our minds is the max-age property of Drupal’s cache.

(✋ From now on, I will assume you have some basic level of understanding of Drupal's caching system and its relevant parts. If that's not the case, make sure you glance over the official documentation or watch one of the numerous videos about it, for example, this excellent presentation by Wim Leers at DrupalCon Vienna.)

The approach that seems most straightforward for us at this point is to create a custom block plugin and define its max-age to be the interval we want to keep the information cached in Drupal (one minute). Since block plugins implement \Drupal\Core\Cache\CacheableDependencyInterface, all we need to do seems to be to create this handy method ::getCacheMaxAge() in our block and we should be done!

namespace Drupal\foo\Plugin\Block;

use Drupal\Core\Block\BlockBase;

/**
 * Provides a block with time-sensitive content.
 *
 * @Block(
 *   id = "foo",
 *   admin_label = @Translation("Foo block"),
 *   category = @Translation("Foo custom blocks"),
 *   context = {
 *     "node" = @ContextDefinition("entity:node", required = TRUE, label = @Translation("Node"))
 *   }
 * )
 */
class FooBlock extends BlockBase {

  /**
   * {@inheritdoc}
   */
  public function getCacheMaxAge() {
    // Cache this for 1 minute.
    return 60;
  }

  /**
   * {@inheritdoc}
   */
  public function build() {
    $build = [];
    
    // Retrieve our remote data, and populate the $build render array.
    return $build;
  }

}

After rebuilding our caches so Drupal knows about our block and placing the block on the page, we are ready to check if the 60s is now reflected on the response headers:

Snapshot of page headers

The first opportunity for 🤔. Why is max-age=31536000 if we deliberately set it to 60 in our block?

It turns out, as of today, that Drupal doesn't automatically reflect bubbled up render array max-ages to the response header max-age. This is the main point of this article. There is an issue about it on drupal.org that you can follow if you are interested in this topic and haven't already done so.

Since that issue is rather big, here's a summary of some important aspects of it below.

First of all, the max-age=31536000 we see in the example above is what was defined in "Configuration" -> "Development" -> "Performance": Cache maximum age (in other words, the config value system.performance:cache.page.max_age). Drupal will use that system-wide max-age for all anonymous responses if we don't do anything to change that.

(For the record, if you are curious, in this particular case, we are using a very large site-wide max-age because we are using the purge module to invalidate page cache, and this is the recommended max-age for configurations such as this one.)

How do we change the max-age in the response sent to the browser? We want 60s for this particular page.

There are some possible approaches. The one we will use is similar to @Berdir's comment on drupal.org because we want to be very precise about which pages we affect and those we don't. An alternative approach could be to explore the Cache-Control Override module, but there are some limitations to how it handles things (more on this below).

Our goal is to have a way to say to Drupal, "On this page, please use the bubbled up max-age from the render cache in the response's headers." We do that by implementing an event subscriber service such as:

namespace Drupal\foo\EventSubscriber;

use Drupal\Component\Datetime\TimeInterface;
use Drupal\Core\Cache\Cache;
use Drupal\Core\Cache\CacheableResponseInterface;
use Drupal\Core\Session\AccountInterface;
use Symfony\Component\EventDispatcher\EventSubscriberInterface;
use Symfony\Component\HttpKernel\Event\FilterResponseEvent;
use Symfony\Component\HttpKernel\KernelEvents;

/**
 * Page response subscriber to set appropriate headers on anonymous requests.
 */
class FooCacheResponseSubscriber implements EventSubscriberInterface {

  /**
   * The time service.
   *
   * @var \Drupal\Component\Datetime\TimeInterface
   */
  protected $time;

  /**
   * The current user.
   *
   * @var \Drupal\Core\Session\AccountInterface
   */
  protected $user;

  /**
   * Class constructor.
   *
   * @param \Drupal\Component\Datetime\TimeInterface $time
   *   The Time service.
   * @param \Drupal\Core\Session\AccountInterface $user
   *   Current user.
   */
  public function __construct(TimeInterface $time, AccountInterface $user) {
    $this->time = $time;
    $this->user = $user;
  }

  /**
   * {@inheritdoc}
   */
  public static function getSubscribedEvents() {
    $events[KernelEvents::RESPONSE][] = ['onResponse'];
    return $events;
  }

  /**
   * Sets Expires and max-age for bubbled-up max-age values that are not permanent.
   *
   * @param \Symfony\Component\HttpKernel\Event\FilterResponseEvent $event
   *   The response event.
   *
   * @throws \Exception
   *   Thrown when \DateTime() cannot create a new date object from the
   *   arguments passed in.
   */
  public function onResponse(FilterResponseEvent $event) {
    // Don't bother proceeding on sub-requests.
    if (!$event->isMasterRequest()) {
      return;
    }
    $response = $event->getResponse();

    // Nothing to do here if there isn't cacheable metadata available.
    if (!($response instanceof CacheableResponseInterface)) {
      return;
    }

    // Bail out early if this isn't an anonymous request.
    if ($this->user->isAuthenticated()) {
      return;
    }

    // Do some other crazy business logic, if necessary.

    $max_age = (int) $response->getCacheableMetadata()->getCacheMaxAge();
    if ($max_age !== Cache::PERMANENT) {
      // Here we do 2 things: 1) we forward the bubbled max-age to the response
      // Cache-Control "max-age" directive (which would otherwise take the
      // site-wide `system.performance:cache.page.max_age` value); and 2) we
      // replicate that into the "Expires" header, which is unfortunately what
      // Drupal's internal page cache will respect. The former is for the outer
      // world (proxies, CDNs, etc), and the latter for our own page cache.
      $response->setMaxAge($max_age);
      $date = new \DateTime('@' . ($this->time->getRequestTime() + $max_age));
      $response->setExpires($date);
    }
  }

}

And its respective service definition in your foo.services.yml file:

  foo.foo_cache_response_subscriber:
    class: Drupal\foo\EventSubscriber\FooCacheResponseSubscriber
    arguments: ['@datetime.time', '@current_user']
    tags:
      - { name: event_subscriber }

Note that this service is an event subscriber because of the event_subscriber tag on its definition, and it will listen to the KernelEvents::RESPONSE event, and when that happens, we will execute the ::onResponse() method.

All the code above does is take every max-age value from the rendering system (dynamic cache) that isn't permanent (Cache::PERMANENT, i.e. -1) and use it in the response headers. The advantage of doing it in a custom service is that we have very fine-grained control over where we do that and where we don't.

Worth noting: we are setting both the Cache-Control max-age directive and the Expires header here. Both are needed for the page cache and external proxy systems to discard cached versions appropriately. The Cache-Control Override module currently sets only the former. There's work being done to fix that in this issue.
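To make the relationship between the two headers concrete, here is a minimal standalone sketch in plain PHP (no Drupal or Symfony needed). The cache_headers() helper is hypothetical, made up for this illustration. It only approximates what the subscriber's setMaxAge() and setExpires() calls end up putting on the response.

```php
<?php

/**
 * Builds the two caching headers for a given max-age (in seconds).
 *
 * Hypothetical helper for illustration; not part of Drupal or Symfony.
 */
function cache_headers(int $max_age, int $request_time): array {
  return [
    // Relative lifetime, read by browsers, proxies and CDNs.
    'Cache-Control' => 'max-age=' . $max_age,
    // Absolute expiry time; HTTP dates are always expressed in GMT.
    'Expires' => gmdate('D, d M Y H:i:s', $request_time + $max_age) . ' GMT',
  ];
}

// One minute from "now", like our wait times block.
foreach (cache_headers(60, time()) as $name => $value) {
  echo $name, ': ', $value, PHP_EOL;
}
```

Both values are derived from the same bubbled max-age, so the two headers cannot drift apart.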

So, if everything works, we now expect to see the block max-age of 60s in our page headers, right?

Well, that's not what happens. If we put a breakpoint at the line:

$max_age = (int) $response->getCacheableMetadata()->getCacheMaxAge();

We observe that $max_age is 0 for the page where we have placed our block. 

Trying to understand what's going on under the hood, we realize that the max-age value that ends up in the response cacheable metadata object is the result of merging all bubbled up max-ages from all render arrays present on the page. We can see that in this process, \Drupal\Core\Cache\Cache::mergeMaxAges() keeps the smaller value when merging two values together.

  /**
   * Merges max-age values (expressed in seconds), finds the lowest max-age.
   *
   * Ensures infinite max-age (Cache::PERMANENT) is taken into account.
   *
   * @param int $a
   *   Max age value to merge.
   * @param int $b
   *   Max age value to merge.
   *
   * @return int
   *   The minimum max-age value.
   */
  public static function mergeMaxAges($a = Cache::PERMANENT, $b = Cache::PERMANENT) {
    // If one of the values is Cache::PERMANENT, return the other value.
    if ($a === Cache::PERMANENT) {
      return $b;
    }
    if ($b === Cache::PERMANENT) {
      return $a;
    }

    // If none of the values are Cache::PERMANENT, return the minimum value.
    return min($a, $b);
  }

In other words, the max-age that results from bubbling/merging cache metadata from several render arrays will be the smallest of all the max-ages, which makes sense, but doesn't help us in getting what we want. The question that comes to our mind at this point is then: "Would there be a render array on this page setting its cache max-age metadata to zero?" That is exactly what's happening.
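To see how a single zero wins, here is a small standalone sketch in plain PHP. It copies the merge logic quoted above into a free function so it runs without Drupal. The CACHE_PERMANENT constant, the bubble_max_ages() helper and the element names are all made up for this illustration.

```php
<?php

// Stand-in for \Drupal\Core\Cache\Cache::PERMANENT so this runs without Drupal.
const CACHE_PERMANENT = -1;

/**
 * Mirrors the two-argument merge logic quoted above.
 */
function merge_max_ages(int $a = CACHE_PERMANENT, int $b = CACHE_PERMANENT): int {
  if ($a === CACHE_PERMANENT) {
    return $b;
  }
  if ($b === CACHE_PERMANENT) {
    return $a;
  }
  return min($a, $b);
}

/**
 * Folds the max-ages of every element on a page into a single value.
 */
function bubble_max_ages(array $max_ages): int {
  return array_reduce($max_ages, 'merge_max_ages', CACHE_PERMANENT);
}

// Hypothetical page: a permanent header, our 60s block, and one stray 0.
$page = [
  'header' => CACHE_PERMANENT,
  'wait_times_block' => 60,
  'admin_block' => 0,
];
echo bubble_max_ages($page), PHP_EOL; // prints 0
```

Remove the stray 0 from the list and the result becomes 60, which is the value we were hoping to see in the first place.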

Finding them might be somewhat tricky, depending on your project. We will start by going to places we know the problem exists (because we wrote it!).

If you have some experience writing Drupal 8 code, you have probably seen render arrays with:

// Don’t cache this.
$build['#cache']['max-age'] = 0;

The developer thought the above code was an easy shortcut to mean, "Hey Drupal, don't bother caching this piece of information." As we see now, this is usually a bad practice. While it does tell the dynamic cache that it shouldn't cache that bit of rendered markup, we can't guarantee that the developer also intended to prevent the whole page where this appears from being cached for anonymous users. That is precisely what happens if we propagate the bubbled up max-age cache metadata to the response headers.

That is quite a common habit, and in most cases, it has a relatively easy alternative to achieve the same. For instance, in an administration block, that was visible only to admins, we had:

  /**
   * {@inheritdoc}
   */
  public function getCacheMaxAge() {
    // No need to cache this block.
    return 0;
  }

In this case, the developer should have thought about why this supposedly shouldn't be cached and sought alternative solutions. For example, we know the block contents here vary per user role and page. This doesn't mean the block shouldn't be cached, but that Drupal should cache each combination of user role + path separately, which we get by declaring the block's cache contexts as:

  /**
   * {@inheritdoc}
   */
  public function getCacheContexts() {
    return Cache::mergeContexts(parent::getCacheContexts(), [
      'user.roles',
      'url.path',
    ]);
  }
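As a rough mental model of why contexts solve this, think of each context value as becoming part of the cache ID, so every role + path combination gets its own entry. The sketch below is plain PHP and only an illustration: cache_id() is a hypothetical helper, and the ID format is loosely modeled on what cache IDs look like, not Drupal's actual implementation.

```php
<?php

/**
 * Builds an illustrative cache ID from base keys plus context values.
 *
 * Hypothetical helper; Drupal's real cache ID generation differs in detail.
 */
function cache_id(array $keys, array $context_values): string {
  // Sort so the same contexts always yield the same ID, whatever the order.
  ksort($context_values);
  $parts = $keys;
  foreach ($context_values as $context => $value) {
    $parts[] = '[' . $context . ']=' . $value;
  }
  return implode(':', $parts);
}

// Same block, same path, two different roles: two separate cache entries.
echo cache_id(['block', 'foo'], ['user.roles' => 'administrator', 'url.path' => '/offices/atlanta']), PHP_EOL;
echo cache_id(['block', 'foo'], ['user.roles' => 'editor', 'url.path' => '/offices/atlanta']), PHP_EOL;
```

The block stays cacheable, and nobody sees markup that was rendered for a different role or path.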

You should keep looking for places where max-age is being set to 0. The odds are that the intention behind it wasn't to make the whole page uncacheable, and cache contexts and/or cache tags should be used instead. Even Drupal core does that. In \Drupal\book\Plugin\Block\BookNavigationBlock, we see:

  /**
   * {@inheritdoc}
   *
   * @todo Make cacheable in https://www.drupal.org/node/2483181
   */
  public function getCacheMaxAge() {
    return 0;
  }

which is something being worked on in this issue. If you use the book navigation block and are seeing a zero max-age, you probably need the patch from that issue.

If you checked the most obvious suspects and are still seeing max-age = 0 in the response object, it's probably time to roll up the sleeves and dig deeper. Unfortunately, there isn't an easy mechanism to debug this. Check some of the DX-related issues linked from here and here. Some of them refer to improvements to this, although there isn't an ultimate solution yet. In our case, we will try to figure it out by some more archaic means.

If we inspect $response->getCacheableMetadata(), we see a whole bunch of tags and contexts in the cacheability metadata for this page. Not all tags or contexts here imply a set max-age, but with a bit of luck, some of the likely offenders might be here. Look for places where these tags or contexts are defined, and chances are you might also find the problematic max-age = 0 at some point. Good luck!

A snapshot of PHPStorm showing cache tags from a render array

 

Another approach is to just set a conditional breakpoint in your IDE on the Cache::mergeMaxAges() return value so you can stop as soon as you find the first max-age=0 being defined. You can then inspect the stack trace and hopefully get to the render array/cacheable object that was sending 0 to the merge function.

Snapshot of a conditional breakpoint in PHPStorm
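If you prefer something you can paste into a debugging session, here is a minimal sketch of a recursive walker that reports where a render array declares max-age = 0. It is plain PHP and find_zero_max_age() is a hypothetical helper written for this article. It assumes the metadata is already present in the array as '#cache' => ['max-age' => ...], so it will miss anything contributed later by lazy builders, #pre_render callbacks or cacheable objects.

```php
<?php

/**
 * Returns the paths of all elements whose #cache max-age is 0.
 *
 * Hypothetical debugging helper; only inspects what is already in the array.
 */
function find_zero_max_age(array $element, string $path = 'page'): array {
  $hits = [];
  if (isset($element['#cache']['max-age']) && (int) $element['#cache']['max-age'] === 0) {
    $hits[] = $path;
  }
  foreach ($element as $key => $child) {
    // Keys starting with "#" are properties; everything else is a child.
    $is_property = is_string($key) && strncmp($key, '#', 1) === 0;
    if (is_array($child) && !$is_property) {
      $hits = array_merge($hits, find_zero_max_age($child, $path . '/' . $key));
    }
  }
  return $hits;
}

// Hypothetical render array with one offender.
$build = [
  'content' => [
    'wait_times' => ['#cache' => ['max-age' => 60]],
    'admin_tools' => ['#cache' => ['max-age' => 0]],
  ],
];
print_r(find_zero_max_age($build));
```

For this sample array, the only path reported is page/content/admin_tools.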

 

EDIT: Thanks to the contribution of Florent Torregrosa, I now know that the Renderviz module exists, which I wish I had known about way earlier! Thanks for the tip. I hope this tool helps in debugging cacheability metadata more easily. (Also, obviously thanks to Wim Leers for writing it!)

If that's still too much to figure out

A reasonable simplification, if it suits your project, is to assume that "no render array on the page will ever be able to make the entire page uncacheable for anonymous users by using the max-age metadata." If that's the case, you can make your event subscriber propagate only max-age values that are > 0. When the bubbled max-age is 0, our code will do nothing, and Drupal will fall back to using the site-wide max-age value (i.e. system.performance:cache.page.max_age). Doing this saves you the trouble of identifying all the places in code that might be defining max-age = 0 in render arrays, and is likely to have no drawbacks in most projects. For the record, this is another difference from the contrib Cache-Control Override module approach. It would look something like this:

...
    $max_age = (int) $response->getCacheableMetadata()->getCacheMaxAge();
    if ($max_age > 0) {
      $response->setMaxAge($max_age);
      $date = new \DateTime('@' . ($this->time->getRequestTime() + $max_age));
      $response->setExpires($date);
    }
  }
...
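Put differently, the decision this variant makes can be boiled down to one line. The sketch below is standalone PHP, and effective_max_age() is a hypothetical name used only to illustrate the fallback. In the real subscriber there is no such function: we simply leave the response untouched and Drupal applies the site-wide value itself.

```php
<?php

/**
 * Picks the max-age that ends up in the response headers.
 *
 * Hypothetical helper: positive bubbled values win, anything else
 * (0 or permanent, i.e. -1) falls back to the site-wide setting.
 */
function effective_max_age(int $bubbled_max_age, int $site_wide_max_age): int {
  return $bubbled_max_age > 0 ? $bubbled_max_age : $site_wide_max_age;
}

// Site-wide value from our example configuration (one year).
$site_wide = 31536000;

echo effective_max_age(60, $site_wide), PHP_EOL; // prints 60
echo effective_max_age(0, $site_wide), PHP_EOL;  // prints 31536000
```

The trade-off is that a render array can no longer opt the whole page out of anonymous caching through max-age alone, which is exactly the assumption stated above.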

In summary

  • If you have time-based content you want to display on a page with anonymous page cache enabled, make sure you check this issue.
  • Give it a try with a custom event subscriber that propagates bubbled-up max-ages into response headers if you can.
  • Avoid using max-age=0 to skip dynamic cache of render arrays.

Acknowledgments

Many thanks to April Sides, Matt Oliveira, Jenna Tollerson and Florent Torregrosa for their help with this article.

Hero photo by Artem Maltsev on Unsplash
