A portrait of Duncan McClean
04 Jul, 2023 4 min read

Handling Statamic static cache invalidation on large sites

In this article, we walk through the benefits of Statamic's static caching and how we handle invalidating it, which can be crucial on large sites.
We’re heavy users of Statamic’s “static caching” feature on a lot of the sites we build.

Static caching means that the first time a user visits a page, Statamic saves the generated HTML to a file (as we’re using the full caching strategy). From then on, requests for that page are served the HTML file directly instead of going through Laravel/Statamic.
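
To make the idea concrete, here's a minimal sketch of how a request URL could map to a file on disk. This is not Statamic's actual implementation: the staticCachePath function is hypothetical, and the "_" plus query string naming is an assumption inferred from the glob patterns used in the listener later in this article.

```php
<?php

// Hypothetical helper: maps a request URL to the HTML file that would serve it.
// Assumes files are named "<path>_<query string>.html" inside a base directory.
function staticCachePath(string $baseDir, string $url): string
{
    $path = parse_url($url, PHP_URL_PATH) ?: '/';
    $query = parse_url($url, PHP_URL_QUERY) ?: '';

    // Normalise so "/about/" and "/about" resolve to the same file,
    // while the homepage stays as "/".
    $path = '/'.trim($path, '/');

    return rtrim($baseDir, '/').$path.'_'.$query.'.html';
}
```

Under those assumptions, /about/news would be served from about/news_.html, and /about/news?page=2 from about/news_page=2.html, which is why query-string variants need clearing too.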

Cutting out the need to load Laravel/Statamic and have it do the heavy lifting on every page request means page loads should feel rapid for end users.

For example, check out the Suffolk Libraries website. It was the first site we enabled static caching on, and it's where the effect is clearest.

It’s great having static caching enabled. However, you need to be able to clear (or fancier word: invalidate) pages when stuff happens, like content being updated in the Control Panel or when a scheduled blog post goes live.

Without it being invalidated, content or code changes won’t make their way to the live site.

One way to deal with this is to clear the whole static cache after a content update - which works great for smaller sites. However, when you’re working with larger sites with thousands of entries, that doesn’t work so well.

In this article, we’re sharing the methods we use to invalidate & warm the static cache when things change to help everything run smoothly.

Note: All of the code examples in this article are using full measure caching but the same concepts can apply to half measure caching.

Invalidating when content is updated

In the config/statamic/static_caching.php config file, you can specify which paths should be invalidated when entries are created, updated or deleted.

'invalidation' => [

    'class' => null,

    'rules' => [
        'collections' => [
            'events' => [
                'urls' => [
                    '/',
                    '/whats-on',
                    '/visit/locations-and-times/*',
                ],
            ],
            // ...
        ],
    ],

],
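
To illustrate how a wildcard rule like the one above behaves, here's a rough sketch of matching rules against URLs. It's an illustration only, not how Statamic matches rules internally. The ruleMatchesUrl and urlsToInvalidate functions are hypothetical names, and we're assuming "*" matches any run of characters, including slashes.

```php
<?php

// Hypothetical helper: does a URL match an invalidation rule such as "/visit/*"?
// The "*" wildcard is assumed to match any run of characters, including slashes.
function ruleMatchesUrl(string $rule, string $url): bool
{
    // Escape regex characters in the rule, then turn the escaped "*" back into ".*".
    $pattern = str_replace('\*', '.*', preg_quote($rule, '#'));

    return preg_match('#^'.$pattern.'$#', $url) === 1;
}

// Hypothetical helper: filters a list of cached URLs down to those matched by any rule.
function urlsToInvalidate(array $rules, array $cachedUrls): array
{
    return array_values(array_filter($cachedUrls, function (string $url) use ($rules) {
        foreach ($rules as $rule) {
            if (ruleMatchesUrl($rule, $url)) {
                return true;
            }
        }

        return false;
    }));
}

$rules = ['/', '/whats-on', '/visit/locations-and-times/*'];
```

Note that a rule without a wildcard only matches that exact URL, so /whats-on would not cover /whats-on/today in this sketch.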

However, this won’t actually “warm” the pages in the static cache for you. Warming means visiting each page after it has been cleared from the cache, which triggers it to be re-cached.
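
As a rough sketch of what warming involves, the hypothetical warmUrls function below requests each path and reports the ones that failed. The function name and the $fetch callable are our own assumptions for illustration. The HTTP client is passed in so the sketch doesn't depend on any particular library.

```php
<?php

// Hypothetical helper: requests each path so the page is rendered and cached again.
// $fetch is any callable that takes a full URL and returns an HTTP status code.
function warmUrls(string $baseUrl, array $paths, callable $fetch): array
{
    $failed = [];

    foreach ($paths as $path) {
        $status = $fetch(rtrim($baseUrl, '/').'/'.ltrim($path, '/'));

        // Treat anything outside the 2xx range as a failed warm.
        if ($status < 200 || $status >= 300) {
            $failed[] = $path;
        }
    }

    return $failed;
}
```

In a Laravel app, $fetch could be a closure wrapping Http::get($url)->status().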

On one of our sites, we also need the page’s parent page to be invalidated & warmed because the parent loops through all of its children.

To handle all of this, we use a custom event listener called ForceEntryInvalidation which listens out for Statamic’s EntrySaved event.

The listener loops through any $rules that are set up for the collection & also checks to see if there’s a parent entry that needs to be invalidated.

<?php

namespace App\Listeners\StaticCaching;

use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Support\Arr;
use Illuminate\Support\Facades\File;
use Illuminate\Support\Facades\Log;

class ForceEntryInvalidation implements ShouldQueue
{
    use InteractsWithQueue;

    protected $rules = [
        'posts' => [
            'urls' => [
                '/',
                '/about/news',
            ],
        ],
    ];

    protected $deleted = [];

    /**
     * Create the event listener.
     *
     * @return void
     */
    public function __construct()
    {
        //
    }

    /**
     * Handle the event.
     *
     * @param  object  $event
     * @return void
     */
    public function handle($event)
    {
        /** @var \Statamic\Entries\Entry */
        $entry = $event->entry;

        // Delete itself
        if ($uri = $entry->uri()) {
            $this->invalidatePath($uri);
        }

        // If it's a page & it has a parent entry, let's invalidate it & warm it back up again.
        if ($entry->collectionHandle() === 'pages' && $entry->parent()) {
            $this->invalidatePath($entry->parent()->uri());

            $this->visit($entry->parent()->uri());
        }

        // Loop through any rules
        foreach (Arr::get($this->rules, "{$entry->collectionHandle()}.urls", []) as $rule) {
            // Is this a wildcard rule?
            if (str_contains($rule, '*')) {
                $staticCachingPaths = File::glob(public_path('static').$rule.'_.html');

                foreach ($staticCachingPaths as $staticCachingPath) {
                    $this->deleted[] = $staticCachingPath;

                    File::delete($staticCachingPath);
                }

                continue;
            }

            $this->invalidatePath($rule);
        }

        Log::info("Entry {$entry->slug()} saved, paths invalidated", $this->deleted);
    }

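    protected function invalidatePath(string $path = '/')
    {
        // We could have multiple paths (since we need to clear query parameters too..)
        $filePaths = array_merge(
            [public_path('static').$path],
            File::glob(public_path('static').$path.'_*.html'),
        );

        foreach ($filePaths as $filePath) {
            $this->deleted[] = $filePath;
            File::delete($filePath);
        }
    }

    protected function visit(string $path = '/'): void
    {
        // Request the page so it gets rendered & written back to the static cache.
        $response = \Illuminate\Support\Facades\Http::get(config('app.url').$path);

        if ($response->failed()) {
            Log::warning("Failed to warm: {$path}");
        }
    }
}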
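
For the listener to run, it needs to be registered against Statamic’s EntrySaved event. Here's a minimal sketch, assuming the app uses Laravel’s conventional app/Providers/EventServiceProvider.php. Your app may register listeners differently.

```php
<?php

namespace App\Providers;

use App\Listeners\StaticCaching\ForceEntryInvalidation;
use Illuminate\Foundation\Support\Providers\EventServiceProvider as ServiceProvider;
use Statamic\Events\EntrySaved;

class EventServiceProvider extends ServiceProvider
{
    // Run the invalidation listener whenever an entry is saved.
    protected $listen = [
        EntrySaved::class => [
            ForceEntryInvalidation::class,
        ],
    ];
}
```

Because the listener implements ShouldQueue, it's pushed onto the queue, so a queue worker needs to be running (unless the app uses the sync queue driver).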

Invalidating when scheduled entries are due to be published

Currently, when an entry is scheduled to “go live”, Statamic won’t invalidate the entry in the static cache, which means the entry won’t show up until you clear the relevant pages manually.

To work around this, we’ve created a really simple Artisan command which runs daily to check whether any entries are scheduled for the current date. If there are, it dispatches a custom ScheduledEntryPublished event for each one.

<?php

namespace App\Console\Commands;

use App\Events\ScheduledEntryPublished;
use Illuminate\Console\Command;
use Statamic\Facades\Entry;

class ClearCacheForScheduledEntries extends Command
{
    /**
     * The name and signature of the console command.
     *
     * @var string
     */
    protected $signature = 'steadfast:clear-cache-for-scheduled-entries';

    /**
     * The console command description.
     *
     * @var string
     */
    protected $description = 'Clears the static cache when scheduled entries are published.';

    /**
     * Create a new command instance.
     *
     * @return void
     */
    public function __construct()
    {
        parent::__construct();
    }

    /**
     * Execute the console command.
     *
     * @return int
     */
    public function handle()
    {
        $this->info('Clearing the cache for scheduled entries...');

        Entry::query()
            ->whereIn('collection', ['posts', 'recommendations'])
            ->where('date', now()->startOfDay())
            ->get()
            ->each(function ($entry) {
                event(new ScheduledEntryPublished($entry));
            });

        return Command::SUCCESS;
    }
}
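
To have the command run daily, it can be added to Laravel’s scheduler. Here's a sketch, assuming the conventional app/Console/Kernel.php and a run at midnight. The exact time of day is our assumption, so pick whatever suits your publishing schedule.

```php
<?php

namespace App\Console;

use Illuminate\Console\Scheduling\Schedule;
use Illuminate\Foundation\Console\Kernel as ConsoleKernel;

class Kernel extends ConsoleKernel
{
    protected function schedule(Schedule $schedule)
    {
        // Assumed timing: once a day at midnight, so entries dated today get picked up.
        $schedule->command('steadfast:clear-cache-for-scheduled-entries')->daily();
    }
}
```

This relies on the server’s cron calling php artisan schedule:run every minute, as with any scheduled Laravel command.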

We’re listening to this custom event using the same event listener as shown above (ForceEntryInvalidation), so the entry itself is invalidated, alongside any of the paths we’ve configured in the $rules array.
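
The event class itself isn’t shown in this article, so here's a hypothetical sketch of what it could look like. The listener reads $event->entry, so the only real requirement is a public entry property. Everything else here is an assumption.

```php
<?php

namespace App\Events;

// Hypothetical sketch of the custom event. The listener reads $event->entry,
// so the event only needs to expose the entry as a public property.
class ScheduledEntryPublished
{
    public $entry;

    public function __construct($entry)
    {
        $this->entry = $entry;
    }
}
```

It would then be mapped to ForceEntryInvalidation wherever the app registers its event listeners.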

Invalidating paths manually

Whenever we need to clear a few specific pages (maybe after deploying a front-end change that only affects one or two pages), we use the Static Cache Manager addon.

It provides a really simple utility. You enter the paths you wish to be cleared, press the button and they’re cleared.

Rebuilding everything…

Sometimes you just want to clear everything and start from scratch - like when a front-end change is deployed that affects every page on the site or you’ve just moved the site onto a new server.

When we need to do this, we manually clear all files in the public/static directory.
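
If you’d rather script that step, here's a minimal plain-PHP sketch. The clearDirectory function is a hypothetical helper of our own: it removes every file and sub-directory inside a directory while leaving the directory itself in place.

```php
<?php

// Hypothetical helper: deletes every file under a directory, then the emptied
// sub-directories, leaving the top-level directory itself in place.
// Returns the number of files deleted.
function clearDirectory(string $directory): int
{
    if (! is_dir($directory)) {
        return 0;
    }

    $deleted = 0;

    // CHILD_FIRST yields a directory's contents before the directory itself.
    $items = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($directory, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::CHILD_FIRST
    );

    foreach ($items as $item) {
        if ($item->isDir()) {
            rmdir($item->getPathname());
        } else {
            unlink($item->getPathname());
            $deleted++;
        }
    }

    return $deleted;
}
```

Statamic also ships a php please static:clear command, which clears the static cache without needing any custom code.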

Next, we run a command to warm all the pages. On most sites, Statamic’s php please static:warm command will do the job. However, on larger sites, we usually add a custom command which will only warm popular pages & recent articles on the site. We leave the remainder of entries to be cached when they’re visited by end users.

<?php

namespace App\Console\Commands;

use Illuminate\Console\Command;
use Illuminate\Support\Facades\File;
use Illuminate\Support\Facades\Http;
use Statamic\Facades\Entry;

class RebuildStaticCache extends Command
{
    /**
     * The name and signature of the console command.
     *
     * @var string
     */
    protected $signature = 'steadfast:rebuild-static-cache';

    /**
     * The console command description.
     *
     * @var string
     */
    protected $description = 'Rebuilds the static cache by clearing & warming popular pages.';

    protected $staticCachingPath;

    /**
     * Create a new command instance.
     *
     * @return void
     */
    public function __construct()
    {
        parent::__construct();

        $this->staticCachingPath = config('statamic.static_caching.strategies.full.path');
    }

    /**
     * Execute the console command.
     *
     * @return int
     */
    public function handle()
    {
        $this->info('Rebuilding static cache...');

        // Invalidate & warm popular pages
        foreach ($this->popularPages() as $uri) {
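            // Match the page's cached file plus any query string variants (e.g. about_.html, about_page=2.html).
            // A bare "*" suffix would also match unrelated pages, and everything at the top level for "/".
            foreach (File::glob($this->staticCachingPath.$uri.'_*.html') as $file) {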
                File::delete($file);
            }

            $this->visit($uri);
        }

        return Command::SUCCESS;
    }

    protected function popularPages()
    {
        $popularPages = [
            '/',
            '/whats-on',
            '/borrow',
            '/about',
            '/about/news',
            '/about/newsletter',
        ];

        // Get the 20 latest blog posts (collections have take(), not limit())
        $latestPosts = Entry::whereCollection('posts')
            ->sortByDesc(fn ($post) => $post->date()?->timestamp)
            ->take(20)
            ->map(fn ($post) => $post->uri())
            ->filter()
            ->values()
            ->all();

        return array_merge($popularPages, $latestPosts);
    }

    protected function visit(string $url = '/'): void
    {
        $this->info("Generating: {$url}");

        $request = Http::get(config('app.url').$url);

        if ($request->failed()) {
            $this->error("Failed to generate: {$url}");
        }
    }
}
