A Patch-less Composer Workflow for Drupal Using Forks

One of the significant advantages of using free and open-source software is the ability to fix bugs without being dependent on external teams or organizations. As PHP and JavaScript developers, our team deals with in-development bugs and new features daily. So how do we get these changes onto sites before they’re released upstream?

In Drupal 7, there was a fairly standard approach to this. Since we would commit Drupal modules to a site’s git repository, we could directly apply patches to the code and commit them. A variety of approaches came up for keeping track of the applied patches, such as a /patches directory containing copies of each patch or using drush make.

With Drupal 8’s adoption of Composer for site builds, not only did the number of third-party dependencies increase, but a new best practice came along: /vendor (and by extension, /core and /modules/contrib) should not be committed to git. Instead, these are left out of the site repository and installed with composer install. This left applying patches in a tricky place.

I’m sure I wasn’t the only one who searched for “composer patches” and immediately found the composer-patches plugin. My first Drupal 8 work was building a suite of modules and not whole sites, so it was a while until I actually used it day to day. However, when that time came, our team ran into several edge cases. Here are some of them.

Some issues we ran into with patching

❗ Note that we originally did this investigation in June of 2018, so some of these issues may be different or may be fixed today.

Patching of composer.json is tricky

Sometimes, a change to a library or module also requires changes to that project’s composer.json file. Typically this is when a new API is added that the project now requires. For example, we needed to update the kevinrob/guzzle-cache-middleware library in the guzzle_cache module so we could use PHPUnit 6. Since composer-patches can only react to code after it’s installed, it can’t see the updated require line in the module’s composer.json. Even if Composer were modified to detect the change and rebuild the project’s dependency tree, that is a slow process and would slow down composer update even more.

“Just apply the patch” assumes consistent patch formatting

It’s a fact of life that different projects have different patch standards. While the widespread adoption of git has improved things, proper prefix detection is tricky. For example, we ran into a situation where a Drupal core Migrate patch that only added new files was being added to the wrong directory. There was a configuration option we could set to override this, but it was many hours of debugging to figure it out.
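As a minimal sketch of how prefix handling can put new files in the wrong place, the script below builds a throwaway repository, creates a patch without the usual a/ and b/ prefixes, and applies it at two strip levels. The repository layout and file names are hypothetical, and this is only one plausible way such a problem arises, not necessarily the exact cause of the Migrate issue described above.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; every name below is made up for illustration.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q site
cd site
mkdir core
echo 'existing file' > core/README.txt
git add core/README.txt
git commit -q -m "Base"

# Build a patch that only adds a file, written without a/ and b/ prefixes.
echo '<?php // new plugin' > core/NewPlugin.php
git add core/NewPlugin.php
git diff --cached --no-prefix > "$work/new-file.patch"
git reset -q --hard

# The default strip level (-p1) removes "core/" as if it were the prefix,
# so the new file is created at the repository root.
git apply "$work/new-file.patch"
ls NewPlugin.php
rm NewPlugin.php

# Strip level 0 keeps the path intact.
git apply -p0 "$work/new-file.patch"
ls core/NewPlugin.php
```

Because the patch only adds files, there is no existing file whose path would make the wrong strip level fail, which is why the mistake can go unnoticed.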

Patch tools themselves (git apply and patch) don’t have great APIs for programs to use. For example, git apply has had many subtle changes over the years that break expectations when users are running old versions. Trying to script these programs for use across the broad spectrum of systems is nearly impossible. There’s an open issue to rewrite patching in PHP for composer-patches, which is a big undertaking.

Patching on top of patches

Some sites may require multiple patches to the same module or library. For example, on a recent Drupal 8 site we launched, we floated between 5 and 10 patches to Drupal core itself (with the actual set of patches changing over time). If patches conflict with each other, resolving the conflicts is a lot of work. You have to apply one patch, and then apply each successive patch, resolving conflicts at each step. Then, a new patch has to be generated and included in composer.json. When an upstream release occurs, or a patch is merged, the whole process starts over. This can be especially tricky in situations where a security advisory has been published, and you’re under time pressure to get a release out the door.

Faith in rm -rf vendor web/core web/modules/contrib

When patches fail to apply correctly, it can sometimes leave local vendor code in a broken state. For a long time we had an rm in our CI build scripts even though we were trying to improve build performance by caching vendor code. Several times a week developers on our team would report having to do the same on their locals. On the one hand, at least this was an option to get things up and running. However, it was concerning that this was our solution so much of the time when we couldn’t quickly find a root cause.

You still have to fork some projects

By default, applying patches is a root-only configuration in composer.json. That means that if a Drupal module requires a specific patch in another library, it won’t be applied. It’s possible to enable patching of dependencies, but it still requires the developer to be aware that the patches are required. If you’re maintaining a module or library used by different teams, it’s much less of a support burden to fork the patched project.

If you have an existing code base, and you haven’t hit any of these issues, then it’s fine to stick with what you have. We have some teams who haven’t had any trouble with patches. But, if the above issues are familiar, read on!

What does upstream use?

After encountering the above issues, we took a step back and looked beyond the Drupal island. Certainly, we weren’t alone. What are other composer-managed PHP projects doing to solve this problem?

After some research into what PHP projects outside of Drupal and npm projects do, we found that most use a forking model instead of a patching model.

I was curious about patch-focused solutions for other code dependency managers. I found patch-package for npm, which is the most popular package for applying patches during npm builds. Looking at their issue queue, it seemed like that entirely independent project experienced similar classes of bugs (#11, #49, #96) as composer-patches. For example, there are bugs related to patches applying differently based on the host environment, handling multiple patches to a single package, or applying patches in libraries and not root projects.

I’m in no way trying to say that these issues can’t be fixed, but it does provide some evidence that my prior troubles with composer-patches were more due to the complex nature of the problem than the specific implementation. As someone who routinely works on projects where a code base is handed off to a client team for maintenance, solving bugs with less code (in this case, by removing a plugin) is almost always the best solution.

The Steps for Patching and Forking

Let’s suppose that we need to patch and fork the Pathauto module.

  1. Fork Pathauto to your user or organization. In most cases, forking to a public repository is fine. If it's a Drupal module, you can import the repository to GitHub using their import tool. Any other Git code host will work too.
  2. Clone from the fork you created and check out the tag you are patching against.
  3. Create a branch called patched.
  4. Apply the patch there, and file and merge a pull request to your patched branch with the changes.
    1. Optionally, you can create tags like 1.2-patch1, incrementing patch1 each time you change the patch set against a given tag.
  5. In your site, add the newly forked repository to composer.json:
    "repositories": [
        {
            "type": "vcs",
            "url": "https://github.com/lullabot/drupal-pathauto"
        },
        {
            "type": "composer",
            "url": "https://packages.drupal.org/8"
        }
    ]
    
  6. Use the new branch or tag in the require line for the project you are patching, such as dev-patched or ^1.2-patch1.
  7. Run composer update drupal/pathauto and you will get the new version.

We like to create a PATCHES.md file in each fork to keep track of the current patches, but it’s not required.
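Steps 2 through 4 can be sketched as a script. To keep it runnable offline, a local throwaway repository stands in for the upstream project, and the patch is generated on the spot rather than downloaded from an issue. The file contents, the fix.patch name, and the commit messages are assumptions for illustration only.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the upstream project with a 1.2 release tag.
git init -q upstream
cd upstream
printf 'line one\nline two\n' > pathauto.module
git add pathauto.module
git commit -q -m "Release 1.2"
git tag 1.2

# Stand-in for a patch file taken from an issue queue.
printf 'line one\nline two, patched\n' > pathauto.module
git diff > "$work/fix.patch"
git checkout -q -- pathauto.module
cd "$work"

# Step 2: clone the fork and start from the tag being patched.
git clone -q upstream fork
cd fork

# Step 3: create the patched branch.
git checkout -q -b patched 1.2

# Step 4: apply and commit the patch (on a real fork, via a pull request).
git apply "$work/fix.patch"
git commit -q -a -m "Apply fix.patch"

# Optional: tag this patch set so the site can require a fixed version.
git tag 1.2-patch1

# Show what the fork changes relative to the upstream tag.
git diff --stat 1.2 patched
```

On a real fork you would push the patched branch and the tag to your own remote before pointing the site's composer.json at it.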

Rerolling your patch

If you haven’t already, you’ll need to add a second remote to your local git checkout. For example, if you imported Pathauto to GitHub, you’ll need to add the Drupal.org git repository with git remote add drupal https://git.drupalcode.org/project/pathauto.git. Then, to fetch new updates, run git fetch --tags drupal.

In most cases, you can simply run git merge <new pathauto tag> into your patched branch. The patch(es) previously applied will be left as-is. If there are conflicts, you can use your normal merge resolution tools to resolve them, and then update the issue with the new patch.
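The reroll can be tried end to end with the sketch below. It uses local throwaway repositories, so the drupal remote points at a directory rather than at git.drupalcode.org, and the 1.3 release, file names, and contents are hypothetical.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the upstream project at release 1.2.
git init -q upstream
cd upstream
printf 'alpha\nbeta\n' > pathauto.module
git add pathauto.module
git commit -q -m "Release 1.2"
git tag 1.2
cd "$work"

# The fork, with one patch applied on the patched branch.
git clone -q upstream fork
cd fork
git checkout -q -b patched 1.2
printf 'alpha (patched)\nbeta\n' > pathauto.module
git commit -q -a -m "Apply patch"

# Upstream ships a new release that touches a different file.
cd "$work/upstream"
echo 'install hooks' > pathauto.install
git add pathauto.install
git commit -q -m "Release 1.3"
git tag 1.3

# Reroll: add the upstream remote, fetch its tags, merge the new tag.
cd "$work/fork"
git remote add drupal "$work/upstream"
git fetch -q --tags drupal
git merge -q -m "Merge 1.3 into patched" 1.3

# Only the patch should remain as a difference from 1.3.
git diff --stat 1.3
```

If the new release had touched the same lines as the patch, the merge would stop with conflicts to resolve instead of completing on its own.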

Removing your patch

If your patch has been merged, and it was the only change you’d made to the project, you could simply abandon the fork and remove it from your site’s composer.json.

Otherwise, in most cases you can simply merge the new tag into your patched branch and git will detect that the changed files now have identical contents. You can double check by running git diff <tag> to see exactly what changes your fork has compared to a given version.
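The following sketch shows that check under simplified assumptions: upstream release 1.3 contains exactly the same change as the fork's patch, and both repositories are local throwaway stand-ins with hypothetical contents.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the upstream project at release 1.2.
git init -q upstream
cd upstream
printf 'alpha\nbeta\n' > pathauto.module
git add pathauto.module
git commit -q -m "Release 1.2"
git tag 1.2
cd "$work"

# The fork carries the fix as a patch on the patched branch.
git clone -q upstream fork
cd fork
git checkout -q -b patched 1.2
printf 'alpha (fixed)\nbeta\n' > pathauto.module
git commit -q -a -m "Apply patch"

# Upstream merges the same fix and releases 1.3.
cd "$work/upstream"
printf 'alpha (fixed)\nbeta\n' > pathauto.module
git commit -q -a -m "Release 1.3 with the fix"
git tag 1.3

# Merge the release; identical changes on both sides merge cleanly.
cd "$work/fork"
git fetch -q --tags origin
git merge -q -m "Merge 1.3 into patched" 1.3

# An empty diff means the fork no longer differs from the release.
if git diff --quiet 1.3; then
  echo "patched matches 1.3; the fork can be retired"
else
  git diff --stat 1.3
fi
```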

Handling Drupal Core

There are two small cases with Drupal to be aware of.

First, the Drupal Composer project has a Composer script that downloads index.php, robots.txt, and other “scaffolding” files when you change Drupal core releases. This script looks up the release by its version number. For example, if you create an 8.7.0-patch1 tag in your fork, the scaffolding script will look for that tag on Drupal.org, where it won’t exist.

The best way to handle this is with Composer inline aliases, specifying in require:

"drupal/core": "8.7.0-patch1 as 8.7.0"

Unfortunately, drupal-scaffold doesn’t check for inline aliases, but there’s an open pull request adding that support. Or, you can ignore the errors and manually download the files when updating Drupal.

A slightly trickier issue comes from how Drupal core handles release branches. Most other projects using Git, such as Symfony, will forward-merge changes from older versions to newer versions. For example, with Symfony you could check out the 3.4 branch, and merge in 4.1 without any conflicts.

Drupal treats its branches as independent due to the patch-based workflow. A given bug may be fixed in both 8.6 and 8.7, but the actual changes might conflict with each other. For example, merging 8.7.0 into 8.6.15 leads to dozens of conflicts.

Luckily, git has tools for this situation.

$ git checkout 8.7.0 # Check out the new version of Drupal.
$ git checkout -b update-core # Create a branch.
$ git merge -s ours 8.6.15 # Merge branches, ignoring all changes from 8.6.15. The code on disk is identical to 8.7.0.
$ git merge patched # Merge just the changes left over from our patched branch into 8.7.0. Only patch-related conflicts remain.

Once the transition to GitLab on Drupal.org is complete, and we start using git merges in Drupal workflows, these extra steps should go away.
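To see why the "ours" merge helps, here is a small simulation of the core update in a throwaway repository. Two release branches carry conflicting versions of the same fix, and a patched branch sits on top of 8.6.15. The file names and contents are hypothetical; only the tag and branch names follow the example above.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q core
cd core
echo 'base' > core.txt
git add core.txt
git commit -q -m "Base"
base=$(git rev-parse HEAD)

# Independent release branches with conflicting versions of one fix.
git checkout -q -b 8.6.x
echo 'fix as committed to 8.6' > core.txt
git commit -q -a -m "Fix on 8.6"
git tag 8.6.15

git checkout -q -b 8.7.x "$base"
echo 'fix as committed to 8.7' > core.txt
git commit -q -a -m "Fix on 8.7"
git tag 8.7.0

# Our fork: one patch on top of 8.6.15.
git checkout -q -b patched 8.6.15
echo 'our patch' > patched.txt
git add patched.txt
git commit -q -m "Apply patch"

# The update itself.
git checkout -q 8.7.0
git checkout -q -b update-core
# Record 8.6.15 as merged while keeping the 8.7.0 tree untouched.
git merge -q -s ours -m "Mark 8.6.15 as merged" 8.6.15
# The merge base is now 8.6.15, so only the patch is brought over.
git merge -q -m "Merge patched into 8.7.0" patched

# Only the patch should differ from the 8.7.0 release.
git diff --stat 8.7.0
```

Without the "ours" merge, the second merge would also try to reconcile the two versions of core.txt and stop on a conflict unrelated to the patch.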

Transitioning an existing project

When we moved to this workflow, we didn’t want to update every single patched dependency at once. At first, we just updated Drupal core as it was the project we had the most trouble with, and we wanted to be sure we’d see a real improvement. In fact, the site I first used this workflow on still has a few patches applied with composer-patches. For existing projects, it’s completely reasonable to transition dependencies one at a time as they need to be updated for other reasons.

Results

Over a period of several months, we incrementally replaced patches with forks as we updated dependencies. Over that time, we saw:

  • Fewer git checkouts in vendor. With patches, you need to set "preferred-install": "source" which causes Composer to use git to fetch dependencies. With forks, Composer uses the GitHub API to download a zip of the code, which is significantly faster.

  • A much faster composer install.
  • Easier peer review, because the patch is applied in the forked repository as a pull request.
  • Simplified updating of patched dependencies, as we could use all of git’s built-in tools to help with conflict resolution.
  • Forks completely solved problems our front-end developers had with composer install failing to apply patches correctly.
  • An identical workflow for Drupal modules, PHP Libraries, and node packages.
  • Reduced maintenance by sharing patch sets with multiple projects.
    • For example, we are working on a Drupal 8 project for a client that is their second Drupal 8 site. Rather than track and apply patches individually, we point Composer to the existing Drupal 8 fork from the first site. This significantly reduces update and testing time for new core releases.
  • Simplified checks for dependency updates by running composer outdated.

There were a few downsides:

  • A sometimes slower composer update, especially if dependencies have lots of tags. In practice, a straight composer update on a site takes between one and two minutes.
  • Many more repositories to manage.
  • Simple patch updates with no conflicts are a little more work as you have to push to your forked repository.
  • If your team doesn’t have any private repositories with custom modules as a Composer dependency, team members running composer update may need to generate an API token. Luckily, Composer provides a one-click link to generate one.

Many thanks to James Sansbury, Juampy NR, and Eduardo Telaya for their technical reviews.

Andrew Berry

Andrew Berry is an architect and developer who works at the intersection of business and technology.