The Ten Commandments of a New Drupal 8 Site for Enterprise Developers

by Andrew Berry

Over the past two years, I’ve had the opportunity to work with many different clients on their Drupal 8 site builds. Each of these clients had a large development team with significant amounts of custom code. After a recent launch, I went back and pulled together the common recommendations we made. Here they are!

1. Try to use fewer repositories and projects

With the advent of Composer for Drupal site building, it feels natural to have many small, individual repositories for each custom module and theme. This has the advantage of feeling similar to the contrib workflow for Drupal modules, but there are significant costs to this model that only become obvious as code complexity grows.

The first cost is that, at best, every bit of work requires two pull requests: one in a custom module repository, and a second that updates composer.lock in the site repository. It’s easy to forget about that second pull request, and in our case, it led to constant questions from the QA team about whether a given ticket was ready to test.

A second cost is dealing with cross-repository dependencies. For example, in site implementations, it’s really common to do some work in a custom module and then to theme that work in a custom theme. Even if there’s only a master branch, there would still be three pull requests for this work—and they all have to be merged in the right order. With a single repository, you have a choice. A single pull request can be reviewed and merged, or multiple can be filed.

A third, and truly insidious, cost is when separate repositories actually become co-dependent, and no one knows it. This can happen when modules are only tested in the context of a single site and database, and not as site-independent reusable modules. Is your QA team testing each project against a stock Drupal install as well as within your site? Are they mixing and matching different tags from each repository when testing? If not, it’s better to just have a single site repository.

2. Start with fewer branches, and add more as needed

Sometimes, it feels good to start a new project by creating all of the environment-specific branches you know you’ll need: develop, qa, staging, master, and so on. It’s important to ask yourself: is each branch being used? Do we have environments for all of these branches? If not, it’s totally OK to start with a single master branch. If you do have multiple git repositories, ask this question for each repository independently. Perhaps your site repository has several branches, while the new SSO module that you’re building for multiple sites sticks with just a master branch. Branches should have meaning. If they don’t, then they just confuse developers, QA, and stakeholders, leading to deployment mistakes. Delete them.

3. Avoid parallel projects

Once you do have multiple branches, it’s really important to ensure that branches are eventually merged “upstream.” With Composer, it’s possible to have different composer.json files in each branch, such as qa pointing to the develop branch of each custom module, and staging pointing to master. This causes all sorts of confusion because it effectively means you have two different software products: what QA and developers use, and what site users see. It also means that changes in the project scaffolding have to be done once in each branch. If you forget to do that, it’s nothing but pain trying to figure it out! Instead, use environment branches to represent the state of another branch at a given time, and then tag those branches for production releases. That way, you know that tag 1.3.2 is identical to some build on your develop branch (even if the hash isn’t identical due to merge commits).
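One way to confirm that a release tag really matches a build on another branch is to compare file contents rather than commit hashes. The following is a minimal sketch, not a definitive implementation: the function name same_content is hypothetical, and it assumes git is on the PATH and the script runs inside the site repository.

```python
import subprocess


def same_content(ref_a, ref_b, run=subprocess.run):
    """Return True if two git refs have identical file contents.

    Commit hashes may differ (for example because of merge commits)
    while the checked-out files are exactly the same.
    """
    # `git diff --quiet` exits 0 when there are no differences, 1 when there are.
    return run(["git", "diff", "--quiet", ref_a, ref_b]).returncode == 0
```

For example, same_content("1.3.2", "develop") would be expected to return True right after tagging, and False once develop has moved on.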

4. Treat merge conflicts as an opportunity

I’ve heard from multiple developers that the real reason for individual repositories for custom modules is to “reduce merge conflicts.” Let’s think about the effect multiple repositories have on a typical Drupal site.

I like to think about merge conflicts in three types. First, there’s the traditional merge conflict, such as when git refuses to merge a branch automatically. Two lines of code have been changed independently, and a developer needs to resolve them. Second, there are logical merge conflicts. These don’t cause a merge conflict that version control can detect but do represent a conflict in code. For example, two developers might add the same method name to a class but in different text locations in the class. Git will happily merge these together, but the result is invalid PHP code. Finally, there are functional merge conflicts. This is where the PHP code is valid, but there is a regression or unexpected behavior in related code.

Split repositories don’t have much of an effect on traditional merge conflicts. I’ve found that split repositories make logical conflicts a little harder to manage. Typically, this happens when a base class or array is modified and the developer misses all of the places to update code. However, split repositories make functional conflicts drastically more difficult to handle. Since developers are working in individual repositories, they may not always realize that they are working at cross-purposes. And, when there are dependencies between projects, it requires careful merging to make sure everything is merged in the right order.

If developers are working in the same repository, and discover a merge conflict, it’s not a blocker. It’s a chance to make a friend! Discussing the conflict gives developers the chance to make sure they are solving the right problem, the right way. If conflicts are really complex, it’s an opportunity to either refactor the code or to raise the issue to the rest of the team. There’s nothing more exciting than realizing that a merge conflict revealed conflicting requirements.

5. Set up config management early

I’ve seen several Drupal 8 teams delay setting up a deployment workflow that integrates with Drupal 8’s configuration management. Instead, deployments involve pushing code and then clicking changes together manually in the UI. Then, developers pull down the production database to keep up to date.

Unfortunately, manual configuration is prone to error. All it takes is one mistake, and valuable QA time is wasted. It also bypasses code review of configuration, which is actually possible and enjoyable with Drupal 8’s YAML configuration exports.

The nice thing about configuration management tooling is that it typically doesn’t depend on your actual site requirements, so it can be set up early. This includes:

  • Making sure each environment pulls in updated configs on deployment
  • Aborting deployments and rolling back if config imports fail
  • Getting the development team comfortable with config basics
  • Setting up the secure use of API keys through environment variables and settings.php

Doing these things early will pay off tenfold during development.
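The last item in the list can also be checked automatically before a deployment starts. This is a minimal sketch under stated assumptions: the function name missing_secrets is hypothetical, and the variable names you pass in should be whatever your own settings.php reads from the environment.

```python
import os


def missing_secrets(required, environ=None):
    """Return the names in `required` that are unset or empty.

    `environ` defaults to the real process environment; passing a dict
    makes the check easy to test.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]
```

A deployment script might call missing_secrets(["EXAMPLE_API_KEY"]) (a placeholder name) and abort with a clear message if the returned list is not empty, instead of letting the site fail later at runtime.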

6. Secure sites early

I recently worked on a site that was only a few weeks away from the production launch. The work was far enough along that the site was available outside of the corporate VPN under a “beta” subdomain. Much to my surprise, the site wasn’t under HTTPS at all. As well, the Drupal admin password was the name of the site!

These weren’t things that the team had forgotten about, but in the rush of the last few sprints, it was clear the two issues weren’t going to be fixed until a few days before launch. HTTPS setup, in particular, is a great example of an early setup task. Even if you aren’t on your production infrastructure, set up SSL certificates anyway. Treat any new environments without SSL as a launch blocker. Consider using Let’s Encrypt if getting proper certificates is a long task.

This phase is also a good chance to make sure admin and editorial accounts are secure. We recommend that the admin account password is set to a long random string—and then, don’t save or record the password. This eliminates password sharing and encourages editors to use their own separate accounts. Site admins and ops can instead use ssh and drush user-login to generate one-time login links as needed.
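Scrambling the admin password can be scripted so that no one ever sees the value. The sketch below is illustrative only: the function names are hypothetical, and it assumes Drush 9 or later, where the user:password command takes the account name and the new password as arguments. Older Drush versions use a different syntax.

```python
import secrets
import subprocess


def random_password(nbytes=48):
    """Return a URL-safe random string built from `nbytes` random bytes."""
    return secrets.token_urlsafe(nbytes)


def scramble_password(user, run=subprocess.run):
    """Set `user`'s password to a random value that is never printed or stored."""
    # Assumption: Drush 9+ syntax, `drush user:password <name> <password>`.
    return run(["drush", "user:password", user, random_password()], check=True)
```

After running scramble_password("admin"), the only way into that account is a one-time link from drush user-login, which is the intent of the recommendation.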

7. Make downsyncs normal

Copying databases and file systems between environments can be a real pain, especially if your organization uses a custom Docker-based infrastructure. rsync doesn’t work well (because most Docker containers don’t run ssh), and there may be additional networking restrictions that block the usual sql-sync commands.

This leads many dev teams to really hold off on pulling down content to lower environments because it’s such a pain to do. This workflow really throws QA and developers for a loop, because they aren’t testing and working against what production actually is. Even if it has to be entirely custom, it’s worth automating these steps for your environments. Ideally, it should be a one-button click to copy the database and files from one environment to a lower environment. Doing this early will improve your sprint velocity and give your team the confidence they need in the final weeks before launch.

8. Validate deployments

When deploying new code to an environment, it’s important to fail builds if something goes wrong. In a typical Drupal site, you could have errors during:

  • composer install
  • drush updatedb
  • drush config-import
  • The first page requests afterwards: the deployment could work, but the site could be broken and returning HTTP 500 error codes
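As a rough sketch, the checks in the list above might be chained so that the first failure stops the run. The function names are hypothetical, and the --no-dev and -y flags are assumptions about how a team might invoke these commands non-interactively.

```python
import subprocess
import urllib.error
import urllib.request

# Commands from the list above; the flags are assumptions.
DEPLOY_STEPS = [
    ["composer", "install", "--no-dev"],
    ["drush", "updatedb", "-y"],
    ["drush", "config-import", "-y"],
]


def run_steps(steps, run=subprocess.run, log=print):
    """Run each step in order; return the first failing step, or None."""
    for step in steps:
        log("running: " + " ".join(step))
        result = run(step, capture_output=True, text=True)
        log(result.stdout + result.stderr)  # keep the output for the stored logs
        if result.returncode != 0:
            return step  # abort: later steps never run
    return None


def site_is_healthy(url, fetch=urllib.request.urlopen):
    """Return True if the URL answers with a status below 400."""
    try:
        with fetch(url, timeout=10) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # urlopen raises HTTPError (a URLError) for 4xx and 5xx responses.
        return False
```

A build would be marked as failed if run_steps(DEPLOY_STEPS) returns a step, or if site_is_healthy() returns False for the deployed site’s URL.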

Each deployment should capture the deployment logs and store them. If any step fails, subsequent steps should be aborted, and the site rolled back to its previous state. Speaking of…

9. Automate backups and reverts

When a deployment fails, it should be nearly automatic to revert the site to the pre-deployment state. Since Drupal updates involve touching the database and the file system, those should both be reverted. Database restores tend to be fairly straightforward, though filesystem restores can be more complex if the files are stored on S3 or some other service. If you’re hosted on AWS or a similar platform, use their APIs and utilities to manage backups and restores where possible. They have internal access to their systems, making backups much more efficient. As a side benefit, this helps make downsyncs more robust, as they can be treated as a restore of a production backup instead of a direct copy.
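The overall shape of an automatic revert can be sketched independently of any hosting platform. In this minimal example, backup, deploy, and restore are hypothetical callables that would wrap whatever your platform provides; nothing here is a real AWS or Drupal API.

```python
def deploy_with_rollback(backup, deploy, restore, log=print):
    """Back up, deploy, and restore the backup if the deployment raises.

    Returns True on success and False after a rollback.
    """
    snapshot = backup()  # should cover both the database and the file system
    try:
        deploy()
    except Exception as error:
        log(f"deployment failed ({error}); restoring {snapshot}")
        restore(snapshot)
        return False
    return True
```

Because the backup is taken on every deployment, the same snapshot could also feed a downsync to a lower environment.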

10. Remember #cache

Ok, I suppose I mean “remember caching everywhere,” though in D8 it seems like render cache dependencies are what’s most commonly forgotten. It’s so easy to fall into Drupal 7 patterns, and just create render arrays as we always have. After all, on locals, everything works fine! But, forgetting to call addCacheableDependency() for render arrays leads to confusing bugs down the line.

Along the same lines, it’s important to set up invalidation caching early in the infrastructure process. Otherwise, odds are you’ll get to the production launch and be forced to rely on TTL caches simply because the site wasn’t built or tested for invalidation caching. It’s a good practice when setting up a reverse proxy to let Drupal maintain the caching rules, instead of creating them in the proxy itself. In other words, respect Cache-Control and friends from upstream systems, and only override them in very specific cases.
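When testing a reverse proxy setup, it can help to check what Drupal is actually telling the proxy. The sketch below is a deliberately rough reading of the Cache-Control header, not a full implementation of the HTTP caching rules, and both function names are hypothetical.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into a dict of lowercase directives."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') or None
    return directives


def is_shared_cacheable(value):
    """Rough check: may a reverse proxy store a response with this header?"""
    d = parse_cache_control(value)
    if "private" in d or "no-store" in d:
        return False
    # s-maxage applies to shared caches and takes precedence over max-age.
    max_age = d.get("s-maxage") or d.get("max-age") or "0"
    return max_age.isdigit() and int(max_age) > 0
```

With this sketch, is_shared_cacheable("public, max-age=900") returns True, while is_shared_cacheable("must-revalidate, no-cache, private") returns False.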

Finally, be sure to test on locals with caches enabled. Sure, disable them while writing code, but afterwards, turn them back on and check again. I find incognito or private browsing windows invaluable here, as they let you test as an anonymous user at the same time as being logged in. For example, did you just add a config form that changes how the page is displayed? Flip a setting, reload the page as anonymous, and make sure the update is instant. If you have to do a drush cache-rebuild for it to work, you know you’ve forgotten #cache somewhere.

What commandments did I miss in this list? Post below and let me know!

Header image from Control room of a power plant.