Replacing the Body Field in Drupal 8

The body field has been around since the beginning of Drupal time. Before you could create custom fields in core, and before custom entities were in core, there was a body field. As it exists now, the body field is a bit of a platypus. It's not exactly a text field like any other text field. It's two text fields in one (summary and body), with a lot of specialized behavior to allow you to show or hide it on the node form, and options to either create distinct summary text or deduce a summary by clipping off a certain number of characters from the beginning of the body.

The oddity of this field can create problems. The summary has no format of its own; it shares a format with the body, so you can't have a simple format for the summary and a more complex one for the body. The link to expose and hide the summary on the edit form is a little non-intuitive, especially since no other field behaves this way, so it's easy to miss the fact that there is a summary field there at all. If you are relying on the truncated text for the summary, there's no easy way to see in the node form what the summary will end up looking like. You have to preview the node to tell.

I wanted to move away from using the legacy body field in favor of separate body and summary fields that behave in a more normal way, where each is a distinct field, with its own format and no unexpected behavior. I like the benefits of having two fields, with the additional granularity that provides. This article describes how I made this switch on one of my own sites.

Making the Switch

The first step was to add the new fields to the content types where they will be used. I just did this in the UI by going to admin > structure > types. I created two fields, one called field_description for the full body text and one called field_summary for the summary. My plan was for the summary field to be a truncated, plain text excerpt of the body that I could use in metatags and in AMP metadata, as well as on teasers. I updated the Manage Display and Manage Form Display data on each content type to display my new fields instead of the old body field on the node form and in all my view modes.

Once the new fields were created I wanted to get my old body/summary data copied over to my new fields. To do this I needed an update hook. I used Drupal.org as a guide for creating an update hook in Drupal 8.

The instructions for update hooks recommend against calling normal entity API methods, like $node->save(), inside update hooks, and suggest updating the database directly with a SQL query instead. But that would require understanding all the tables that need to be updated. This is much more complicated in Drupal 8 than it was in Drupal 7. In Drupal 7 each field has exactly two tables, one for the active values of the field and one with revision values. In Drupal 8 there are numerous tables that might be used, depending on whether you are using revisions and/or translations. There could be up to four tables that need to be updated for each individual field that is altered. On top of that, if I had two fields in Drupal 7 that had the same name, they were always stored in the same tables, but in Drupal 8 if I have two fields with the same name they might be in different tables, with each field stored in up to four tables for each type of entity the field exists on.
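
As a rough illustration of the "same field name, different tables" point, here is a minimal sketch of how the dedicated tables for a configurable field are named under Drupal 8's default SQL storage, as far as I understand it: the entity type is part of the table name. The function name sketch_dedicated_field_tables() is hypothetical, and the sketch ignores translations, base field tables, and the shortening Drupal applies to very long names.

```php
<?php
/**
 * Illustrative only: guess the dedicated table names for a configurable field.
 *
 * Assumes the default SQL storage naming of "{entity_type}__{field_name}"
 * plus a matching revision table. Not a Drupal API.
 */
function sketch_dedicated_field_tables(string $entity_type, string $field_name, bool $revisionable = TRUE): array {
  // Table holding the current values of the field.
  $tables = ["{$entity_type}__{$field_name}"];
  // Revisionable entity types get a second table for revision values.
  if ($revisionable) {
    $tables[] = "{$entity_type}_revision__{$field_name}";
  }
  return $tables;
}
```

Under these assumptions, field_summary on nodes and field_summary on taxonomy terms end up in different tables (node__field_summary versus taxonomy_term__field_summary), which is part of why a hand-written SQL update is easy to get wrong.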

To avoid any chance of missing or misunderstanding which tables to update, I went ahead and used the $node->save() method in the update hook to ensure every table gets the right changes. That method is time-consuming and could easily time out for mass updates, so it was critical to run the updates in small batches. I then tested it to be sure the batches were small enough not to create a problem when the update ran.

The update hook ended up looking like this:


<?php
use Drupal\Core\Database\Database;

/**
 * Update new summary and description fields from body values.
 */
function custom_update_8001(&$sandbox) {

  // The content types to update.
  $bundles = ['article', 'news', 'book'];
  // The new field for the summary. Must already exist on these content types.
  $summary_field = 'field_summary';
  // The new field for the body. Must already exist on these content types.
  $body_field = 'field_description';
  // The number of nodes to update at once.
  $range = 5;

  if (!isset($sandbox['progress'])) {
    // This must be the first run. Initialize the sandbox.
    $sandbox['progress'] = 0;
    $sandbox['current_pk'] = 0;
    $sandbox['max'] = Database::getConnection()->query("SELECT COUNT(nid) FROM {node} WHERE type IN (:bundles[])", array(':bundles[]' => $bundles))->fetchField();
  }

  // Update in chunks of $range.
  // entityManager() is deprecated; entityTypeManager() is its replacement.
  $storage = Drupal::entityTypeManager()->getStorage('node');
  $records = Database::getConnection()->select('node', 'n')
    ->fields('n', array('nid'))
    ->condition('type', $bundles, 'IN')
    ->condition('nid', $sandbox['current_pk'], '>')
    ->range(0, $range)
    ->orderBy('nid', 'ASC')
    ->execute();
  foreach ($records as $record) {
    $node = $storage->load($record->nid);

    // Get the body values if there is a body field.
    if (isset($node->body)) {
      $body = $node->get('body')->value;
      $summary = $node->get('body')->summary;
      $format = $node->get('body')->format;
      // Track whether anything was copied, so the body is only cleared after a copy.
      $updated = FALSE;

      // Copy the values to the new fields, being careful not to wipe out other values that might be there.
      if (empty($node->{$summary_field}->getValue()) && !empty($summary)) {
        $node->{$summary_field}->setValue(['value' => $summary, 'format' => $format]);
        $updated = TRUE;
      }
      if (empty($node->{$body_field}->getValue()) && !empty($body)) {
        $node->{$body_field}->setValue(['value' => $body, 'format' => $format]);
        $updated = TRUE;
      }

      if ($updated) {
        // Clear the body values.
        $node->body->setValue([]);
      }
    }

    // Force a node save even if there are no changes to force the pre_save hook to be executed.
    $node->save();

    $sandbox['progress']++;
    $sandbox['current_pk'] = $record->nid;
  }

  $sandbox['#finished'] = empty($sandbox['max']) ? 1 : ($sandbox['progress'] / $sandbox['max']);

  return t('All content of the types: @bundles were updated with the new description and summary fields.', array('@bundles' => implode(', ', $bundles)));
}
?>
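
To make the batching easier to follow, here is a minimal sketch, in plain PHP with no Drupal involved, of how a sandbox-driven update gets called repeatedly until it reports that it is finished. The functions sketch_batch_step() and sketch_run_batches() are hypothetical stand-ins for the update hook and for the update system that calls it, and a plain array of IDs stands in for the node table.

```php
<?php
/**
 * Hypothetical stand-in for one pass of the update hook.
 *
 * Processes up to $range IDs greater than the last one handled and
 * records progress in $sandbox, which persists between calls.
 */
function sketch_batch_step(array $all_ids, int $range, array &$sandbox): array {
  if (!isset($sandbox['progress'])) {
    // First pass: initialize the sandbox.
    $sandbox['progress'] = 0;
    $sandbox['current_pk'] = 0;
    $sandbox['max'] = count($all_ids);
  }

  // Mimic the query: IDs above the last processed one, ascending, limited to $range.
  $pending = array_filter($all_ids, function ($id) use ($sandbox) {
    return $id > $sandbox['current_pk'];
  });
  sort($pending);
  $chunk = array_slice($pending, 0, $range);

  foreach ($chunk as $id) {
    // The real hook would load, change, and save the node here.
    $sandbox['progress']++;
    $sandbox['current_pk'] = $id;
  }

  // A value below 1 tells the caller to run another pass.
  $sandbox['#finished'] = empty($sandbox['max']) ? 1 : $sandbox['progress'] / $sandbox['max'];
  return $chunk;
}

/**
 * Hypothetical stand-in for the update system: loop until finished.
 */
function sketch_run_batches(array $all_ids, int $range): array {
  $sandbox = [];
  $passes = [];
  do {
    $passes[] = sketch_batch_step($all_ids, $range, $sandbox);
  } while ($sandbox['#finished'] < 1);
  return $passes;
}
```

With seven IDs and a range of three, this sketch takes three passes. Tracking the last processed ID, rather than an offset, keeps each pass correct even though saving nodes changes the data between passes.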

Creating the Summary

That update would copy the existing body data to the new fields, but many of the new summary fields would be empty. As distinct fields, they won't automatically pick up content from the body field, and will just not display at all. The update needs something more to get the summary fields populated. What I wanted was to end up with something that would work similarly to the old body field. If the summary is empty I want to populate it with a value derived from the body field. But when doing that I also want to truncate it to a reasonable length for a summary, and in my case I also wanted to be sure that I ended up with plain text, not markup, in that field.

I created a helper function in a custom module that would take text, like that which might be in the body field, and alter it appropriately to create the summaries I want. I have a lot of nodes with html data tables, and I needed to remove those tables before truncating the content to create a summary. My body fields also have a number of filters that need to do their replacements before I try creating a summary. I ended up with the following processing, which I put in a custom.module file:


<?php
use Drupal\Component\Render\PlainTextOutput;

/**
 * Clean up and trim text or markup to create a plain text summary of $limit size.
 *
 * @param string $value
 *   The text to use to create the summary.
 * @param int $limit
 *   The maximum characters for the summary, zero means unlimited.
 * @param string $input_format
 *   The format to use on filtered text to restore filter values before creating a summary.
 * @param string $output_format
 *   The format to use for the resulting summary.
 * @param bool $add_ellipsis
 *   Whether or not to add an ellipsis to the summary.
 *
 * @return string
 *   The cleaned up, trimmed summary.
 */
function custom_parse_summary($value, $limit = 150, $input_format = 'plain_text', $output_format = 'plain_text', $add_ellipsis = TRUE) {

  // Remove previous ellipsis, if any.
  if (substr($value, -3) == '...') {
    $value = substr_replace($value, '', -3);
  }

  // Allow filters to replace values so we have all the original markup.
  $value = check_markup($value, $input_format);

  // Completely strip tables out of summaries, they won't truncate well.
  // Stripping markup, done next, would leave the table contents, which may create odd results, so remove the tables entirely.
  $value = preg_replace('/<table(.*?)<\/table>/si', '', $value);

  // Strip out all markup.
  $value = PlainTextOutput::renderFromHtml(htmlspecialchars_decode($value));

  // Collapse carriage returns and extra spaces into single spaces to pack as much info as possible into the allotted space.
  // Replacing newlines with spaces, rather than removing them, keeps words on adjacent lines from running together.
  $value = preg_replace('/\s+/', ' ', $value);
  $value = trim($value);

  // Trim the text to the $limit length.
  if (!empty($limit)) {
    $value = text_summary($value, $output_format, $limit);
  }

  // Add ellipsis.
  if ($add_ellipsis && !empty($value)) {
    $value .= '...';
  }

  return $value;
}
?>
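
To see what the cleanup steps do to a piece of markup without a Drupal site at hand, here is a stand-alone approximation in plain PHP. It is only a sketch: sketch_plain_summary() is a hypothetical name, strip_tags() stands in for PlainTextOutput::renderFromHtml(), a simple cut at the last space stands in for text_summary(), and no text filters are run, so results on a real site will differ.

```php
<?php
/**
 * Stand-alone approximation of the summary cleanup, for illustration only.
 *
 * Assumes the mbstring extension is available.
 */
function sketch_plain_summary(string $html, int $limit = 150): string {
  // Drop whole tables, including their contents.
  $text = preg_replace('/<table(.*?)<\/table>/si', '', $html);

  // Stand-in for PlainTextOutput::renderFromHtml(): remove tags, decode entities.
  $text = html_entity_decode(strip_tags($text), ENT_QUOTES, 'UTF-8');

  // Collapse newlines and repeated whitespace into single spaces.
  $text = trim(preg_replace('/\s+/', ' ', $text));

  // Stand-in for text_summary(): cut at $limit, then back up to the last space.
  if ($limit > 0 && mb_strlen($text) > $limit) {
    $text = mb_substr($text, 0, $limit);
    $last_space = mb_strrpos($text, ' ');
    if ($last_space !== FALSE) {
      $text = mb_substr($text, 0, $last_space);
    }
  }

  return $text;
}
```

For example, passing "<p>Intro   text.</p>\n<table><tr><td>1</td></tr></table><p>More.</p>" with a limit of zero gives "Intro text. More.", with the table gone and the whitespace collapsed.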

Adding a Presave Hook

I could have used this helper function in my update hook to populate my summary fields, but I realized that I actually want automatic population of the summaries to be the default behavior. I don't want to have to copy, paste, and truncate content from the body to populate the summary field every time I edit a node. I'd like to just leave the summary field blank when I want a truncated version of the body in that field, and have it populated automatically when I save the node.

To do that I used the pre_save hook. The pre_save hook will update the summary field whenever I save the node, and it will also update the summary field when the above update hook does $node->save(), making sure that my legacy summaries also get this treatment.

My pre_save hook, in the same custom.module file used above, ended up looking like the following:


<?php

use Drupal\Core\Entity\EntityInterface;

/**
 * Implements hook_entity_presave().
 *
 * Make sure the summary is populated.
 */
function custom_entity_presave(EntityInterface $entity) {
  
  $entity_type = 'node';
  $bundles = ['article', 'news', 'book'];
  // The new field for the summary. Must already exist on these content types.
  $summary_field = 'field_summary';
  // The new field for the body. Must already exist on these content types.
  $body_field = 'field_description';
  // The maximum length of any summary, set to zero for no limit.
  $summary_length = 300;

  // Everything is an entity in Drupal 8, and this hook is executed on all of them!
  // Make sure this only operates on nodes of a particular type.
  if ($entity->getEntityTypeId() != $entity_type || !in_array($entity->bundle(), $bundles)) {
    return;
  }

  // If we have a summary, run it through custom_parse_summary() to clean it up.
  $format = $entity->get($summary_field)->format;
  $summary = $entity->get($summary_field)->value;
  if (!empty($summary)) {
    $summary = custom_parse_summary($summary, $summary_length, $format, 'plain_text');
    $entity->{$summary_field}->setValue(['value' => $summary, 'format' => 'plain_text']);
  }

  // The summary might be empty or could have been emptied by the cleanup in the previous step. If so, we need to pull it from description.
  $format = $entity->get($body_field)->format;
  $description = $entity->get($body_field)->value;
  if (empty($summary) && !empty($description)) {
    $summary = custom_parse_summary($description, $summary_length, $format, 'plain_text');
    $entity->{$summary_field}->setValue(['value' => $summary, 'format' => 'plain_text']);
  }
}  
?>

With this final bit of code I’m ready to actually run my update. Now whenever a node is saved, including when I run the update to move all my legacy body data to the new fields, empty summary fields will automatically be populated with a trimmed, plain text excerpt from the full text.

Going forward, when I edit a node, I can either type in a custom summary, or leave the summary field empty if I want to automatically extract its value from the body. The next time I edit the node the summary will already be populated from the previous save. I can leave that value, or alter it manually, and it won't be overridden by the pre_save process on the next save. Or I can wipe the field out if I want it populated automatically again when the node is re-saved.
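
The decision described here, keep a custom summary if there is one and otherwise fall back to the body, can be sketched in isolation as follows. This is plain PHP rather than the presave hook itself: sketch_resolve_summary() and the $clean callable are hypothetical, with $clean standing in for whatever cleanup function is used.

```php
<?php
/**
 * Hypothetical helper: decide which text becomes the stored summary.
 *
 * $clean is any callable that turns markup into trimmed plain text.
 */
function sketch_resolve_summary(string $summary, string $body, callable $clean): string {
  // A custom summary wins, but it still gets cleaned up.
  $summary = $summary === '' ? '' : $clean($summary);

  // If there was no summary, or cleanup emptied it, derive one from the body.
  if ($summary === '' && $body !== '') {
    $summary = $clean($body);
  }

  return $summary;
}

// A deliberately simple cleaner, used only to exercise the sketch.
$clean = function (string $text): string {
  return trim(preg_replace('/\s+/', ' ', strip_tags($text)));
};
```

A summary that contains only empty markup, such as "<p> </p>", is treated the same as a blank one, which matches the case the presave hook's comment calls out.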

JavaScript or Presave?

Instead of a pre_save hook I could have used JavaScript to automatically update the summary field in the node form as the node is being edited. I would only want that behavior if I'm not adding a custom summary, so the JavaScript would have to be smart enough to leave the summary field alone if I already have text in it or if I start typing in it, while still picking up every change I make in the description field if I don’t. And it would be difficult to use JavaScript to do filter replacements on the description text or have it strip HTML as I'm updating the body. Thinking through all the implications of trying to make a JavaScript solution work, I preferred the idea of doing this as a pre_save hook.

If I were using JavaScript to update my summaries, the JavaScript changes wouldn't be triggered by my update hook, and the update hook code above would have to be altered to do the summary cleanup as well.

Ta-dah

And that's it. I ran the update hook and then the final step was to remove my now-empty body field from the content types that I switched, which I did using the UI on the Content Types management page.

My site now has all its nodes updated to use my new fields, and summaries are getting updated automatically when I save nodes. And as a bonus this was a good exercise in seeing how to manipulate nodes and how to write update and pre_save hooks in Drupal 8.

Karen Stevenson

Karen is one of Drupal's great pioneers, co-creating the Content Construction Kit (CCK) which has become Field UI, part of Drupal core.