Simplify your code with Drupal 7's Database API

Or, how I learned to stop worrying and love object oriented database APIs

The release of Drupal 7 has brought a lot of changes for developers: new features, updated APIs, deprecated modules, and more. One of the biggest changes is Drupal 7's new database API, an object-oriented system that fundamentally alters the way most developers will build database queries in Drupal. In this article, we'll take a look at how the new API dramatically simplified one contrib module's complex code, and how to spot similar opportunities in your own code. First, a bit of background. The new database API (known as DB:TNG) is built atop PHP's native PDO engine, allowing the Drupal project to leverage existing drivers for database systems like Oracle, MSSQL, and SQLite with much less work. In addition, it uses an object-oriented approach to building complex SQL queries that delights some developers while baffling and frustrating others. Rather than writing a SQL query in raw text and calling a helper function to execute the query, developers create a "query" object, call methods on it to add fields, filters, sorts, and other query elements, then call the query object's "execute" method.

While it may not sound like a big difference, it's a real shift for those who are used to hand-writing their SQL. Take a look at the following example:

  
/* Drupal 6 query */
$sql = "SELECT n.nid, n.title, n.uid FROM {node} n WHERE n.type = 'article'";
$results = db_query_range($sql, 0, 50);

/* Drupal 7 query */
$query = db_select('node', 'n');
$query->condition('n.type', 'article', '=')
      ->fields('n', array('nid', 'title', 'uid'))
      ->range(0, 50);
$result = $query->execute();
  

Both of these code snippets are doing the same thing: pulling the id, title, and author id from columns in the Node table that match certain criteria, then sorting the results and ensuring only 5 matches are returned. In this example (a simple select query), the old-school hand-written query is the clear winner for brevity. In addition, there is a degree of overhead to the new object-oriented approach. Running raw queries is still possible in Drupal 7 with the db_query() function, and in most cases it's more efficient

However, the advantage doesn't always tip in favor of raw text queries. Some modules need to build a large number of similar queries over and over; instead of maintaining a hand-written copy of each version of the query, they have internal helper functions that build up the query based on parameters passed in. This is especially common in modules that expose their own APIs to other developers: providing helper functions to retrieve data protects other devs from future database changes, but it also means that the API developer has to do the hard work of writing their own miniature query-builder. That's the situation I found myself in when building VotingAPI. Take a look at the following function from the Drupal 6 version of the module:


  
/**
 * Select individual votes from the database.
 */
function votingapi_select_votes($criteria = array(), $limit = 0) {
  $anon_window = variable_get('votingapi_anonymous_window', 3600);
  if (!empty($criteria['vote_source']) && $anon_window > 0) {
    $criteria['timestamp'] = time() - $anon_window;
  }
  $votes = array();
  $result = _votingapi_select('vote', $criteria, $limit);
  while ($vote = db_fetch_array($result)) {
    $votes[] = $vote;
  }
  return $votes;
}

/**
 * Internal helper function constructs WHERE clauses. Don't use unless you're me.
 */
function _votingapi_query($table = 'vote', $criteria = array(), $alias = 'v.') {
  $criteria += array(
    'vote_id' => NULL,
    'vote_cache_id' => NULL,
    'content_id' => NULL,
    'content_type' => NULL,
    'value_type' => NULL,
    'value' => NULL,
    'tag' => NULL,
    'uid' => NULL,
    'timestamp' => NULL,
    'vote_source' => NULL,
    'function' => NULL,
  );

  $query = '';
  $args = array();
  if (!empty($criteria['vote_id'])) {
    _votingapi_query_builder($alias . 'vote_id', $criteria['vote_id'], $query, $args);
  }
  elseif (!empty($criteria['vote_cache_id'])) {
    _votingapi_query_builder($alias . 'vote_cache_id', $criteria['vote_cache_id'], $query, $args);
  }
  else {
    _votingapi_query_builder($alias . 'content_type', $criteria['content_type'], $query, $args, TRUE);
    _votingapi_query_builder($alias . 'content_id', $criteria['content_id'], $query, $args);
    _votingapi_query_builder($alias . 'value_type', $criteria['value_type'], $query, $args, TRUE);
    _votingapi_query_builder($alias . 'tag', $criteria['tag'], $query, $args, TRUE);
    _votingapi_query_builder($alias . 'function', $criteria['function'], $query, $args, TRUE);
    _votingapi_query_builder($alias . 'uid', $criteria['uid'], $query, $args);
    _votingapi_query_builder($alias . 'vote_source', $criteria['vote_source'], $query, $args, TRUE);
    _votingapi_query_builder($alias . 'timestamp', $criteria['timestamp'], $query, $args);
  }
  return array('query' => $query, 'args' => $args);
}

/**
 * Internal helper function constructs individual elements of WHERE clauses.
 * Don't use unless you're me.
 */
function _votingapi_query_builder($name, $value, &$query, &$args, $col_is_string = FALSE) {
  if (!isset($value)) {
    // Do nothing
  }
  elseif ($name === 'timestamp') {
    $query .= " AND timestamp >= %d";
    $args[] = $value;
  }
  elseif ($name === 'v.timestamp') {
    $query .= " AND v.timestamp >= %d";
    $args[] = $value;
  }
  else {
    if (is_array($value)) {
      if ($col_is_string) {
        $query .= " AND $name IN (" . db_placeholders($value, 'varchar') . ")";
        $args = array_merge($args, $value);
      }
      else {
        $query .= " AND $name IN (" . db_placeholders($value, 'int') . ")";
        $args = array_merge($args, $value);
      }
    }
    else {
      if ($col_is_string) {
        $query .= " AND $name = '%s'";
        $args[] = $value;
      }
      else {
        $query .= " AND $name = %d";
        $args[] = $value;
      }
    }
  }
}
  

Whew. That code does the yeoman's work of taking in a bundle of different criteria, building a safe SQL query from them, and returning the results. It's what VotingAPI uses to retrieve all of a user's votes, all of the votes on a particular piece of content, the average and total number of votes on a given node, and more. While it's a bit ugly, it handles all of the different permutations of requests pretty smoothly. Still, it's an awful lot of code. While what it's doing is conceptually simple, the fact that it has to build all of those different queries out of text makes it an exercise in doing efficient string manipulation without exposing developers to SQL injection attacks. When I ported VotingAPI to Drupal 7, I was staggered to discover that this kind of task is ridiculously simple with DB:TNG. Take a look at the same function in the Drupal 7 version of VotingAPI:


  
function votingapi_select_votes($criteria = array(), $limit = 0) {
  $anon_window = variable_get('votingapi_anonymous_window', 3600);
  if (!empty($criteria['vote_source']) && $anon_window >= 0) {
    $criteria['timestamp'] = REQUEST_TIME - $anon_window;
  }
  $query = db_select('votingapi_vote')->fields('votingapi_vote');
  foreach ($criteria as $key => $value) {
    $query->condition($key, $value, is_array($value) ? 'IN' : '=');
  }
  if (!empty($limit)) {
    $query->range(0, $limit);
  }
  return $query->execute()->fetchAll(PDO::FETCH_ASSOC);
}
  

That code is doing all of the same work that the Drupal 6 version is doing. It's taking in a set of filter criteria, creating a new query, adding filters as needed, executing the query, and returning the results. However, DB:TNG's object oriented approach and the convenience of its 'condition' method eliminate huge swaths of special-case code. It handles the difference between checking for a column being equal to one value vs the column being inside a set of values; it properly escapes strings; and it handles building query elements for different data types automatically. The result is a simple, tidy function that's much easier to debug if something goes wrong. Obviously, not all modules require this sort of internal query-building function. If you're building simple hand-coded queries, it's probably more efficient and more familiar to write the SQL manually and continue using db_query(). However, if you find yourself building quite a large number of similar queries into your modules, or you build queries with many different sets of filter criteria, do yourself a favor and take a spin through the Database API reference, the online documentation, and the Example Module's database sample code. While its object-oriented syntax takes some getting used to, it can do quite a bit of heavy lifting for you.

Get in touch with us

Tell us about your project or drop us a line. We'd love to hear from you!