
Hiding content from Drupal's search system

Drupal offers a variety of ways to integrate with the built-in search system, from connecting with third-party search systems to adding information to standard node content. In addition, Drupal's search system respects the access permissions on each piece of content -- users who can't access a particular node will never see it in the results of their searches.

What happens, though, if you want users to be able to access some content (user bio nodes, for example) if they navigate to it directly, but don't want that content to appear in search results? Drupal doesn't offer any way to do that by default, but your custom module can use the same behind-the-scenes hooks used by the security system to control exactly what search results are presented to users.

The magic happens in hook_db_rewrite_sql(). Whenever Drupal code calls the db_rewrite_sql() function to pull information from the database, other modules can use hook_db_rewrite_sql() to intercept the SQL call and add additional filters to the query. That's how modules like Organic Groups restrict access to content based on group membership: when queries pull information from the node table, Organic Groups compares the groups each node is associated with to the groups the current user belongs to.

Intercepting the queries used by the search system takes a bit of extra work, though. We'll take a look at some example code and see how the funky bits work.

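function your_module_db_rewrite_sql($query, $primary_table, $primary_field, $args) {
  // Node search passes an empty query, so this check matches only search queries.
  if ($query == '' && $primary_table == 'n' && $primary_field == 'nid' && empty($args)) {
    $excluded_types = variable_get('your_module_types', array());
    if (!empty($excluded_types)) {
      // Escape each type name before placing it in the SQL fragment.
      $types = array_map('db_escape_string', $excluded_types);
      $where = " n.type NOT IN ('". implode("','", $types) ."') ";
      return array('where' => $where);
    }
  }
}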

Modules that implement hook_db_rewrite_sql() receive a couple of important pieces of information about each query. The most important is the 'primary table' parameter -- you don't want to add a SQL WHERE filter intended for nodes when the primary table is 'user', for example. The query itself is normally passed in too, but due to some curious code inside the node module, the query that arrives for a node search is empty. While that's a pretty big violation of Drupal's own coding standards, it makes it easy to intercept just the node search queries.

So, we first check to see whether the incoming query is empty, the table is 'n', the primary field is 'nid', and no extra arguments were passed. If those conditions are met, the rest is easy: we grab a list of node types that we want to hide from the search results and build a WHERE condition that hides them. That's it!
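The WHERE fragment can also be built in a small helper that is easy to test outside Drupal. This is a minimal sketch, not part of Drupal: the name your_module_build_type_filter() is hypothetical, and it assumes node type machine names contain only lowercase letters, digits, and underscores, so anything else is dropped rather than placed into the SQL.

```php
<?php
/**
 * Builds the WHERE fragment that excludes the given node types.
 *
 * Hypothetical helper for illustration; not a Drupal API function.
 */
function your_module_build_type_filter(array $excluded_types) {
  $safe_types = array();
  foreach ($excluded_types as $type) {
    // Assumption: machine names use only lowercase letters, digits, and
    // underscores. Anything else is skipped instead of reaching the SQL.
    if (is_string($type) && preg_match('/^[a-z0-9_]+$/', $type)) {
      $safe_types[] = $type;
    }
  }
  // With nothing left to exclude, return an empty fragment.
  if (empty($safe_types)) {
    return '';
  }
  return " n.type NOT IN ('" . implode("','", $safe_types) . "') ";
}
```

The hook implementation could then return array('where' => $fragment) only when the helper hands back a non-empty string.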

To make things a bit cleaner, we can add some configuration options.

function your_module_search($op = 'search') {
  if ('admin' == $op) {
    $form = array();
    $form['your_module_types'] = array(
      '#type'          => 'select',
      '#multiple'      => TRUE,
      '#title'         => t('Exclude Node Types'),
      '#default_value' => variable_get('your_module_types', array()),
      '#options'       => node_get_types('names'),
      '#size'          => 9,
      '#description'   => t('Node types to exclude from search results.'),
    );
    return $form;
  }
}

function your_module_form_alter($form_id, &$form) {
  if ('search_form' == $form_id) {
    $excluded_types = variable_get('your_module_types', array());
    $types = array_map('check_plain', node_get_types('names'));
    foreach ($excluded_types as $excluded_type) {
      unset($types[$excluded_type]);
    }
    $form['advanced']['type']['#options'] = $types;
  }
}

What do the two code snippets above do? The first one -- hook_search() -- adds an extra form field to the search administrative settings page. It allows site admins to choose which kinds of content should be hidden. The original snippet we used to do the SQL rewrite will use that list of types to build its WHERE clause.

The second snippet alters the advanced form that users see when they search for content on your site. Normally, it shows a list of all the site's content types and lets users choose what they want to see. This implementation of hook_form_alter(), though, removes options from that list if the admin has listed the content types as hidden. That ensures that users will never see options that are impossible to use.
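Stripped of the form plumbing, the option filtering is just a matter of removing keys from an array. The sketch below shows that step on its own, assuming the options are keyed by machine name and the excluded list holds machine names as its values; the helper name your_module_visible_types() is hypothetical and not part of Drupal.

```php
<?php
/**
 * Returns the type options that should remain in the advanced search form.
 *
 * Hypothetical helper for illustration; not a Drupal API function.
 */
function your_module_visible_types(array $all_types, array $excluded_types) {
  // array_flip() turns the excluded machine names into keys so that
  // array_diff_key() can drop the matching options in one pass.
  return array_diff_key($all_types, array_flip($excluded_types));
}
```

For example, with options for 'page', 'story', and 'profile' and an excluded list containing only 'profile', the result keeps 'page' and 'story' with their labels intact.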

That's it! The same technique can be used to hide content from specific users, content posted on Wednesdays, or content matching whatever other criteria you need.
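As a rough illustration of the Wednesday case, the sketch below returns a fixed WHERE fragment whenever it sees the empty node search query. It assumes a MySQL database (DAYOFWEEK() and FROM_UNIXTIME() are MySQL functions, with Wednesday numbered 4) and that n.created holds a Unix timestamp; the module name example_wednesday is hypothetical.

```php
<?php
/**
 * Sketch of hook_db_rewrite_sql() for a hypothetical example_wednesday module.
 */
function example_wednesday_db_rewrite_sql($query, $primary_table, $primary_field, $args) {
  // Same test as before: only the node search arrives with an empty query.
  if ($query == '' && $primary_table == 'n' && $primary_field == 'nid' && empty($args)) {
    // MySQL-only assumption: DAYOFWEEK() counts Sunday as 1, so Wednesday is 4.
    return array('where' => ' DAYOFWEEK(FROM_UNIXTIME(n.created)) <> 4 ');
  }
}
```

FROM_UNIXTIME() converts using the database session's time zone, so "Wednesday" here means Wednesday as the database server sees it.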

Comments

Views_fastsearch

An alternative to this which is probably simpler in a lot of cases (although not quite as cunning!) is to use views_fastsearch module. This is faster than the regular search, and allows you to add all the regular views cleverness, like node type filters and so on. You can use exposed filters in place of advanced search too (and arguments for 'section' searches).

That said, I can't wait for the day when we have views2 in core, and a hook_query_alter that works with a nice, self-documenting query array (or object!)... then the 'query altering' approach will become much easier and insanely powerful!

Search config

The search config module allows you to do this by removing content types from the search index. Using hook_db_rewrite_sql() seems like a nice option too.

great idea.

great idea.

Your idea originally!

The genesis of the article was the feature in node_search_exclude you wrote (I'm sure you recognized it from the discussion we had the other week). While the discussion we had made it apparent that it had to be solved via nodeapi on the client's site, the SQL rewrite solution seemed like it was worth exploring...

Brilliant

We've had this issue in our site's queue for a few months, and we just hadn't gotten around to looking into it. Thanks to Jeff for the insight into node search queries, and thanks to canen for pointing out the search_config module. Problem solved in 5 minutes!

I wrote a module to do this

It does it in a different way, but seems to be working for me and a few others. I need to rename the module, but other than that it's pretty stable.

--Andrew

hiding search from your content

great site which is much appreciated.

I missed the search at the top right as there is some kind of javascript? drop down to display the search box. Not sure why you use this as it doesn't save any screen real estate and it just makes it confusing for people coming to your site. Just an idea.

Regards

Andy
Edinburgh

the empty $query - bug

I looked into this ages ago and filed a bug against that empty query. see http://drupal.org/node/54622

Module

The name of it is a bit confusing but it does the same as Jeff's code above:

http://drupal.org/project/search_block

search_block

I've been using search_block on healthnews.com and it's worked very well. It's simple to manage and doesn't seem to break (touch wood).

Hey, thanks for this--I was

Hey, thanks for this--I was so thrilled when a lullabot post came up in my search for "drupal search access control content" exactly what I needed to know :-)

call a hack a hack!

this fix may work, but it is weird voodoo that makes no sense. blank queries? nonexistent tables? no thanks, i'd rather hack core and find a solution that's legible.

what is my guarantee here that i'm only intercepting node search queries? how do i know that someday some other module won't want to use an unrelated query that gets rewritten here, causing a mysterious failure?

if we don't want a given node type to be searched, shouldn't we start with not indexing it?

I am wondering if there are

I am wondering if there are ways to hide certain content from the Google bot as well with some similar hack.

Another idea

For us we didn't want this to be user controlled and wanted some content to never show up. A simple way is to delete rows from the search_index table in your cron handler (after the normal search cron handling).

delete from search_index where sid in (select nid from node where type in ('profile', 'folder', 'group', 'webform'))

The ideal situation is to never have the hidden content in the index in the first place. I hope D7 has this built in as it's a nice feature and a potential performance boost for the search.
