How Does RDF Work in Drupal 7?

Incorporating structured data into your site

I decided to take advantage of the new built-in RDF support in Drupal 7. I spun up a new site, enabled the RDF module, and started digging around the administration panel to see where I could control what it was doing. And I found ... nothing. It turns out that adding RDF support to your site is more than a matter of simply enabling the RDF module. To get the best use out of it, you need a little additional information, and most likely another module or two.

The Drupal community has long discussed the need to make Drupal sites a vital part of the Semantic Web by adding structured data to the HTML. There are various ways to do that, using RDFa, microformats, or microdata, and there has been a lot of discussion about which would be better. Drupal 7 was finally released with RDFa support built in, and that functionality is extended by several contributed modules.

If you want to learn more about what RDF and structured data are all about, please dig into the resources listed below. For this article I'm assuming you already have at least a basic familiarity with structured data but want to figure out how to integrate these concepts into Drupal 7.

[Edit: There seems to be some confusion about what this article is intended to do. This article is not a "review" of the RDF modules, most of which are still in development. It is an outline of the solutions that exist at the moment and an attempt to help people understand 1) That nothing much will happen by just turning on the core RDF module, and 2) Where to start looking to learn more and add more functionality to your site.]

Core RDF Module

Let's start with a fresh Drupal 7 installation, enable the RDF module, and create a couple of articles. If we examine the source code for those articles we see the following:

  
<div class="content">
<div id="node-104" class="node node-article node-promoted node-full clearfix" about="/drupal/node/104" typeof="sioc:Item foaf:Document"> 
<div class="meta submitted">
<span property="dc:date dc:created" content="2011-08-16T08:16:17-05:00" datatype="xsd:dateTime" rel="sioc:has_creator">Submitted by <a href="/drupal/user/1" title="View user profile." class="username" xml:lang="" about="/drupal/user/1" typeof="sioc:UserAccount" property="foaf:name">admin</a> on Tue, 08/16/2011 - 08:16</span>    
</div>
<div class="content clearfix">
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>Suspendisse eget tristique dolor. Sed lobortis dictum purus non vulputate! Suspendisse nisi felis, pellentesque non vulputate adipiscing, euismod a justo. Nam lorem eros, consequat ut dapibus quis, faucibus vel enim. Aliquam erat volutpat. Nulla facilisi. Sed mattis risus quis libero auctor sit amet volutpat ligula rutrum! Cras sem nisl, tempor eu lacinia eget, pharetra ut massa.</p>
</div>
</div>
</div>
<div class="field field-name-field-tags field-type-taxonomy-term-reference field-label-above clearfix"><h3 class="field-label">Tags: </h3><ul class="links"><li class="taxonomy-term-reference-0" rel="dc:subject"><a href="/drupal/taxonomy/term/4" typeof="skos:Concept" property="rdfs:label skos:prefLabel">RDF</a></li><li class="taxonomy-term-reference-1" rel="dc:subject"><a href="/drupal/taxonomy/term/5" typeof="skos:Concept" property="rdfs:label skos:prefLabel">Drupal</a></li></ul>
</div>  
</div>
  

We can see that semantic information has been added to the source markup. For instance, the node as a whole has been marked as:

  
typeof="sioc:Item foaf:Document"
  

The 'Submitted by' link has acquired the following meta information:

  
property="dc:date dc:created" content="2011-08-16T08:16:17-05:00" datatype="xsd:dateTime" rel="sioc:has_creator"
  

And the taxonomy field has:

  
typeof="skos:Concept" property="rdfs:label skos:prefLabel"
  

If we look in the administration area and try to figure out where this is controlled, we won't see anything. It is controlled entirely in code; there is no UI for a site administrator to alter it. We will also see that only some basic core fields have meta information, while most new custom fields that we add to the node form have none.
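One quick way to check which elements of a rendered page carry RDFa attributes, and which custom fields carry none, is to scan the page source. The following is a minimal Python sketch using only the standard library. It is not part of Drupal; the names `RdfaAttributeLister` and `list_rdfa` and the shortened sample markup are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Attribute names used by the RDFa markup shown above.
# Note: 'rel' and 'content' also appear in ordinary HTML (for example
# rel="nofollow"), so matches on those still need a human look.
RDFA_ATTRS = ("about", "typeof", "property", "rel", "content", "datatype")


class RdfaAttributeLister(HTMLParser):
    """Collect (tag, {attribute: value}) for elements with RDFa attributes."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        rdfa = {name: value for name, value in attrs if name in RDFA_ATTRS}
        if rdfa:
            self.found.append((tag, rdfa))


def list_rdfa(html):
    """Return every element in the markup that has at least one RDFa attribute."""
    parser = RdfaAttributeLister()
    parser.feed(html)
    parser.close()
    return parser.found


# Shortened, illustrative markup: a node, a mapped body field,
# and a custom field that has no RDF mapping.
sample = (
    '<div id="node-104" about="/drupal/node/104" typeof="sioc:Item foaf:Document">'
    '<div class="field-item even" property="content:encoded"><p>Body text</p></div>'
    '<div class="field-item even">Custom field without a mapping</div>'
    '</div>'
)

for tag, attrs in list_rdfa(sample):
    print(tag, attrs)
```

Elements that never show up in the output, such as the custom field in the sample, are the ones that would need a mapping from custom code or a contributed module.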

If we want to build on this or give control to site administrators, we will need some custom code or contributed modules. At this point we have to decide what we are trying to accomplish and which standards we want to support.

Schema.org

If we are mostly concerned about making sure that Google and Yahoo understand our markup, the newest solution would be to comply with the standards at Schema.org.

Fortunately, there is a Drupal module for that: the Schema.org module. If we enable it along with the core RDF module and examine the source code of the page, we won't see any changes. But if we go to the administration area for the content type, there are some new options for the site administrator.

[Screenshot: the Article content type form]

To do something intelligent with this, we can go to http://schema.org/docs/full.html where we will see a list of the 'Types' that are available. So we could mark this as a 'type' of 'Blog' or 'Book' or 'Event'. The title property depends on the type, but generally would be 'name'.

We can then set properties on individual fields by clicking on the 'Edit' link for the field in the 'Manage Fields' screen for this content type. At the bottom of the field edit screen we will see a new text field where we can enter the name of the property that this field should represent, within the context of the type we set in the previous step.

[Screenshot: the Body field edit form]

After these changes, if we view the source of our article, we now see new 'schema' attributes in the markup, both at the top level, for the document, and on the body field where we set the property to 'description':

  
<div class="content">
<div id="node-104" class="node node-article node-promoted node-full clearfix" about="/drupal/node/104" typeof="schema:Event sioc:Item foaf:Document">
<div class="meta submitted">
<span property="dc:date dc:created" content="2011-08-16T08:16:17-05:00" datatype="xsd:dateTime" rel="sioc:has_creator">Submitted by <a href="/drupal/user/1" title="View user profile." class="username" xml:lang="" about="/drupal/user/1" typeof="sioc:UserAccount" property="foaf:name">admin</a> on Tue, 08/16/2011 - 08:16</span>    </div>
<div class="content clearfix">
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="schema:description content:encoded"><p>Suspendisse eget tristique dolor. Sed lobortis dictum purus non vulputate! Suspendisse nisi felis, pellentesque non vulputate adipiscing, euismod a justo. Nam lorem eros, consequat ut dapibus quis, faucibus vel enim. Aliquam erat volutpat. Nulla facilisi. Sed mattis risus quis libero auctor sit amet volutpat ligula rutrum! Cras sem nisl, tempor eu lacinia eget, pharetra ut massa.</p>
</div></div></div><div class="field field-name-field-tags field-type-taxonomy-term-reference field-label-above clearfix"><h3 class="field-label">Tags: </h3><ul class="links"><li class="taxonomy-term-reference-0" rel="dc:subject"><a href="/drupal/taxonomy/term/4" typeof="skos:Concept" property="rdfs:label skos:prefLabel">RDF</a></li><li class="taxonomy-term-reference-1" rel="dc:subject"><a href="/drupal/taxonomy/term/5" typeof="skos:Concept" property="rdfs:label skos:prefLabel">Drupal</a></li></ul>
</div>
</div>  
</div>
  

Sometimes this is not enough control. For instance, if we mark our content type as an 'Event' and look up the Schema.org properties for an event, we see that we need to differentiate between the start date and the end date, but a single Date field can hold both a start and an end date. In that case we need to set the properties at a deeper level than the field level. Now that we have 'renderable arrays' in Drupal 7, one easy way to do this is to alter the renderable array and add '#attributes' in the desired locations. When the array is rendered, the attributes are added to the markup. (The Date module has been rewritten to adjust its attributes automatically if the Schema.org module is enabled, but other fields may need to be adjusted manually.)
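The idea of altering a nested structure before it is rendered can be sketched outside Drupal. The Python below is only an analogy, not Drupal's render API: the `render` and `add_event_dates` functions, the `#tag`, `#text`, and `#children` keys, and the sample dates are all assumptions made for illustration. Only the `schema:startDate` and `schema:endDate` property names come from the Schema.org Event type.

```python
from html import escape


def render(element):
    """Render a nested dict to HTML, writing '#attributes' onto the tag."""
    attrs = "".join(
        f' {name}="{escape(str(value), quote=True)}"'
        for name, value in element.get("#attributes", {}).items()
    )
    inner = escape(element.get("#text", "")) + "".join(
        render(child) for child in element.get("#children", [])
    )
    tag = element.get("#tag", "span")
    return f"<{tag}{attrs}>{inner}</{tag}>"


def add_event_dates(date_field):
    """Alter step: give the start and end values their own properties."""
    start, end = date_field["#children"]
    start.setdefault("#attributes", {})["property"] = "schema:startDate"
    end.setdefault("#attributes", {})["property"] = "schema:endDate"
    return date_field


# Hypothetical field holding a start and an end value (illustrative data).
date_field = {
    "#tag": "div",
    "#children": [
        {"#tag": "span", "#text": "08/16/2011 - 08:00"},
        {"#tag": "span", "#text": "08/16/2011 - 10:00"},
    ],
}

html_output = render(add_event_dates(date_field))
print(html_output)
```

The point of the sketch is the ordering: the attributes are attached to the individual values inside the field, and the renderer writes them out without knowing anything about events.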

Extended RDF Support

For more tools to support RDFa we can enable RDF Extensions, along with the RDFx UI that comes with it. If we do that and go to admin/structure/types and edit the content type, we will see how a site administrator can control the RDF information on this content type and on the fields in it. As with the Schema.org module, the Content Types administration screens now have places where we can set the property for the content type as a whole and for each field within it.

[Screenshot: the Basic page content type form]

[Screenshot: the Body field edit form]

Anyone who is not an RDF expert will probably have to do a fair bit of research to find the appropriate values to enter in these boxes. The answers will depend on what type of data we have and which RDF vocabularies we want to implement.
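It may help to remember that values such as 'dc:subject' or 'skos:Concept' are compact names: a vocabulary prefix followed by a term from that vocabulary. The Python sketch below expands them to full IRIs. The `PREFIXES` table lists the conventional namespace IRIs for the prefixes seen in the markup earlier; a page declares its own prefix bindings, so the table and the `expand` and `expand_all` helper names should be read as assumptions.

```python
# Conventional namespace IRIs for the prefixes used in the markup above.
# A given page declares its own bindings, so this table is an assumption.
PREFIXES = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "schema": "http://schema.org/",
    "sioc": "http://rdfs.org/sioc/ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(name):
    """Expand a compact 'prefix:term' name into a full IRI."""
    prefix, sep, term = name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {name!r}")
    return PREFIXES[prefix] + term


def expand_all(attribute_value):
    """An RDFa attribute may hold several space-separated names."""
    return [expand(name) for name in attribute_value.split()]


print(expand_all("rdfs:label skos:prefLabel"))
```

Looking up the expanded IRI is usually the fastest way to find the documentation for a term and decide whether it fits the field.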

Microdata Module

The Microdata module takes a similar approach, building on the core RDF API to enable microdata. As with the other modules described above, once this module is enabled, each field has a place on the field settings form where we can set microdata values.

[Screenshot: the Repeat field settings form]

The microdata information is then stored in the database, in a table created by this module. The module documentation points out that this might be problematic on sites that want this information in code, but also provides some hooks that can be used by other modules or custom code. The project page states that the API is still changing, so the details of how this works may change.

As with the RDF Extensions module, it will fall to us to research microdata to determine what vocabularies are available and what values are necessary.
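For orientation, the same type and property are written with different attributes in the two syntaxes. The Python sketch below puts them side by side for a Schema.org type and property. The helper names `rdfa_attributes` and `microdata_attributes` are hypothetical, and the output describes the general syntaxes rather than the exact markup either Drupal module generates.

```python
SCHEMA_ORG = "http://schema.org/"


def rdfa_attributes(item_type, prop):
    """RDFa: compact names with a 'schema' prefix, as in the markup above."""
    return {
        "item": {"typeof": f"schema:{item_type}"},
        "field": {"property": f"schema:{prop}"},
    }


def microdata_attributes(item_type, prop):
    """Microdata: a full type URL on the item, a bare name on the field."""
    return {
        # 'itemscope' is a boolean attribute, so it carries no value.
        "item": {"itemscope": "", "itemtype": SCHEMA_ORG + item_type},
        "field": {"itemprop": prop},
    }


for build in (rdfa_attributes, microdata_attributes):
    print(build.__name__, build("Event", "description"))
```

Either way, the research is the same: pick the type, then find the property names that type defines.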

Which Way to Go?

So just as there are multiple standards for handling structured data on the Web, there are multiple ways to implement them in Drupal.

Using RDF or Microdata to its fullest implies spending some time understanding the various vocabularies and standards that are available. One nice thing about the Schema.org module is that we can do all our research in one place. As they say in their FAQ:

"In creating schema.org, one of our goals was to create a single place where webmasters could go to figure out how to mark up their content, with reasonable syntax and style consistency across types. This way, webmasters only need to learn one thing rather than having to understand different, often overlapping vocabularies. A lot of the vocabulary on schema.org was inspired by earlier work like Microformats, FOAF, GoodRelations, OpenCyc, etc."

Which is best? That depends on what we are trying to accomplish:

  • If we just want to get some basic RDF functionality in place, with no concerns about extending it to custom fields, the core RDF module may suffice.
  • If we want to do enough to have the results utilized by Google and Yahoo, the Schema.org module is relatively easy to understand and implement.
  • For more complex RDF or Microdata applications, the RDFx and Microdata modules will provide more flexibility and most likely richer results.

Resources

Drupal Modules

RDF Extensions - adds functionality beyond what core provides, including a UI to control RDF settings by content type, plus extra APIs and serialization formats. This is the go-to module if you need more complex functionality than core provides, or if you just want to give site administrators some control over the process.

Some other modules extend the RDF Extensions module for even more options: SPARQL, VARQL, and RESTful Web Services.

The Schema.org module - implements the Schema.org standards using the RDF API provided by core. It has a UI to let an administrator attach properties to content types and fields (so you can say the node type is an 'Event' and set event properties on each field on that content type).

The Microdata module - an alternative that provides a UI for setting microdata information on each field. The microdata information is stored in a table in the database, and the module overrides the core RDF preprocessors, so it would be incompatible with any module that depends on them.

The Rich Snippets module is an attempt to make Google Rich Snippets work better. It hasn't seen any recent activity and may effectively be superseded by the Schema.org or Microdata modules.

Other Resources

Drupal's RDF Handbook Page.

Drupal's Semantic Web Group - lots of discussion about RDF and structured data.

The future of structured data in HTML: RDFa, Microdata and microformats - A great article outlining the competing formats/strategies and what it means to a Drupal site.

An article from IBM - 'The Semantic Web, Linked Data and Drupal, Part 1: Expose your data using RDF' describes a method to create a complex field using the Field Collection module and give its elements semantic information.
