by Karen Stevenson on August 31, 2011

How Does RDF Work in Drupal 7?

Incorporating structured data into your site

I decided to take advantage of the new built-in RDF support in Drupal 7. I spun up a new site, enabled the RDF module, and started digging around the administration panel to see where I could control what it was doing. And I found ... nothing. It turns out that adding RDF support to your site is more than a matter of simply enabling the RDF module. To get the best use out of it, you need a little additional information, and most likely another module or two.

The Drupal community has long discussed the need to make Drupal sites a vital part of the Semantic Web by adding structured data to the HTML. There are various ways to do that, using RDFa, microformats, or microdata, and there has been a lot of discussion about which would be better. Drupal 7 was finally released with RDFa support built in, and that functionality is extended by several contributed modules.

If you want to learn more about what RDF and structured data are all about, please dig into the resources listed below. For this article I'm assuming you already have at least a basic familiarity with structured data but want to figure out how to integrate these concepts into Drupal 7.

[Edit: There seems to be some confusion about what this article is intended to do. This article is not a "review" of the RDF modules, most of which are still in development. It is an outline of the solutions that exist at the moment and an attempt to help people understand 1) That nothing much will happen by just turning on the core RDF module, and 2) Where to start looking to learn more and add more functionality to your site.]

Core RDF Module

Let's start with a fresh Drupal 7 installation, enable the RDF module, and create a couple of articles. If we examine the source code for those articles we see the following:

<div class="content">
<div id="node-104" class="node node-article node-promoted node-full clearfix" about="/drupal/node/104" typeof="sioc:Item foaf:Document">
<div class="meta submitted">
<span property="dc:date dc:created" content="2011-08-16T08:16:17-05:00" datatype="xsd:dateTime" rel="sioc:has_creator">Submitted by <a href="/drupal/user/1" title="View user profile." class="username" xml:lang="" about="/drupal/user/1" typeof="sioc:UserAccount" property="foaf:name">admin</a> on Tue, 08/16/2011 - 08:16</span>   
</div>
<div class="content clearfix">
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>Suspendisse eget tristique dolor. Sed lobortis dictum purus non vulputate! Suspendisse nisi felis, pellentesque non vulputate adipiscing, euismod a justo. Nam lorem eros, consequat ut dapibus quis, faucibus vel enim. Aliquam erat volutpat. Nulla facilisi. Sed mattis risus quis libero auctor sit amet volutpat ligula rutrum! Cras sem nisl, tempor eu lacinia eget, pharetra ut massa.</p>
</div>
</div>
</div>
<div class="field field-name-field-tags field-type-taxonomy-term-reference field-label-above clearfix"><h3 class="field-label">Tags: </h3><ul class="links"><li class="taxonomy-term-reference-0" rel="dc:subject"><a href="/drupal/taxonomy/term/4" typeof="skos:Concept" property="rdfs:label skos:prefLabel">RDF</a></li><li class="taxonomy-term-reference-1" rel="dc:subject"><a href="/drupal/taxonomy/term/5" typeof="skos:Concept" property="rdfs:label skos:prefLabel">Drupal</a></li></ul>
</div> 
</div>

We can see that semantic information has been added to the source markup. For instance, the node as a whole has been marked as:

typeof="sioc:Item foaf:Document"

The 'Submitted by' line has acquired the following meta information:

property="dc:date dc:created" content="2011-08-16T08:16:17-05:00" datatype="xsd:dateTime" rel="sioc:has_creator"

And the taxonomy field has:

typeof="skos:Concept" property="rdfs:label skos:prefLabel"

If we look in the administration area and try to figure out where this is controlled, we won't see anything. It is entirely controlled in code; there is no UI for a site administrator to alter it. We will also see that only some basic core fields have meta information, while most new custom fields that we add to the node form have none.
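Since there is no UI to inspect, one way to see exactly which elements carry RDFa attributes is to scan the page source for them. The following is a minimal sketch in Python, not part of Drupal. It uses only the standard library's html.parser, the sample string is a trimmed copy of the article markup above, and the names RdfaLister and list_rdfa are made up for this illustration.

```python
from html.parser import HTMLParser

# RDFa attributes that appear in the markup produced by the core RDF module.
RDFA_ATTRS = ("about", "typeof", "property", "rel", "content", "datatype")

# Trimmed copy of the node markup shown earlier in this article.
SAMPLE = """
<div id="node-104" class="node node-article" about="/drupal/node/104" typeof="sioc:Item foaf:Document">
  <span property="dc:date dc:created" content="2011-08-16T08:16:17-05:00" datatype="xsd:dateTime" rel="sioc:has_creator">Submitted by
    <a href="/drupal/user/1" about="/drupal/user/1" typeof="sioc:UserAccount" property="foaf:name">admin</a>
  </span>
  <div class="field-item even" property="content:encoded"><p>Body text.</p></div>
  <li class="taxonomy-term-reference-0" rel="dc:subject"><a href="/drupal/taxonomy/term/4" typeof="skos:Concept" property="rdfs:label skos:prefLabel">RDF</a></li>
</div>
"""


class RdfaLister(HTMLParser):
    """Collect (tag, {attribute: value}) for every element with RDFa attributes."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # Keep only the attributes that carry RDFa information.
        rdfa = {name: value for name, value in attrs if name in RDFA_ATTRS}
        if rdfa:
            self.found.append((tag, rdfa))


def list_rdfa(html):
    """Return the RDFa-bearing elements of an HTML string, in document order."""
    parser = RdfaLister()
    parser.feed(html)
    parser.close()
    return parser.found


for tag, rdfa in list_rdfa(SAMPLE):
    print(tag, rdfa)
```

Running this against a saved copy of a real page, rather than the trimmed sample, gives a quick inventory of which fields have mappings and which custom fields have none.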

If we want to build on this or give control to site administrators, we will need some custom code or contributed modules. At this point we have to decide what we are trying to accomplish and which standards we want to support.

Schema.org

If we are mostly concerned about making sure that Google and Yahoo understand our markup, the newest solution would be to comply with the standards at Schema.org.

Fortunately there is a Drupal module for that, the Schema.org module. If we enable that module along with the core RDF module, and examine the source code of the page, we won't see any changes. But if we go to the administration area for the content type there are some new options for the site administrator.

To do something intelligent with this, we can go to http://schema.org/docs/full.html where we will see a list of the 'Types' that are available. So we could mark this as a 'type' of 'Blog' or 'Book' or 'Event'. The title property depends on the type, but generally would be 'name'.

We can then set properties on individual fields by clicking on the 'Edit' link for the field in the 'Manage Fields' screen for this content type. At the bottom of the field edit screen we will see a new textfield where we can input the name of the property that this field should represent within the context of the type of content we set in the previous step.

After these changes, if we view the source of our article, we now see new 'schema' attributes in the markup, both at the top level, for the document, and on the body field where we set the property to 'description':

<div class="content">
<div id="node-104" class="node node-article node-promoted node-full clearfix" about="/drupal/node/104" typeof="schema:Event sioc:Item foaf:Document">
<div class="meta submitted">
<span property="dc:date dc:created" content="2011-08-16T08:16:17-05:00" datatype="xsd:dateTime" rel="sioc:has_creator">Submitted by <a href="/drupal/user/1" title="View user profile." class="username" xml:lang="" about="/drupal/user/1" typeof="sioc:UserAccount" property="foaf:name">admin</a> on Tue, 08/16/2011 - 08:16</span>    </div>
<div class="content clearfix">
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="schema:description content:encoded"><p>Suspendisse eget tristique dolor. Sed lobortis dictum purus non vulputate! Suspendisse nisi felis, pellentesque non vulputate adipiscing, euismod a justo. Nam lorem eros, consequat ut dapibus quis, faucibus vel enim. Aliquam erat volutpat. Nulla facilisi. Sed mattis risus quis libero auctor sit amet volutpat ligula rutrum! Cras sem nisl, tempor eu lacinia eget, pharetra ut massa.</p>
</div></div></div><div class="field field-name-field-tags field-type-taxonomy-term-reference field-label-above clearfix"><h3 class="field-label">Tags: </h3><ul class="links"><li class="taxonomy-term-reference-0" rel="dc:subject"><a href="/drupal/taxonomy/term/4" typeof="skos:Concept" property="rdfs:label skos:prefLabel">RDF</a></li><li class="taxonomy-term-reference-1" rel="dc:subject"><a href="/drupal/taxonomy/term/5" typeof="skos:Concept" property="rdfs:label skos:prefLabel">Drupal</a></li></ul>
</div>
</div> 
</div>
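Because the two listings in this article are long and nearly identical, it can help to compare them mechanically. The sketch below is only an illustration in Python using the standard library. The BEFORE and AFTER strings are trimmed copies of the two listings, and TokenCollector and rdfa_terms are hypothetical names, not anything provided by Drupal or the Schema.org module.

```python
from html.parser import HTMLParser


class TokenCollector(HTMLParser):
    """Gather the space-separated terms found in typeof and property attributes."""

    def __init__(self):
        super().__init__()
        self.tokens = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("typeof", "property") and value:
                self.tokens.update(value.split())


def rdfa_terms(html):
    """Return the set of terms used in typeof/property attributes of an HTML string."""
    collector = TokenCollector()
    collector.feed(html)
    collector.close()
    return collector.tokens


# Trimmed copies of the two listings in this article: the markup before and
# after the Schema.org settings were saved.
BEFORE = """
<div about="/drupal/node/104" typeof="sioc:Item foaf:Document">
  <div class="field-item even" property="content:encoded"><p>Body text.</p></div>
</div>
"""
AFTER = """
<div about="/drupal/node/104" typeof="schema:Event sioc:Item foaf:Document">
  <div class="field-item even" property="schema:description content:encoded"><p>Body text.</p></div>
</div>
"""

# Terms present after the change that were not there before.
added = sorted(rdfa_terms(AFTER) - rdfa_terms(BEFORE))
print(added)  # ['schema:Event', 'schema:description']
```

The difference shows that the new terms were added alongside the core mappings rather than replacing them.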

Sometimes this is not enough control. For instance, if we mark our content type as an 'Event' and look up the Schema.org properties for an event, we see we need to differentiate between the start date and the end date, but a Date field comprises both a start and an end date. In that case we need to set the properties at a deeper level than the field level. Now that we have 'renderable arrays' in Drupal 7, one easy way to do this is to alter the renderable array to add '#attributes' in the desired locations. When the array is rendered, the attributes will be added to the markup. (The Date module has been rewritten to adjust its attributes automatically if the Schema.org module is enabled, but other fields may need to be adjusted manually.)
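The idea of attaching attributes below the field level can be sketched outside Drupal. The following Python model is not Drupal's Render API (which is PHP and differs in detail). It only imitates the general shape of a nested array whose '#attributes' entries are emitted as HTML attributes when the array is rendered. The '#tag' key, the date_field structure, and the function names are assumptions made up for this illustration. The startDate and endDate properties are taken from the Schema.org Event type.

```python
from html import escape


def render(element):
    """Render a nested dict, emitting its '#attributes' as HTML attributes.

    Keys starting with '#' are properties; all other keys are child elements.
    This imitates the general shape of a Drupal 7 renderable array only.
    """
    attrs = "".join(
        ' {}="{}"'.format(name, escape(str(value), quote=True))
        for name, value in element.get("#attributes", {}).items()
    )
    children = "".join(
        render(child) for key, child in element.items() if not key.startswith("#")
    )
    tag = element.get("#tag", "span")
    return "<{0}{1}>{2}{3}</{0}>".format(
        tag, attrs, escape(element.get("#markup", "")), children
    )


# Hypothetical date field holding a start and an end value.
date_field = {
    "#tag": "div",
    "start": {"#markup": "Tue, 08/16/2011 - 08:16"},
    "end": {"#markup": "Tue, 08/16/2011 - 10:00"},
}


def add_event_dates(element):
    """Attach a different Schema.org property to each part of the field."""
    element["start"].setdefault("#attributes", {})["property"] = "schema:startDate"
    element["end"].setdefault("#attributes", {})["property"] = "schema:endDate"
    return element


html = render(add_event_dates(date_field))
print(html)
```

The point of the sketch is that the alteration happens on the data structure before rendering, so each part of a compound field can receive its own property without touching the template.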

Extended RDF Support

For more tools to support RDFa we can enable RDF Extensions, along with the RDFx UI that comes with it. If we do that and go to admin/structure/types and edit the content type, we will see how a site administrator can control the RDF information on this content type and on the fields in it. In much the same way that the Schema.org module worked, we will have places where we can set the property for the content type as a whole and the fields within it on the Content Types administration screens.

Anyone who is not an RDF expert will probably have to do a fair bit of research to find the appropriate values to enter in these boxes. The answers will depend on what type of data we have and which RDF vocabularies we want to implement.

Microdata Module

The Microdata module takes a similar approach to build on the core RDF API to enable microdata. As with the other modules described above, once this module is enabled, each field has a place on the field settings form where we can set microdata criteria.

The microdata information is then stored in the database, in a table created by this module. The module documentation points out that this might be problematic on sites that want this information in code, but also provides some hooks that can be used by other modules or custom code. The project page states that the API is still changing, so the details of how this works may change.

As with the RDF Extensions module, it will fall to us to research microdata to determine what vocabularies are available and what values are necessary.

Which Way to Go?

So just as there are multiple standards for handling structured data on the Web, there are multiple ways to implement them in Drupal.

Using RDF or Microdata to its fullest implies spending some time understanding the various vocabularies and standards that are available. One nice thing about the Schema.org module is that we can do all our research in one place. As they say in their FAQ:

In creating schema.org, one of our goals was to create a single place where webmasters could go to figure out how to mark up their content, with reasonable syntax and style consistency across types. This way, webmasters only need to learn one thing rather than having to understand different, often overlapping vocabularies. A lot of the vocabulary on schema.org was inspired by earlier work like Microformats, FOAF, GoodRelations, OpenCyc, etc.

Which is best? That depends on what we are trying to accomplish:

  • If we just want to get some basic RDF functionality in place, with no concerns about extending it to custom fields, the core RDF module may suffice.
  • If we want to do enough to have the results utilized by Google and Yahoo, the Schema.org module is relatively easy to understand and implement.
  • For more complex RDF or Microdata applications, the RDFx and Microdata modules will provide more flexibility and most likely richer results.

Resources

Drupal Modules

RDF Extensions - adds functionality beyond that provided by core, including a UI to control RDF settings by content type and extra APIs and serialization formats. This is the go-to module if you need more complex functionality than what core provides or you just want to give site administrators some control over the process.

Some other modules extend the RDF Extensions module for even more options: SPARQL, VARQL, and RESTful Web Services.

The Schema.org module - implements the Schema.org standards using the RDF API provided by core. It has a UI to let an administrator attach properties to content types and fields (so you can say the node type is an 'Event' and set event properties on each field on that content type).

The Microdata module - an alternative that provides a UI for setting microdata information on each field. The microdata information is stored in a table in the database and the module overrides the core RDF preprocessors, so it would be incompatible with any module that depends on them.

The Rich Snippets module is an attempt to make Google Rich Snippets work better. It hasn't seen any recent activity and may effectively be deprecated by the Schema.org or Microdata modules.

Other Resources

Drupal's RDF Handbook Page.

Drupal's Semantic Web Group - lots of discussion about RDF and structured data.

The future of structured data in HTML: RDFa, Microdata and microformats - A great article outlining the competing formats/strategies and what it means to a Drupal site.

An article from IBM, The Semantic Web, Linked Data and Drupal, Part 1: Expose your data using RDF, describes a method for creating a complex field with the Field Collection module and giving its elements semantic information.

Karen Stevenson

Senior Drupal Architect


Comments

Adam S

An example of RDF in a proof of concept website

Take the example of a small to medium business that wants a brochure-style website. A lot of businesses are catching on that it simply doesn't pay to spend a lot of money on a website. Businesses like restaurants generate much more traffic from websites like yelp.com or facebook.com. However, with applications of RDF we will be able to change that perception.

With RDF there is less guessing and more knowing. For example, Google is just a tool that returns a list of best guesses. Now Google has the ability to know, which translates into things like search results being presented to the client with star ratings attached or, more importantly, images indexed by Google showing at the top of image searches because Google now knows exactly what information is attached to the image or what the image is about.

Back to the case of the restaurant website. With RDF we can now build tools that centralize the entry point of information about a restaurant and propagate any changes throughout the internet. Businesses don't mind having websites that look very similar to other businesses. Sony, for example, is referring people to facebook.com/sony on TV ads rather than their website. What we can do now with Drupal is create several templates for different types of businesses that can be built easily, and then make a directory of those websites or other tools, like a job board, where the business's information and available jobs are maintained in one place and communicated to each other and to Google through rich snippets and RDF.

What this means is much less guessing and a lot more knowing.

Google has a tool that allows people to check how the Google bot sees RDF and other rich snippets markup. I'm including the link with an example from my recently released experimental website so people can see how RDF looks in the wild. (link)

Google also provides examples of how to mark up hCards, reviews of businesses, event dates and product reviews on e-commerce websites.

Corbacho

Google +1 button microdata

This post is a great summary. Thanks Karen! I'm maintaining the Google+1 button module, so this came at the right moment.

Just yesterday, Google +1 button released the feature of sharing in your stream when clicking on the Google +1 button.

The way to control what information is shown in that snippet is with +Snippet. See full doc here

It seems Google has already chosen: schema.org. From this FAQ:

Why microdata? Why not RDFa or microformats?

Historically, we’ve supported three different standards for structured data markup: microdata, microformats, and RDFa. Instead of having webmasters decide between competing formats, we’ve decided to focus on just one format for schema.org.

Lin Clark

I think there is some confusion with this. Google didn't choose schema.org over anything else... Google is one of the creators of schema.org, along with Bing and Yahoo. Schema.org is not an alternative to RDFa or microdata.

Schema.org just defines a set of terms. Schema.org terms can be placed either using RDFa or microdata, but AFAIK Google only plans to process microdata.

The Schema.org module for Drupal places terms using RDFa. Google does not process this (however, Yahoo and Bing have indicated that they plan to process RDFa, and may already). These things can be tested with Google's testing tools, though they may not surface in the actual live search results for a while.

I can't stress enough how important it is to test your markup in testing tools such as Google's... even if a module developer assures you that this is The Right Way (TM). There is unfortunately a lot of FUD and politicking going on in this space, so it can be hard to tell what to do.

If anyone has questions regarding these standards that they think I can help answer, I'm usually available via groups.drupal.org or on IRC.

Lin Clark

I'm the maintainer (or co-maintainer) of most of these modules. I'm usually available on IRC if you have questions about them.

A few notes:

  • I will be posting a screencast about the microdata module soon. It looks like you were using an earlier version of the module; the UI changed before DrupalCon.
  • I discuss the attribute placement problems you mention for compound fields in Microdata in Drupal: challenges for field formatters. IMHO, the best way to solve this problem is to rely on the Entity Property API. This support is built in to microdata and I think should (once finalized there) be moved over to RDFx.
  • "As with the RDF Extensions module, it will fall to us"... not exactly. In microdata, there is an API for modules to define default mappings. I have already been working with some field formatter developers and plan to be available to all contrib field formatter developers to help with this.
  • IIRC, there are no plans to actively maintain VARQL (no commits since the first week it was posted in Dec). It was an alternate approach to SPARQL Views, using the original D6 code as the base. I incorporated the ideas into SPARQL Views for D7. SV has a number of tutorials and is being actively developed.
  • Microdata module keeps the RDFa from being inserted in the HTML, but not the RDF mapping itself. I plan to make removing the RDFa optional, but most people using microdata do not want RDFa.

KarenS

This article was written from the perspective of someone who is most definitely *not* an RDF expert. I ended up doing an extraordinary amount of research and was still thoroughly confused about what I should be doing, either as a site builder or as a module developer. So what I hoped to convey was at least a path forward for people who want to figure this out.

Your article looks great (I did not find that in my research). I made the best effort I could to understand how to incorporate microdata into the Date module. It's expecting a lot to assume that all module maintainers will be able to do that kind of research and draw the right conclusions, so hopefully people who *do* understand this will be actively posting patches on the issue queues to make it easier.

My comment about it falling to the site builder to understand what is going on is certainly true at the moment. Hopefully more and more parts of this will 'just work' in the future without requiring everyone to be a microdata expert.

Lin Clark

I do not think it makes sense to review the module as if it were complete or was meant for widespread use yet. Microdata module just went into API review at DrupalCon (big thanks to JohnAlbin, Deciphered, jhodgdon and arianek for their feedback).

I've tried to make it as clear as possible that it isn't ready for use yet (though it is now starting to get close for basic use cases).

This is still an early stage project. It is not complete and the API may change. It should not be used on production sites at this point.

I've also already reached out to a number of field formatter developers, posting patches in their queues.

I am very accessible in IRC and through d.o... if you have any questions, please feel free to ask.

KarenS

I'm not 'reviewing' these modules at all; I'm just saying these are the alternative ways to go ATM. I pointed out that the API is still changing. If it makes you feel better I will add additional disclaimers, but the point of this whole article is that it is not at all obvious even where to start if you want to do anything with RDF. I don't think there is anywhere I say that 'XYZ' is the fully-baked solution to anything. I'm just trying to tell people that nothing magic will happen by just turning on the core RDF module, and that they need to start looking at these other modules for more information.

gmclelland

Thank you for sharing this information. It definitely helps in understanding how this technology integrates with Drupal.

Bookmarked

Lin Clark

Open Graph Protocol uses RDFa. However, Facebook has very particular requirements on where in the page the RDFa can be placed.

I attended a Facebook BoF at DrupalCon Chicago where I explained to one of the Facebook module developers how to use the Render API to insert the Facebook RDFa (unfortunately, you can't really use the RDF API for this because of Facebook's limitation on placement). He hadn't ported his module to D7 at the time; I don't know whether he has since.
