RDFa with schema.org codelab: Book - embedded types

By Dan Scott

About this codelab

In this codelab, you're going to take the structured data in your catalogue page from simple string values to embedded types with properties. You will use the schema.org vocabulary and express it via RDFa attributes.

Audience: Beginner

Prerequisites: To complete this codelab, you will need a basic familiarity with HTML. The exercises can be found in codelab.zip, with the solutions in the rdfa_exercises subdirectory. There are frequent checkpoints throughout the codelab, so if you get stuck at any point, you can use the checkpoint file to resume and work through this codelab at your own pace.

Embedded types

So far you have described the page using a single type and a handful of properties. However, when you added the @property="author" attribute, the expected value for the property (the range) was not a simple text string; it was supposed to be an entity of either the Person or the Organization type.
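
For reference, the author markup carried over from the previous exercise probably looks something like the sketch below. It is reconstructed from the first checkpoint in this exercise by leaving out the typeof attribute, so your own file may differ. With only a property attribute on the element, an RDFa parser reads the author as the plain text inside the cell rather than as an entity.

```html
<!-- Assumed starting point: author is only a text value at this stage -->
<td class="recordAuthor" property="author">
  <a href="/Author?lookfor=%22Herzberg%2C+Agnes+M%22">Herzberg, Agnes M</a>
</td>
```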

In this exercise, you will add several embedded entities to the page to conform to the vocabulary definition and make your structured data even more useful.

Continue working with the HTML file that you have been editing so far, or for a fresh start, copy ../1_book/step2/check_c.html into a new file.

Define the Person entity

Your @property="author" attribute needs to define a Person entity to satisfy the expected value of author. Add the @typeof="Person" attribute to the same HTML element so that, in one step, you define the author property for the overall Book entity and start a new Person entity scope.

Check your markup
<!DOCTYPE html>
...
<body vocab="http://schema.org/" typeof="Book">
...
    <tr valign="top">
      <th>Autor:</th>
      <td class="recordAuthor" property="author" typeof="Person">
        <a href="/Author?lookfor=%22Herzberg%2C+Agnes+M%22">Herzberg, Agnes M</a>
      </td>
    </tr>
    <tr valign="top">
      <th>Weitere Autoren:</th>
      <td class="recordSecAuthor" property="author" typeof="Person">
        <a href="/Author?lookfor=%22Andrews%2C+David+F%22">Andrews, David F</a>
      </td>
    </tr>
...

Define basic properties of the Person entity

Now that you have defined a Person entity, you can define specific properties for it.

Declare that the person's name is the name property of the Person entity.

Tip: Remember that you might need to add <span> tags to create a new scope for the properties that you want to add.

Check your markup
<!DOCTYPE html>
...
<body vocab="http://schema.org/" typeof="Book">
...
    <tr valign="top">
      <th>Autor:</th>
      <td class="recordAuthor" property="author" typeof="Person">
        <span property="name">
          <a href="/Author?lookfor=%22Herzberg%2C+Agnes+M%22">Herzberg, Agnes M</a>
        </span>
      </td>
    </tr>
    <tr valign="top">
      <th>Weitere Autoren:</th>
      <td class="recordSecAuthor" property="author" typeof="Person">
        <span property="name">
          <a href="/Author?lookfor=%22Andrews%2C+David+F%22">Andrews, David F</a>
        </span>
      </td>
    </tr>
...

Declare that the same Person is both the author and copyrightHolder

Copyright matters both to creators and to the organizations and individuals who want to reuse or republish a work, so schema.org naturally includes a copyrightHolder property that you can apply. In this case, however, the author and the copyrightHolder are one and the same, and the element already carries a @property attribute.

To declare several properties in a single @property attribute, list them separated by whitespace. In this case, edit the HTML to declare @property="author copyrightHolder" and check your work in one or more structured data validators.
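
As a rough sketch, the first author cell from the previous checkpoint might look like this once both properties are declared. Only the property attribute changes; everything else is copied from the earlier markup.

```html
<!-- One element, two properties: the Person is both author and copyrightHolder of the Book -->
<td class="recordAuthor" property="author copyrightHolder" typeof="Person">
  <span property="name">
    <a href="/Author?lookfor=%22Herzberg%2C+Agnes+M%22">Herzberg, Agnes M</a>
  </span>
</td>
```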

Note: These are still relatively early days for structured data validators, and their output varies for more esoteric cases like multi-valued attributes. For example, the Structured Data Linter recognizes the second value for copyrightHolder but generates a "blank node" identifier for it, whereas Google's Structured Data Testing Tool only recognizes the last value of the multi-valued attribute. To complicate matters further, the search engines acknowledge that their testing tools have bugs and do not always behave the way their production parsers do, so don't be overly alarmed if it seems like your markup is not being recognized by the testing tool.

Use the @resource attribute to group assertions for an entity

Sometimes your HTML document does not group all of the content in such a way that you can cleanly keep all of the attributes for a given instance of an entity within a single scope. In these cases, you may be able to use the @resource attribute to logically group the properties for that instance.

For example, when you added the @typeof="Person" declaration for the author, the name of the author was separated from your existing Person instance by the <a> element in the middle. The new scope that the <a> element introduces makes it a bit more difficult to mark up the familyName and givenName of the author.

To resolve the problem, add a @resource attribute to your existing Person declaration. The value of the new attribute should be unique on this page; use #author1 for the sake of simplicity.

Then add a wrapping <span> element around the name of the author inside the <a> element, including a @resource attribute with a value of #author1 to match what you added above. This creates a new scope for the existing entity, such that any properties declared within this new scope will be added to that entity.

Now add another <span> element inside the newly scoped #author1 resource, and declare it to be the name property. For bonus points, you can nest the givenName and familyName properties inside of the name property.

Check your markup
<!DOCTYPE html>
...
<body vocab="http://schema.org/" typeof="Book">
...
    <tr valign="top">
      <th>Autor:</th>
      <td class="recordAuthor" property="author copyrightHolder" typeof="Person" resource="#author1">
        <a href="/Author?lookfor=%22Herzberg%2C+Agnes+M%22">
          <span resource="#author1">
            <span property="name">
              <span property="familyName">Herzberg</span>,
              <span property="givenName">Agnes M</span>
            </span>
          </span>
        </a>
      </td>
    </tr>
...
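
Stripped of the catalogue page's table markup, the @resource pattern can be sketched in isolation as follows. The identifier #person1, the name "Example Author", and the job title are hypothetical values chosen for illustration. The jobTitle property belongs to the schema.org Person type but is not part of the exercise.

```html
<div vocab="http://schema.org/" typeof="Book">
  <!-- First scope: creates the Person and gives it the identifier #person1 -->
  <p property="author" typeof="Person" resource="#person1">
    <span property="name">Example Author</span>
  </p>

  <p>Unrelated content that interrupts the description of the person.</p>

  <!-- Second scope: same identifier, so this property is added to the same Person -->
  <p resource="#person1">
    <span property="jobTitle">Statistician</span>
  </p>
</div>
```

Because both scopes share the same @resource value, a parser should report a single Person with both a name and a jobTitle rather than two separate entities.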

Describe the publisher as an Organization type

So far you have not provided any value for the publisher property, which tends to be important for creative works. The publisher documentation shows that the expected range is Organization, which in turn has child types such as Corporation.

  1. Define a new Corporation entity with the name of the publisher as the name property.
  2. Add a location property for the Corporation entity. Notice that the expected range is a type of either Place or PostalAddress. Use a PostalAddress entity, filling in the addressLocality and addressRegion properties.

Check your markup
<!DOCTYPE html>
...
<body vocab="http://schema.org/" typeof="Book" resource="#book">
...
    <tr valign="top">
      <th>Veröffentlicht:</th>
      <td>
        <span property="publisher" typeof="Corporation">
          <span property="location" typeof="PostalAddress">
            <span property="addressLocality">New York</span> [u.a.]
          </span> : <span property="name">Springer</span>,
        </span>
        <time property="datePublished">1985</time>
      </td>
    </tr>
...
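
Step 2 mentions addressRegion, but the catalogue record only supplies a locality, so the check markup leaves the region out. If you wanted to fill it in anyway, a sketch might look like the following. The value NY is an assumption added for illustration and does not appear in the record.

```html
<span property="publisher" typeof="Corporation">
  <span property="location" typeof="PostalAddress">
    <span property="addressLocality">New York</span>,
    <!-- Assumed value: the original record does not state a region -->
    <span property="addressRegion">NY</span>
  </span> : <span property="name">Springer</span>
</span>
```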

Bonus exercise: Add a second author of type Person

There is a second author for this book, David Andrews, who should be reflected in the machine-readable markup.

  1. Define the new Person entity with the name, givenName, and familyName properties.

Check your markup
<!DOCTYPE html>
...
<body vocab="http://schema.org/" typeof="Book">
...
    <tr valign="top">
      <th>Weitere Autoren:</th>
      <td class="recordSecAuthor" property="author copyrightHolder" typeof="Person" resource="#author2">
        <a href="/Author?lookfor=%22Andrews%2C+David+F%22">
          <span resource="#author2">
            <span property="name">
              <span property="familyName">Andrews</span>,
              <span property="givenName">David F</span>
            </span>
          </span>
        </a>
      </td>
    </tr>
...

Checkpoint: Your HTML page should now look like ../1_book/step3/check_d.html

Lessons learned

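In this exercise, you learned:

  • how to turn a property value into an embedded entity by adding @typeof alongside @property
  • how to assign several properties at once with a whitespace-delimited list in @property
  • how to use @resource to group the properties of one entity across separate scopes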

Next codelab: Book - strings to things

About the author

Dan Scott is a systems librarian at Laurentian University.

Informational resources

  • RDFa Lite (W3C Recommendation) - a marvel of technical writing, this is a specification written as a concise, extremely useful tutorial
  • schema.org - the source for the vocabulary types and definitions, although the examples all use microdata or JSON-LD instead of RDFa Lite
  • RDFa Primer (W3C Working Group Note) - a more in-depth RDFa tutorial that covers properties beyond RDFa Lite; the additional examples may help clarify how RDFa Lite works (really, you don't need anything beyond RDFa Lite!)
  • Heath, Tom; Bizer, Christian. Linked data: Evolving the Web into a Global Space - a book (freely available on the web) that goes into depth to cover the principles, patterns, and best practices for publishing linked data on the web

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.