schema.org, Wikidata, Knowledge Graph: strands of the modern semantic web

Posted on Sun 12 February 2017 in Linked Open Data

My slides from Ohio DevFest 2016: schema.org, Wikidata, Knowledge Graph: strands of the modern semantic web

And the video, recorded and edited by the incredible, amazing Patrick Hammond:

In November, I had the opportunity to speak at Ohio DevFest 2016. One of the organizers, Casey Borders, had invited me to talk about schema.org, structured data, or something in that subject area, based on a talk about schema.org and RDFa he had seen me give at the DevFest Can-Am in Waterloo a few years prior. Given the Google-oriented nature of the event and the 50-minute time slot, I opted to add in coverage of the Google Knowledge Graph and its API, which I had been exploring from time to time since its launch in late 2014.

Alas, the Google Knowledge Graph Search API is still quite limited: it returns minimal data in comparison to the rich cards that you see in regular Google search results. Beyond a name and a short description, the JSON results include links only for an image, a corresponding Wikipedia page, and the ID of the entity. I also uncovered errors that had lurked in the documentation for quite some time; happily, the team quickly responded to correct those problems.
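To give a sense of how little comes back, here is a minimal Python sketch that builds a search request URL and pulls those few links out of a response. The endpoint and field names (`itemListElement`, `result`, `@id`, `image.contentUrl`, `detailedDescription.url`) follow the API documentation as I understand it; the helper names, the API key, and the sample response are hypothetical placeholders rather than real output.

```python
import urllib.parse

# Knowledge Graph Search API endpoint (per Google's documentation).
KG_SEARCH_URL = "https://kgsearch.googleapis.com/v1/entities:search"


def build_search_url(query, api_key, limit=1):
    """Return the GET URL for an entity search (api_key is your own key)."""
    params = {"query": query, "key": api_key, "limit": limit}
    return KG_SEARCH_URL + "?" + urllib.parse.urlencode(params)


def summarize(response):
    """Extract the ID, name, image link, and Wikipedia link from each result."""
    rows = []
    for element in response.get("itemListElement", []):
        result = element.get("result", {})
        rows.append({
            "id": result.get("@id"),
            "name": result.get("name"),
            "image": result.get("image", {}).get("contentUrl"),
            "wikipedia": result.get("detailedDescription", {}).get("url"),
        })
    return rows


# Hand-written placeholder shaped like an API response; not real data.
SAMPLE_RESPONSE = {
    "@type": "ItemList",
    "itemListElement": [{
        "@type": "EntitySearchResult",
        "result": {
            "@id": "kg:/m/0xxxxx",
            "name": "Example Entity",
            "image": {"contentUrl": "https://example.org/image.jpg"},
            "detailedDescription": {
                "url": "https://en.wikipedia.org/wiki/Example",
            },
        },
    }],
}

if __name__ == "__main__":
    print(build_search_url("Example Entity", "YOUR_API_KEY"))
    print(summarize(SAMPLE_RESPONSE))
```

Fetching the URL with any HTTP client and passing the decoded JSON to `summarize` should yield one small dictionary per matching entity, which is about all the API has to offer.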

So I dug back in time and also covered Freebase, the database of linked and structured data that both allowed individual contributions and made its data freely available, until it was purchased by Google, fed into the Knowledge Graph, and shut down. Not many people knew what we had once had until it was gone (Ed Summers did, for one), but such is the way of commercial entities.

In that context, Wikidata looks something like the Second Coming of an open (for contribution and access) linked and structured database, with sustainability derived financially from the Wikimedia Foundation and structurally from its role in underpinning Wikipedia and Wikimedia Commons. Google also did a nice thing by putting resources into adding the appropriately licensed data they could liberate from Freebase: approximately 19 million statements and IDs.

The inclusion of Google Knowledge Graph IDs in Wikidata means that we can use the Google Knowledge Graph Search API to find an entity ID, then pull the corresponding richer data from Wikidata for that ID to populate relationships and statements. You can get there from here! Ultimately, my thesis is that Wikidata can and will play a very important role in the modern (much more pragmatic) semantic web.
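The second half of that hop can be sketched in Python along the following lines. This is an illustration under stated assumptions, not the code from the talk: it assumes the ID arrives in the `kg:/m/...` form that the Knowledge Graph Search API returns, that Wikidata stores such IDs under the properties P646 (Freebase ID) and P2671 (Google Knowledge Graph ID), and that the public SPARQL endpoint at query.wikidata.org is reachable. The function names and the User-Agent string are hypothetical.

```python
import json
import re
import urllib.parse
import urllib.request

# Public Wikidata SPARQL endpoint.
WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"


def kg_id_to_mid(kg_id):
    """Strip the 'kg:' prefix, e.g. 'kg:/m/0abc' becomes '/m/0abc'."""
    return kg_id[3:] if kg_id.startswith("kg:") else kg_id


def build_query(mid):
    """Build a SPARQL query matching the ID against P646 or P2671."""
    # Only accept simple /m/... or /g/... IDs so that nothing unexpected
    # gets interpolated into the query text.
    if not re.fullmatch(r"/[mg]/[0-9a-z_]+", mid):
        raise ValueError("unexpected ID format: %r" % mid)
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        '  VALUES ?mid { "%s" }\n'
        "  ?item wdt:P646|wdt:P2671 ?mid .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}" % mid
    )


def wikidata_items(kg_id):
    """Return (entity URI, English label) pairs for a Knowledge Graph ID."""
    query = build_query(kg_id_to_mid(kg_id))
    url = WIKIDATA_SPARQL + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    # Wikimedia asks clients to identify themselves; this value is a placeholder.
    request = urllib.request.Request(
        url, headers={"User-Agent": "kg-to-wikidata-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return [
        (b["item"]["value"], b.get("itemLabel", {}).get("value"))
        for b in data["results"]["bindings"]
    ]
```

Calling `wikidata_items` with an ID taken from a search result should return the matching Wikidata entity URI, and from there every statement and relationship on that entity is one more query away.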