Wikidata, Canada 150, and music festival data

Posted on Fri 02 June 2017 in Linked Open Data

Following my workshop at the Wikipedia/Canadian Music preconference, I had the opportunity to present "Wikidata, Music, and Community: Leveraging Local Music Festival Data" with Stacy Allison-Cassin. Our audience was a more general group of music librarians--most of whom had never heard of Wikidata--and our goal was to explain why we advocate using Wikidata as one of the repositories of data about Canadian music festivals.

Our central argument was that, rather than focusing on directly enhancing our own local data repository silos (for example, library catalogues, digital exhibits), libraries and archives should invest their limited resources in enriching Wikidata, a centralized data repository, to maximize the visibility of those entities and the reusability of that data in the world at large… and then pull that data back into our local repositories to enrich our displays and improve our integration with the broader world of data.

Having heard from colleagues at the Evergreen conference in April that they were tired of hearing about the promise of linked data and wanted to see some actual demonstrable value for users, I showed a proof of concept that I had implemented for Laurentian University's catalogue. Any record recognized as a "music album" adds a musical note to the primary contributor's name; clicking that note queries Wikidata for a band or musician with a matching name and displays a subset of the available data, such as a description, an image, and a link to their website. In the following image you can see the result of pulling up a record for the fine Canadian band Men Without Hats and clicking on the musical note:

Catalogue record for Men Without Hats displaying an image of the band and data such as their website URL

It is a simple example: the user experience could be greatly improved, and it would be far better if we used the Wikidata entity ID as the authority control value in the underlying records to avoid any ambiguities in the cases of bands or musicians that have identical names, but for a quick hack put together over a few hours, I'm pretty happy with the results. The code is available, of course :)
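To give a rough idea of what a name-based lookup like this involves, here is a minimal Python sketch. It is not the code from the catalogue: the function names (`build_query`, `summarize`, `lookup`), the User-Agent string, and the exact shape of the query are my own assumptions for illustration. It asks the public Wikidata SPARQL endpoint for items whose label matches a name and that are either a musical group (Q215380) or have musician (Q639669) as an occupation, then keeps the description, image (P18), and official website (P856) when they exist.

```python
import json
import urllib.parse
import urllib.request

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(name, lang="en"):
    """Build a SPARQL query for bands or musicians with an exactly matching label."""
    # Escape backslashes and double quotes so the name is a valid string literal.
    safe = name.replace("\\", "\\\\").replace('"', '\\"')
    return f"""
SELECT DISTINCT ?item ?description ?image ?website WHERE {{
  ?item rdfs:label "{safe}"@{lang} .
  {{ ?item wdt:P31/wdt:P279* wd:Q215380 . }}   # instance of (a subclass of) musical group
  UNION
  {{ ?item wdt:P106/wdt:P279* wd:Q639669 . }}  # occupation is (a subclass of) musician
  OPTIONAL {{ ?item schema:description ?description .
             FILTER(LANG(?description) = "{lang}") }}
  OPTIONAL {{ ?item wdt:P18 ?image . }}
  OPTIONAL {{ ?item wdt:P856 ?website . }}
}}
LIMIT 5
"""


def summarize(sparql_json):
    """Flatten SPARQL JSON results into plain dicts, skipping missing fields."""
    fields = ("item", "description", "image", "website")
    return [
        {key: binding[key]["value"] for key in fields if key in binding}
        for binding in sparql_json["results"]["bindings"]
    ]


def lookup(name, lang="en"):
    """Query Wikidata over HTTP (requires network access)."""
    params = urllib.parse.urlencode({"query": build_query(name, lang), "format": "json"})
    request = urllib.request.Request(
        SPARQL_ENDPOINT + "?" + params,
        # Hypothetical User-Agent; replace with one that identifies your own tool.
        headers={"User-Agent": "catalogue-wikidata-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return summarize(json.load(response))
```

Calling `lookup("Men Without Hats")` would return a list of candidate matches. Because the match is on the label alone, more than one item can come back for identical names, which is exactly the ambiguity that storing the Wikidata entity ID in the record would avoid.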

Stacy and I began with a high-level overview of Wikidata, noting that it is:

  • Like Wikipedia, but machine- and human-readable and writable
  • Focused on entities, with statements of fact about those entities backed up by references
  • Open for participation: no organizational barriers such as having to be an OCLC member to contribute to LCNAF
  • Open for use: all data is CC0 licensed (dedicated to the public domain), thus requiring no special acknowledgements, etc., on the part of the user of the data

As an example of how Wikidata supports Wikipedia, I highlighted how authority control used to be accomplished in Wikipedia articles via manually-coded lists of authority references for a given person, but now that job can be delegated to the Wikidata entity counterpart via the {{Authority control}} template, which dynamically generates an authority list, helping both humans & machines. The multilingual nature of the data means that those lists no longer need to be manually updated in every language variant! But of course there is still plenty of labour-saving development to be done: for example, the Infobox musical artist in Wikipedia is still maintained manually.
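The same idea can be reused outside of Wikipedia. As a rough sketch (not how the Wikipedia template itself is implemented), the following Python pulls a handful of authority identifiers out of a Wikidata entity in its standard JSON form. The function name `authority_ids` and the short list of properties are my own choices for illustration; the property IDs are VIAF (P214), ISNI (P213), Library of Congress authority ID (P244), and MusicBrainz artist ID (P434).

```python
# Wikidata properties that hold external authority identifiers.
# This selection is an assumption for illustration; many more exist.
AUTHORITY_PROPERTIES = {
    "P214": "VIAF",
    "P213": "ISNI",
    "P244": "Library of Congress",
    "P434": "MusicBrainz artist",
}


def authority_ids(entity):
    """Collect authority identifiers from one Wikidata entity (JSON as a dict)."""
    found = {}
    for prop, label in AUTHORITY_PROPERTIES.items():
        values = []
        for statement in entity.get("claims", {}).get(prop, []):
            snak = statement.get("mainsnak", {})
            # Skip "no value" / "unknown value" statements, which carry no datavalue.
            if snak.get("snaktype") == "value":
                values.append(snak["datavalue"]["value"])
        if values:
            found[label] = values
    return found
```

Given the `entities` entry for a single item, `authority_ids` returns a dictionary keyed by authority name, with only the authorities that actually have a value on that item.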

Stacy discussed some comparisons of the musical genres in Wikidata versus the Library of Congress vocabulary (in short: the quantity of genres is certainly there, but some work linking the vocabularies would be beneficial), and highlighted how we have been experimenting with structuring music festivals in Wikidata. Even with a recent example like the Northern Lights Festival Boréal 2015, we found that only 25% of the performers already had corresponding entities in Wikidata. That leaves a lot of room for us to improve the visibility of Canadian musicians during the Canada 150 Wikipedia edit-a-thons--and with Wikidata's notability policy that allows new entities to be added if they fulfill a structural need by making statements made in other items more useful, we believe this is a positive way forward.

In the hopes that others may find our presentation useful, Stacy and I offer our slides under the CC-BY-SA 4.0 license.