Wednesday, April 2. 2014
Last week I had the fantastic experience of returning to the code4lib conference for the first time since 2008, and as a speaker to boot.
The title of my talk was Structured data NOW: seeding schema.org in library systems. I had given two talks the prior week on a substantially similar subject (teaching Koha, Evergreen, and VuFind how to express schema.org structured data via RDFa), but the three conferences had very different audiences. I felt great about my talks at LibTechConf and the Evergreen International Conference, but those were one hour and 45 minutes long respectively. code4lib, on the other hand, schedules 20-minute slots; it is a veritable crucible for speakers. I remixed and rewrote my code4lib talk obsessively leading up to the conference, and ultimately ended up adding content to my overall message, which was obviously the wrong direction to take things... but before this audience of my peers, I felt an absolute need to explain why I had chosen to spend much of the past year and a half focused on RDFa and schema.org. That in turn forced me to cut a significant amount from the actual delivery, which meant that the audience didn't get the takeaway message that I wanted to deliver. One peer, in fact, described it as "a good refresher on microdata", which was almost exactly what I had wanted to avoid doing (microdata vs. RDFa aside) for this audience!
All caught up? Good! Now let's pretend that I had about ten more minutes; here's roughly what I wanted to impart:
Structured library information: Given that schema.org offers the Library type, and library systems often contain information such as the hours of operation, contact information, physical address, and branch relationships, we can teach our library systems to express that as structured data. And good news: Evergreen (as of the 2.6 release) will do exactly that! Remember all the way back to the start of the presentation, where I was pointing at various map services that had differing levels of knowledge about our libraries, and that often required different social media accounts to correct? Publishing your data in an openly accessible, standard format should make it possible for those map services (including OpenStreetMap) to do a better job of reflecting our presence in the world.
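To make that concrete, here is a minimal sketch of what expressing a library as schema.org structured data via RDFa can look like. It is not Evergreen's actual template code; the function name library_rdfa and the sample values are made up for illustration, and branch relationships are left out to keep it short. The schema.org names used (Library, PostalAddress, name, telephone, address, streetAddress, addressLocality, openingHours) are real terms from the vocabulary.

```python
from html import escape


def library_rdfa(name, telephone, street, locality, hours):
    """Render a schema.org Library as an RDFa Lite HTML fragment.

    Hypothetical helper for illustration only. `hours` is a list of
    strings in the schema.org openingHours style, e.g. "Mo-Fr 09:00-17:00".
    """
    # One openingHours property per entry in the list.
    hours_html = "\n".join(
        f'  <span property="openingHours">{escape(h)}</span>' for h in hours
    )
    return (
        '<div vocab="http://schema.org/" typeof="Library">\n'
        f'  <span property="name">{escape(name)}</span>\n'
        f'  <span property="telephone">{escape(telephone)}</span>\n'
        # The address is a nested item with its own type.
        '  <div property="address" typeof="PostalAddress">\n'
        f'    <span property="streetAddress">{escape(street)}</span>\n'
        f'    <span property="addressLocality">{escape(locality)}</span>\n'
        '  </div>\n'
        f'{hours_html}\n'
        '</div>'
    )


if __name__ == "__main__":
    # Sample values are invented; substitute your own branch details.
    print(library_rdfa(
        "Example Branch Library",
        "555-0100",
        "1 Main Street",
        "Exampleville",
        ["Mo-Fr 09:00-17:00", "Sa 10:00-14:00"],
    ))
```

A crawler that understands RDFa can read the name, phone number, address, and hours straight out of that fragment without any library-specific knowledge.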
Thought experiment: Now that we're publishing our holdings in a commonly understood Offer format, and linking those holdings to the library that holds them, and (in the case of Evergreen) providing information about those libraries, when can we stop batch uploading MARC at irregular intervals just to create union catalogs? In fact, wouldn't we be able to build ILL systems that can do a much better, more competitive job once we're making this information openly available on the web?
Sitemaps: Of course, to tell search engines and crawlers what pages are of interest and when they have been updated, you have to offer a sitemap. Fortunately, Koha, VuFind, and Evergreen (to a lesser extent) all support generation of sitemaps today.
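For anyone who has not looked inside one, a sitemap is just a small XML file that lists URLs and when they last changed. The following is a rough sketch using only the Python standard library, not the generator that Koha, VuFind, or Evergreen ships; the function name build_sitemap and the example URLs are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, last_modified) pairs.

    Hypothetical helper; `last_modified` is a W3C date string
    such as "2014-04-02".
    """
    # Serialize the sitemap namespace as the default (unprefixed) one.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Invented record URLs on a placeholder hostname.
    print(build_sitemap([
        ("https://catalogue.example.org/record/1", "2014-03-20"),
        ("https://catalogue.example.org/record/2", "2014-04-02"),
    ]))
```

The lastmod value is what lets a crawler skip pages that have not changed since its last visit.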
Quick union catalogues: As a proof of concept, I showed that we can build union catalogues on the backs of existing general search engines by creating a Google Custom Search Engine (CSE) that tied together the holdings of two different VuFind instances along with an Evergreen instance under a single search box. It is as ugly as sin, but it took me all of about ten minutes to cobble together; Google had already crawled all of the pages, so I just had to tell it what hostnames and URL patterns I cared about. The CSE even gives you some limited support for directly querying the underlying structured data. Later on, Sean Aery from Duke gave a lightning talk that showed off how they had taken exactly this approach to provide a search interface for their finding aids and digital collections, and made it beautiful!
Quick union catalogues: in progress: As a firm believer in the importance of decentralization, I pointed at a simple in-progress Python script that would crawl sitemaps and extract structured data from all of the indexed pages. My intention was to provide a complete indexed solution with a simple web frontend, but I got a bit bogged down in first updating the Fedora packages for several of the dependencies, then tackling some bugs in the upstream libraries themselves. More to be done here!
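The general shape of such a crawler can be sketched with nothing but the Python standard library. To be clear, this is not the in-progress script mentioned above: the names sitemap_urls, RDFaPropertyCollector, and extract_properties are invented for illustration, and the property collector is deliberately naive. It ignores vocab, prefixes, and typeof nesting, and it only captures text up to the first closing tag, so a real implementation would want a proper RDFa parser instead. Fetching is left out; the functions work on strings you have already downloaded (for example with urllib.request).

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return the list of page URLs named in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


class RDFaPropertyCollector(HTMLParser):
    """Naively collect (property, value) pairs from `property` attributes."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._open = None   # property whose text content we are reading
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        # Prefer an explicit value carried in an attribute.
        for key in ("content", "href", "src"):
            if attrs.get(key):
                self.pairs.append((prop, attrs[key]))
                return
        # Otherwise fall back to the element's text content.
        self._open = prop
        self._text = []

    def handle_data(self, data):
        if self._open is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open is not None:
            self.pairs.append((self._open, "".join(self._text).strip()))
            self._open = None


def extract_properties(html):
    """Return the (property, value) pairs found in one HTML page."""
    collector = RDFaPropertyCollector()
    collector.feed(html)
    collector.close()
    return collector.pairs
```

An indexer would loop over sitemap_urls(), fetch each page, and feed the pairs from extract_properties() into whatever search backend it uses.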
Hmm. Well, maybe I didn't miss conveying as much as I had feared. On the bright side, there was a great deal of interest in the SchemaBibEx "best practices and recommendations" documentation that I had promised we were working on... and today Richard Wallis described some of his work in this area. So that's a good thing. And even if some of the audience walked away from my talk with just an introduction to RDFa and schema.org, that put them in an extremely good position to be able to enjoy and understand Sean's subsequent lightning talk.
Oh, and my admission of being a semantic web dropout (due to the complexity of content negotiation and heterogeneous vocabularies and billions of triples and RDF/XML) ended up being a perfect setup for the immediately following talk Next Generation Catalogue - RDF as a Basis for New Services by Anne-Lena Westrum and Asgeir Rekkavik from the Oslo Public Library, who basically said "Semantic web? Oh yeah, we can totally do that!" and proceeded to show their MARC2RDF and RDF2MARC workflows. Very cool stuff (and delightful scheduling by the conference program committee!)
Saturday, March 22. 2014
Yesterday at the 2014 Evergreen International Conference I presented Structured library data: holdings, libraries, and beyond, a talk about the work I've done specifically with Evergreen, drawing some connections to the capabilities of Koha and VuFind. Lots of attendees seemed happy with the talk and the direction that we're going with Evergreen, and have hope for the future relevance of our libraries' resources within normal search engines, as well as all of the possibilities opened up by exposing this open data about our libraries (locations, hours, branch relationships, contact information) and their resources in a much more consumable form.
There was so much energy in the room, I could have talked for another hour... I love the Evergreen community!
Thursday, March 20. 2014
Two things of note:
It has been fun and invigorating to hear the responses of those who are seeing the results and direction of this work for the first time! More thoughts to come...
Friday, February 21. 2014
Over at the Metadata Matters blog, Diane Hillmann wrote Why Are We Waiting for the ILS to Change?, asking (in the context of the difficulties libraries experience in making their systems work with RDA):
What I saw underlying that conversation was the assumption that the only way change could happen was if the ILS’s themselves changed; in other words if the ILS vendors decided to lead rather than follow. The situation now is that system vendors say they’ll build RDA compliant systems when their customers ask for them, and libraries say that they’ll use ‘real’ RDA when there are systems that can support it. This is a dance of death, and nobody wins.
I took this as a jumping-off point to discuss the state of linked data support in library systems and discovery software and posted the following comment (currently awaiting moderation):
Who's waiting? Sweden's LIBRIS took essentially the approach you suggested back in 2007, and Bibliothèque Nationale de France and Deutsche Nationalbibliothek have also followed similar paths.
Jumping from RDA to linked data might be a bit of a stretch, but the lack of movement by proprietary vendors in particular hit a sore point that I developed during some of our early W3C Schema.org Bibliographic Extension Community Group discussions. I had asked if anyone else was trying to actually implement what we were discussing. A response from one of the proprietary software representatives was "No, we're waiting to see what develops..." -- which is exactly the attitude that leads to the "dance of death" that Diane described. It can also lead to decisions that are suboptimal, ambiguous, or unimplementable because nobody actually tried to put theory into practice.
Thankfully, a small investment of effort into modifying open source systems to serve as reference implementations can provide a significant amount of insight into flaws or possibilities with otherwise theoretical directions, as well as delivering practical benefits to everyone who uses that software if those modifications are accepted by the parent projects. Here's hoping that the more agile options like Koha, Evergreen, VuFind, and Blacklight continue to push the evolution of their proprietary competitors.
This work is licensed under a Creative Commons Attribution-Share Alike 2.5 Canada License.