Entries tagged as coding
Monday, February 3. 2014
Back in August, I mentioned that I taught Evergreen, Koha, and VuFind how to express library holdings in schema.org via the
The language for some of the terminology may seem a little overly commercial right now, but the next iteration of the schema.org standard will adopt language that more broadly supports non-commercial activities... and this broadening of a number of schema.org definitions is also an outcome of the Schema BibEx community efforts. I'm pretty happy with the results of the group over the last six months! Hopefully this sheds some long-overdue light on some of the results of our efforts, and helps other systems adopt our group's recommended practices for exposing metadata via schema.org.
Tomorrow, I'll be participating in Laurentian University's Research Week lightning talks. Unlike most five-minute lightning talk events in which I've participated, the time limit for each talk will be one minute. Imagine 60 different researchers getting up to summarize their research in one minute each, and you have what is likely to be a brain-melting hour. Should be fun!
Here's a rough draft of what I'm planning to say (which, when read at an even cadence with decent intonation, comes out to exactly one minute):
What would you understand if you read the _entire_ world wide web?
Thursday, January 30. 2014
Tuesday was not the greatest day, but at least each setback resulted in a triumph...
First, the periodical proposal for schema.org--that I have poured a good couple of months of effort into--took a step closer to reality when Dan Brickley announced on the public-vocabs list that he had created a test build that incorporated the RDFS that I had written up. Excitement rapidly turned to horror, though, as I realized that I had made a classic copy/paste error, in which I had changed the displayed name of the
Luckily, after I fixed the RDFS, Dan was able to put together a revised test build later that day that actually reflected our intentions. So that can continue moving forward...
Second, our Evergreen instance started acting up rather badly. All of the connections to the database server were being gobbled up, and we were scrambling to figure out why. While I'm on sabbatical I'm not really supposed to be involved in the day-to-day operations, but when a core service stops running it's okay for research to wait for a little bit... Eventually I tracked down a fix for a potential denial of service problem (Search result rendering can crush the system) that hadn't been merged into our production system (the fix came out after the start of my sabbatical), and shortly after I put that into production we were back up and running.
Third, after the Evergreen problem was resolved, Bill Dueber pinged me innocently on IRC. He had run into a problem with File_MARC; when serializing MARC as MARC-in-JSON format, fields with a subfield
Friday, January 24. 2014
The following is an email that I sent to the MARC mailing list on January 24, 2014 that might be of interest to those looking to provide better support for linked data in MARC (hopefully as just a transitional step):
In the spirit of making it possible to express linked data in MARC for any data field, would it be worthwhile exploring the possibility of defining subfield $0 as valid for all data fields, and then relaxing the definition such that in the absence of a specific MARC Organization Code or Source Identifier code, it would be understood to be the default of Source Identifier "(uri)" (that is, a URI)?
Right now the mechanism for fields that can be controlled by authority records would be to either figure out the mapping between the MARC Organization code or Source Identifier code and some URI (if the subfield 0 directly identifies the source of the authority record or source identifier), or (in many systems) look up the local authority record that controls the field, then look up the source for the authority record (again having to use localized logic for the MARC Organization code / Source Identifier code).
The current limitations are that:
The alternative that I'm proposing--to allow $0 on any field, and to assume a default Source Identifier of "(uri)" in the absence of any explicit identifier--would enable systems to provide links for entities that are currently uncontrolled. For example, field 264 (producers/publishers/distributors/manufacturers) is currently not a controllable field. If the proposal were accepted, however, when systems generate record detail pages, they could include structured data that identifies the producer/publisher/distributor/manufacturer.
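As a rough sketch of how a system might apply that default, the following Python splits a $0 value into a source and an identifier, and falls back to "uri" when no parenthesized prefix is present. The function name `parse_subfield_0` and the tuple it returns are assumptions made for illustration; they are not part of MARC or of any particular library.

```python
import re

# A $0 value may begin with a parenthesized MARC Organization Code or
# Source Identifier code, e.g. "(DLC)n 79021164" or "(uri)http://...".
SOURCE_PREFIX = re.compile(r"^\((?P<source>[^)]+)\)\s*(?P<identifier>.*)$")


def parse_subfield_0(value):
    """Return (source, identifier) for a $0 value.

    Hypothetical helper: when no explicit prefix is present, assume the
    proposed default source of "uri".
    """
    match = SOURCE_PREFIX.match(value.strip())
    if match:
        return match.group("source"), match.group("identifier")
    return "uri", value.strip()


explicit = parse_subfield_0("(DLC)n 79021164")
defaulted = parse_subfield_0("http://dbpedia.org/resource/Elsevier")
```

Under this reading, a bare URI in $0 needs no lookup table of organization codes before it can be published as a link.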
I will certainly acknowledge that it's not a perfect proposal as-is, as for a 264 you would most likely want to provide a link for subfield $b and a separate link for subfield $a, whereas for many other fields you're providing a link for the entire combination of subfields -- but it would be a step forward from where we are now.
An extension, then, would be to provide some optional means of defining for which subfield(s) the $0 provides a controlling link; for example, using square-brackets and 1-indexed positional integers you could do something slightly horrible like:
264 #1 $a Cambridge : $b Elsevier, $c 2013 $0 http://dbpedia.org/resource/Cambridge,_Massachusetts $0 http://dbpedia.org/resource/Elsevier
The advantage here is that you can maintain the existing punctuation but have tightly defined linked entities that you can then express when you publish information about this record elsewhere--and you have a ready handle for pulling in more information about any of the linked resources within your MARC-based systems--without having to subsequently do string clean-up and entity matching, etc. And this gives us, perhaps, a way forward from MARC to something else that is more focused on linked data.
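To make the positional idea concrete, here is a minimal Python sketch. The 264 example above does not show exactly where the bracketed integer would sit, so this sketch assumes a prefix form such as "[2] http://..." at the start of each $0 value, counting only the non-$0 subfields from 1. That syntax, the function name `link_subfields`, and the list-of-tuples field representation are all assumptions for illustration only.

```python
import re

# Assumed syntax: "[N] <uri>", where N is the 1-indexed position of the
# data subfield (ignoring $0 subfields) that the URI controls.
POSITION_PREFIX = re.compile(r"^\[(\d+)\]\s*(.+)$")


def link_subfields(subfields):
    """Pair each data subfield with the URI from a matching $0, if any."""
    data = [(code, value) for code, value in subfields if code != "0"]
    links = {}
    for code, value in subfields:
        if code != "0":
            continue
        match = POSITION_PREFIX.match(value)
        if match:
            links[int(match.group(1))] = match.group(2)
    return [
        (code, value, links.get(position))
        for position, (code, value) in enumerate(data, start=1)
    ]


field_264 = [
    ("a", "Cambridge :"),
    ("b", "Elsevier,"),
    ("c", "2013"),
    ("0", "[1] http://dbpedia.org/resource/Cambridge,_Massachusetts"),
    ("0", "[2] http://dbpedia.org/resource/Elsevier"),
]
linked = link_subfields(field_264)
```

The transcribed strings keep their punctuation untouched, while each linked subfield carries its own URI and unlinked subfields such as $c carry none.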
Note: I want to thank Karen Coyle for first getting me to think about this problem with her blog post Linked Data First Steps & Catch-21
Thanks for any consideration that might go into this informal proposal,
This work is licensed under a Creative Commons Attribution-Share Alike 2.5 Canada License.