Authorities in Evergreen: an Amsterdam trip report
Posted on Mon 19 July 2010 in Libraries
As part of the informal partnership between the International Institute of Social History (IISH) and Project Conifer, I was pleased to be able to spend the last two weeks in Amsterdam, working side-by-side with one of the Institute's developers, Ole Kerpel, on augmenting the support for MARC21 authorities in Evergreen. To prepare for the work session, I had posted a blueprint for the authorities work on the Evergreen Launchpad instance and circulated the list of requirements we had been asked to address to the broader Evergreen development community. We were fortunate to have the attention of Mike Rylander on the proposal, who not only supplied suggestions for how to implement some of the items, but also committed significant code contributions that greatly assisted our efforts. Here is a summary of the goals we accomplished in the current development branch of Evergreen (targeted for the 2.0 release), followed by a list of the outstanding items and my finger-in-the-air estimate of how much more time it would take to accomplish each of the tasks:
Accomplishments
Controllable control numbers
While not, strictly speaking, a requirement for authority control in and of itself, the ability to ensure that the behaviour of the 001/003/035 fields all conformed to the MARC21 specifications was an important requirement for IISH. They plan to provide external access to their authority and bibliographic records, so making the official identifier fields linkable based on the underlying record ID was an important aspect of the work. We implemented this feature as an optional database-level trigger to ensure that the control numbers and control number identifiers are always perfectly in sync with the internal identifier of the particular system on which the records are stored.
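To give a rough idea of what keeping these fields "in sync" means, here is a minimal Python sketch. It is not the trigger itself, which lives in the database; the Record class, the sync_control_numbers function, and the "EXAMPLE" organization code are all made up for illustration, and I am assuming the usual MARC21 practice of preserving a previous identifier in an 035 as "(003)001".

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A drastically simplified MARC record (hypothetical, for illustration only)."""
    record_id: int                                        # internal database identifier
    control: dict = field(default_factory=dict)           # e.g. {"001": "...", "003": "..."}
    system_numbers: list = field(default_factory=list)    # values of 035 subfield a


def sync_control_numbers(record: Record, org_code: str) -> Record:
    """Make 001/003 reflect the local system, keeping any foreign pair in an 035."""
    old_001 = record.control.get("001")
    old_003 = record.control.get("003")
    new_001 = str(record.record_id)

    # Assumption: an incoming identifier is preserved as "(003)001" unless it
    # already matches the local values or has already been recorded.
    if old_001 and (old_001, old_003) != (new_001, org_code):
        previous = f"({old_003}){old_001}" if old_003 else old_001
        if previous not in record.system_numbers:
            record.system_numbers.append(previous)

    record.control["001"] = new_001
    record.control["003"] = org_code
    return record
```

Running the function a second time on the same record changes nothing, which is the property you want from something that fires on every insert or update.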
Links
Where having Mike Rylander participate in your review process pays off, part one... Before I even arrived in Amsterdam, Mike implemented a tricky database trigger that tracks the links between a given bibliographic record and the authority records to which it links. The links are tracked at the database level, as well as directly in one or more 0 subfields in each field that is controlled by an authority record. Yes, a given field in a bibliographic record can be controlled by two authority records and it all works. Nice, Mike!
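For the curious, the idea behind the link tracking can be sketched in a few lines of Python. This is only an illustration, not Mike's trigger: the extract_links function and the field representation are hypothetical, and I am assuming that each 0 subfield holds a value of the form "(ORG)authority_id".

```python
import re

# Assumption for this sketch: subfield 0 looks like "(EXAMPLE)1234".
LINK_PATTERN = re.compile(r"^\((?P<org>[^)]+)\)(?P<auth_id>\d+)$")


def extract_links(bib_id, fields):
    """Return (bib_id, tag, authority_id) tuples for every subfield 0 in a record.

    `fields` is a list of (tag, [(subfield_code, value), ...]) pairs.
    """
    links = []
    for tag, subfields in fields:
        for code, value in subfields:
            if code != "0":
                continue
            match = LINK_PATTERN.match(value)
            if match:
                links.append((bib_id, tag, int(match.group("auth_id"))))
    return links
```

A field with two 0 subfields simply yields two tuples, which is how a single field can be controlled by two authority records.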
Syncs
Where having Mike Rylander participate in your review process pays off, part two... Mike also implemented the bulk of the logic for automatically updating bibliographic records that are linked to a given authority record when that authority record is modified. Yes, folks, when you add a death date to an authority record, it will automatically appear in the corresponding bib records.
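Here is a minimal sketch of that propagation in Python, under heavy simplifying assumptions: the propagate_heading function and the in-memory data structures are hypothetical, the real logic lives in the database, and I am assuming that a bib field takes over every subfield present in the authority heading while keeping its other subfields (including the 0 subfields) untouched.

```python
def propagate_heading(authority_id, heading, bibs, links):
    """Rewrite linked bib fields with the authority's current heading subfields.

    heading: [(code, value), ...] taken from the modified authority record
    bibs:    {bib_id: [(tag, [(code, value), ...]), ...]} (mutated in place)
    links:   iterable of (bib_id, tag, authority_id) tuples
    Returns the number of fields rewritten.
    """
    marker = f"){authority_id}"          # matches "(ORG)<authority_id>" in subfield 0
    heading_codes = {code for code, _ in heading}
    linked_bibs = {bib_id for bib_id, _tag, auth in links if auth == authority_id}

    updated = 0
    for bib_id in sorted(linked_bibs):
        fields = bibs[bib_id]
        for i, (tag, subfields) in enumerate(fields):
            controlled = any(c == "0" and v.endswith(marker) for c, v in subfields)
            if not controlled:
                continue
            # Keep subfields the heading does not define, such as subfield 0.
            kept = [(c, v) for c, v in subfields if c not in heading_codes]
            fields[i] = (tag, list(heading) + kept)
            updated += 1
    return updated
```

In the death date scenario, the authority heading gains a completed d subfield and every bib field carrying the matching 0 subfield picks it up.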
Control an uncontrolled set of bibliographic records
You may have dealt with library systems in the past that use some sort of string matching to implement authority support. As noted above, Evergreen is not like that. However, this means that many of us, when migrating to Evergreen, have bibliographic records lacking the 0 subfields that are required for full authority support. Towards that end, I wrote a script that will walk through a set of bibliographic records, search for matching authority records for each controllable field in each bibliographic record, and add the required 0 subfields to the bibliographic records. It certainly won't be a fast solution, but you should only need to do it once, and it worked on the limited test cases that we had ready at hand.
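The core loop of such a script looks roughly like the following Python sketch. To be clear, this is not the script I wrote: the CONTROLLED mapping covers only three tags, the heading_key normalization is a naive stand-in for real matching rules, and link_bib, authority_index, and the "EXAMPLE" organization code are all hypothetical names.

```python
# Illustrative subset: bibliographic tag -> authority tag that controls it.
CONTROLLED = {"100": "100", "700": "100", "650": "150"}


def heading_key(subfields):
    """Normalize the heading subfields (everything except 0) for exact comparison."""
    return tuple((c, v.strip().lower()) for c, v in subfields if c != "0")


def link_bib(fields, authority_index, org_code):
    """Add a subfield 0 to each controllable, not-yet-linked field that has a match.

    fields:          [(tag, [(code, value), ...]), ...] (mutated in place)
    authority_index: {(authority_tag, heading_key): authority_id}
    Returns the number of links added.
    """
    added = 0
    for tag, subfields in fields:
        auth_tag = CONTROLLED.get(tag)
        # Skip uncontrollable fields and fields that are already linked.
        if auth_tag is None or any(c == "0" for c, _ in subfields):
            continue
        auth_id = authority_index.get((auth_tag, heading_key(subfields)))
        if auth_id is not None:
            subfields.append(("0", f"({org_code}){auth_id}"))
            added += 1
    return added
```

Because already-linked fields are skipped, running it twice over the same records is harmless, which matters when a long batch job gets interrupted halfway through.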
Teach the MARC editor about authority records
The MARC editor knew all about fixed fields for bibliographic records, and provided a handy grid for editing those fields. However, it didn't even know how to recognize authority records, and presented a fixed field grid that was absolutely meaningless. I spent a chunk of time laboriously transcribing the fixed field rules from MARC documentation into the MARC editor and now the MARC editor presents a reasonable fixed field grid for your editing convenience.
Merge authority records
Something that often happens in a library is that two authority records are created that identify the same thing. Eventually somebody notices the problem and wants to merge the authority records together. Towards this end, I added a database-level stored procedure that supports the merging of authority records, such that the linked bibliographic records will automatically point to the winning authority record.
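Conceptually, the merge boils down to repointing links, as in this minimal Python sketch. The merge_authorities function and its data structures are hypothetical, the real work happens in a stored procedure, and dropping the losing records afterwards is my assumption for the sketch rather than a description of the procedure.

```python
def merge_authorities(winner_id, loser_ids, bib_links, authorities):
    """Repoint links from the losing authority records to the winner.

    bib_links:   list of [bib_id, tag, authority_id] entries (mutated in place)
    authorities: {authority_id: heading} (mutated in place)
    Returns the number of links that were repointed.
    """
    losers = set(loser_ids) - {winner_id}   # never merge the winner into itself
    moved = 0
    for link in bib_links:
        if link[2] in losers:
            link[2] = winner_id
            moved += 1
    # Assumption: the losing records are removed once nothing points to them.
    for loser in losers:
        authorities.pop(loser, None)
    return moved
```

In a full implementation the 0 subfields and headings in the affected bibliographic records would also need to be rewritten to match the winner; that step is left out here.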
Authority browse interfaces
Where having Mike Rylander participate in your review process pays off, part the third... Mike also implemented basic browse interfaces that present a series of authority records in MARCXML format matching your requested authority type (author, title, subject, topic) and the matching substring at the /opac/extras/browse and /opac/extras/startwith URL entry points. While still raw at this point, these can provide the basis for classic authority browse interfaces for those who desperately desire them.
Remaining to-do items
Note that any estimates are based on how long I think it would take me to implement, based on my own familiarity with MARC and Evergreen and all things Perl and JavaScript and PostgreSQL, and provided with the granularity of no less than one day. Actual implementation times may vary, of course; if related work items are worked on consecutively, then it is likely to take less time to achieve than if the items are tackled sporadically.
Add an authority in the flow
When you're working in the MARC Editor and you find that there is no match for an entry that you really think should be controlled, IISH wants to make it easy for a cataloguer to add an authority record for that entry. We thought that there might be two options that we would want to expose - a direct "create an authority record from this field" option that takes no further input, and a "create an authority record from this field and open it in another MARC editor to let me tweak it" option. Estimate: 2 person days
Highlight controlled fields
This is really a two-part problem. First, for uncontrolled fields, we want to teach the Validate button to offer the kind of automatic matching that the script does and add the required 0 subfield. Second, we want to highlight fields that are explicitly controlled by authority records through a 0 subfield differently from fields that simply match an authority record, but which are not controlled by it. Estimate: 1 person day
Simplify authority record selection
This is another two-part requirement. First, we want to mask many of the fields that are currently offered as options when you right-click on an uncontrolled subfield to display matching authority records. For example, it is a little weird to offer a "See from" heading to a cataloguer; we're trying to avoid adding new records with those headings, right? Heh. Second, we want to introduce the ability to invoke the authority browse list in this interface so that the cataloguer can see a given set of headings in context and select the heading to apply from there. Estimate: 2 person days
Delete authority record
There is currently no cataloguer-friendly way to delete authority records. We need to expose a list of authority records (probably reusing that browse list again) and make it possible for cataloguers to delete an authority record. When that record is deleted, all bibliographic records that link to it need to have their links removed - and ideally, the cataloguer would be able to tell how many bibliographic records link to that authority before the delete takes place. Estimate: 1 person day
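The behaviour we are after can be sketched in Python as follows. Nothing like this exists yet; count_linked_bibs, delete_authority, and the data structures are hypothetical, and the sketch only removes the tracked links, not the 0 subfields inside the bibliographic records themselves.

```python
def count_linked_bibs(authority_id, links):
    """Number of distinct bibliographic records that link to the authority."""
    return len({bib_id for bib_id, _tag, linked in links if linked == authority_id})


def delete_authority(authority_id, authorities, links):
    """Remove the authority record and every link that points to it.

    authorities: {authority_id: heading} (mutated in place)
    links:       list of (bib_id, tag, authority_id) tuples (mutated in place)
    Returns the number of links removed.
    """
    remaining = [link for link in links if link[2] != authority_id]
    removed = len(links) - len(remaining)
    links[:] = remaining
    authorities.pop(authority_id, None)
    return removed
```

The interface would call the counting function first, show the cataloguer the number, and only then go ahead with the delete.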
Edit and merge authority records
Although the database-level support now exists for merging authority records, we need to expose a means for cataloguers to select the authority records that they want to edit or merge. This could just be a slightly evolved version of the "Delete" interface. Estimate: 1 person day
Expose authority records via SRU/Z39.50/crawlable interface
One of the goals of the IISH is to be able to share their authority records with other institutions. One of the standard methods is SRU + Z39.50 server support; we should be able to build on the SRU/Z39.50 server support for bibliographic records in Evergreen to provide a basic solution for authority records. Interest has also been expressed in having a crawlable implementation that would give the linked data crowd something to play with. Estimate: 2 person days for an SRU/Z39.50 server, 1 person day for a very basic crawlable linked-data implementation
In summary - hurray for Mike Rylander for helping us out to such an extent, and many thanks, again, to IISH for giving me an opportunity to focus on Evergreen development for an extended period of time, and to Laurentian University for supporting my efforts. I hope that between Ole and myself it will be possible to finish the rest of these work items prior to the Evergreen 2.0 release. It has been exhilarating to see how far Evergreen's authority support has come in less than a month, and given a little more time I suspect that Evergreen's authority support will be the envy of other library systems.