UPDATE 2010-03-10: See "More granular identifier indexes for your Evergreen SRU / Z39.50 servers" for some recommended enhancements to the target parser and Evergreen's identifier index capabilities.
Laurentian University is part of the Ontario Council of University Libraries (OCUL) and uses the centrally hosted Ontario Scholars Portal SFX link resolver, so one of the things we needed when we migrated to Evergreen was a target parser for our link resolver. This is the target associated with "Search the library catalogue", the last resort offered when the resolver fails to turn up any full-text resources for a given OpenURL. Hopefully it won't need to be invoked too often, as we have a very rich set of full-text electronic resources at Laurentian University.
The code
Here is a quick implementation of a target parser that generates search URLs based on ISSN, ISBN, journal title, or book title (checked in that order). It is pretty impoverished from an OpenURL perspective, but it maintains the same level of functionality as our previous system. In TargetParser/Evergreen/Conifer.pm I created a target parser called Evergreen::Conifer that implements a subset of the Parsers::TargetParser API for SFX as follows:
package Parsers::TargetParser::Evergreen::Conifer;

use Parsers::TargetParser;
use base qw(Parsers::TargetParser);
use strict;

sub getHolding {
    my ($this,$genRequestObj) = @_;

    my $objectType = $genRequestObj->{'objectType'};
    my $ISBN = $genRequestObj->{'ISBN'};
    my $eISBN = $genRequestObj->{'eISBN'};
    my $ISSN = $genRequestObj->{'ISSN'};
    my $eISSN = $genRequestObj->{'eISSN'};
    my $CODEN = $genRequestObj->{'CODEN'};
    my $bookTitle = $genRequestObj->{'bookTitle'};
    my $journalTitle = $genRequestObj->{'journalTitle'};

    # Canonical search results URL for simple searches:
    # http://laurentian.concat.ca/opac/en-CA/skin/lul/xml/rresult.xml?rt=keyword&tp=keyword&t=0895-2779&l=105&d=2&f=&av=

    my $svc = $this->{svc};
    my $egHost = $svc->parse_param('eg_host');
    my $egLocale = $svc->parse_param('eg_locale');
    my $egSkin = $svc->parse_param('eg_skin');
    my $egOrgUnit = $svc->parse_param('eg_org_unit');
    my $egDepth = $svc->parse_param('eg_depth');

    my $path = "http://${egHost}/opac/${egLocale}/skin/${egSkin}/xml/rresult.xml?l=${egOrgUnit}&d=${egDepth}";
    my $searchString = '&rt=keyword&tp=keyword&t=';

    if (defined($ISSN)) {
        if ($ISSN =~ m/x/i) {
            # Current indexer doesn't deal well with ISSNs containing an X, so break it up
            $ISSN =~ s/^(\d{4})-?(\d+)x/$1 -$2 x/i;
            $searchString .= $ISSN;
        } else {
            $searchString .= "\"$ISSN\""; # format 9999-9999 for MARC
        }
    }
    elsif (defined($ISBN)) {
        # Evergreen doesn't force ISBNs to be stripped of hyphens, so take whatever
        $searchString .= "\"$ISBN\"";
    }
    elsif (defined($journalTitle)) {
        # Restrict searches to title index, with bibliographic level = s
        $searchString .= "ti:${journalTitle}&bl=s";
    }
    elsif (defined($bookTitle)) {
        # Restrict searches to title index, with bibliographic level = m
        $searchString .= "ti:${bookTitle}&bl=m";
    }

    return ($path . $searchString);
}

1;
And here's the help that I added to the corresponding Conifer.hlp file:
General Information
Target - LOCAL_CATALOGUE_EVERGREEN_CONIFER
Service - getHolding
Parser - Evergreen::Conifer
Information needed in the Target Service:
In the PARSE_PARAM field, replace the following information:
eg_host = $$LOCAL_CATALOGUE_SERVER
eg_locale = Locale (en-US, en-CA, fr-CA, etc)
eg_skin = algoma, default, lul, nohin, uwin
eg_org_unit = 103, 1, etc
eg_depth = 0, 1, 2, 3, etc
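To make the role of each PARSE_PARAM value concrete, here is a minimal standalone sketch of how the five settings end up in the base search URL. The %parse_param hash and the base_search_url function are hypothetical stand-ins for what SFX supplies through $svc->parse_param(); the values are taken from the canonical URL quoted in the parser's comments.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the values SFX would hand back from
# $svc->parse_param(...); taken from the canonical URL in the parser.
my %parse_param = (
    eg_host     => 'laurentian.concat.ca',
    eg_locale   => 'en-CA',
    eg_skin     => 'lul',
    eg_org_unit => 105,
    eg_depth    => 2,
);

# Same interpolation the target parser performs to build $path
sub base_search_url {
    my (%p) = @_;
    return "http://$p{eg_host}/opac/$p{eg_locale}/skin/$p{eg_skin}"
         . "/xml/rresult.xml?l=$p{eg_org_unit}&d=$p{eg_depth}";
}

print base_search_url(%parse_param), "\n";
```

The search terms (rt, tp, t, and optionally bl) are then appended to this base URL by the parser.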
Findings and wishlists
While it's quite easy to set up Evergreen as a searchable resource, thanks to its straightforward URL syntax, one thing that leaps out at me is that Evergreen, by default, has no identifier index for limiting searches by ISBN / ISSN / LCCN / OCLCnum. Ideally, we would have such an index and disable full-text indexing on it, so that we could search accurately for ISSNs that include an "x". Right now we have to split ISSNs with an "x" into constituent parts and generate searches on those parts, which results in false hits from across the database. An identifier index would also be useful for limiting Z39.50 searches.
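To show what the ISSN workaround actually sends to the catalogue, here is a small sketch that isolates the ISSN branch of the parser in a standalone function. The name issn_search_term is hypothetical; the substitution is copied unchanged from the parser, and the sample ISSNs are ones that appear elsewhere in this post.

```perl
use strict;
use warnings;

# Isolated copy of the parser's ISSN handling (function name is hypothetical)
sub issn_search_term {
    my ($issn) = @_;
    if ($issn =~ m/x/i) {
        # Break an ISSN ending in X into separate pieces
        $issn =~ s/^(\d{4})-?(\d+)x/$1 -$2 x/i;
        return $issn;
    }
    # Otherwise search for the ISSN as a quoted phrase
    return "\"$issn\"";
}

for my $issn ('0895-2779', '0148-401x', '0364023X') {
    printf "%-10s => %s\n", $issn, issn_search_term($issn);
}
```

An ISSN without an "x" stays a single quoted phrase, while 0148-401x becomes the three pieces "0148", "-401", and "x", which is where the false hits come from.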
I would also like to teach Evergreen about ISBN-10/ISBN-13 equivalence, to broaden the search while maintaining precision. And I would like to automatically normalize ISSN and ISBN formats so that I don't have to worry about whether a cataloguer entered hyphens or not - and the same for incoming search terms.
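As a rough illustration of what that normalization could look like, here is a sketch that strips hyphens and converts an ISBN-10 to its ISBN-13 equivalent using the standard "978" prefix and modulo-10 check digit. The function names normalize_isbn and isbn13 are hypothetical and are not part of Evergreen; the sketch also assumes the incoming ISBN-10 is valid and does not verify its check digit.

```perl
use strict;
use warnings;

# Remove hyphens, spaces, and anything else that is not a digit or X
sub normalize_isbn {
    my ($isbn) = @_;
    (my $clean = uc $isbn) =~ s/[^0-9X]//g;
    return $clean;
}

# Return the 13-digit form of an ISBN, or undef if it is not 10 or 13 long
sub isbn13 {
    my ($isbn) = @_;
    my $clean = normalize_isbn($isbn);
    return $clean if $clean =~ /^\d{13}$/;
    return unless $clean =~ /^(\d{9})[\dX]$/;

    # Prefix 978 to the first nine digits, then recompute the check digit
    my @digits = split //, "978$1";
    my $sum = 0;
    for my $i (0 .. $#digits) {
        $sum += $digits[$i] * ($i % 2 ? 3 : 1);   # weights alternate 1, 3
    }
    return join('', @digits) . ((10 - $sum % 10) % 10);
}

print isbn13('0-439-13635-0'), "\n";
```

Indexing and searching on the 13-digit form would let either representation of the same ISBN match the same record.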
Finally, to support services like xISBN that search for multiple formats and editions of a given work by generating a shotgun blast of ISBNs for all known representations, I would love to teach Evergreen how to accept a list of identifiers as search input.
Don't ask me when these things will happen, though; if it requires work from me, it will probably be 2010 before any of it happens.
Every bibliographic store should come out of the box indexing these, and doing normalization of the ones which require it -- which are actually all of them. ISBN and ISSN need to be normalized with regard to hyphens. ISBN needs to be normalized to ISBN-13 in the index. LCCN needs to be normalized according to LCCN normalization rules. And OCLC number needs to be normalized to ignore any 'ocm' 'ocn' '(OCoLC)' type prefixes which may be there but are essentially meaningless.
I'm probably going to have to add some of that to Blacklight too.
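For the two identifiers not covered so far, here is a hedged sketch of what such normalizers might look like. The function names are hypothetical. The LCCN routine follows the Library of Congress normalization steps as I understand them (remove blanks, drop everything from the first slash, remove the hyphen and zero-pad the serial to six digits). Stripping leading zeros from OCLC numbers is an extra assumption on my part, not something stated above.

```perl
use strict;
use warnings;

# Strip '(OCoLC)', 'ocm', and 'ocn' prefixes from an OCLC number
sub normalize_oclc {
    my ($num) = @_;
    $num =~ s/^\s+|\s+$//g;        # trim surrounding whitespace
    $num =~ s/^\(OCoLC\)\s*//i;    # e.g. "(OCoLC)12345"
    $num =~ s/^oc[mn]\s*//i;       # e.g. "ocm12345", "ocn123456789"
    $num =~ s/^0+(?=\d)//;         # assumption: leading zeros carry no meaning
    return $num;
}

# Normalize an LCCN: no blanks, no suffix after '/', serial padded to 6 digits
sub normalize_lccn {
    my ($lccn) = @_;
    $lccn =~ s/\s+//g;
    $lccn =~ s{/.*$}{};
    if ($lccn =~ /^([^-]*)-(\d{1,6})$/) {
        $lccn = $1 . sprintf('%06d', $2);
    }
    return $lccn;
}

print normalize_oclc('(OCoLC)ocm00012345'), "\n";
print normalize_lccn('85-2'), "\n";
```

Applying the same routine to both the indexed values and the incoming search term is what makes the match reliable.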
http://dev.gapines.org/opac/extras/unapi?id=tag:open-ils.org,2008:isbn/0439136350&format=marcxml
http://dev.gapines.org/opac/extras/unapi?id=tag:open-ils.org,2008:isbn/0148-401x&format=marcxml
The "isbn" record type is a (mis-labeled) ISxN index, searching 020a, 024a (for transitional ISBN-13s) and 022a, and it's fine with x's (and -'s, which are a bigger problem for FTS). Change the format to "opac" and it will redirect you to the first record in the result set.
And then there's supercat:
http://dev.gapines.org/opac/extras/supercat/retrieve/mods/isbn/0439136350
http://dev.gapines.org/opac/extras/supercat/retrieve/mods/isbn/0364-023x
This uses the (mis-labeled) ISxN lookup as well. It does not do OPAC redirection, though. For that you'll have to stick to unAPI.
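The two URL patterns above could be generated from a target parser along these lines. This is only a sketch: the function names are hypothetical, the path layout is copied from the example URLs in this comment, and the identifier is passed through as-is with no normalization.

```perl
use strict;
use warnings;

# unAPI lookup by ISxN; format may be 'marcxml', or 'opac' for a redirect
sub unapi_isxn_url {
    my ($host, $isxn, $format) = @_;
    return "http://$host/opac/extras/unapi"
         . "?id=tag:open-ils.org,2008:isbn/$isxn&format=$format";
}

# supercat lookup by ISxN; returns the record in the given format (e.g. 'mods')
sub supercat_isxn_url {
    my ($host, $isxn, $format) = @_;
    return "http://$host/opac/extras/supercat/retrieve/$format/isbn/$isxn";
}

print unapi_isxn_url('dev.gapines.org', '0439136350', 'opac'), "\n";
print supercat_isxn_url('dev.gapines.org', '0364-023x', 'mods'), "\n";
```

For a link resolver target, the unAPI form with format=opac is the one that lands the user on a catalogue page.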
Normalization isn't there yet, and to do that we should really just store configured identifier fields (also configured with sanitizers) and point at that. For now, it's using the raw MARC, so cataloging mistakes will cause problems.
I guess the big thing that this points at is that there is a ton of handy stuff hidden in Evergreen, waiting to be discovered and/or documented. If only one had the tuits...
--miker
And amen to the need for discovery / documentation / tuits...