SFX target parser for Evergreen and some thoughts about searching identifiers

Posted on Mon 29 June 2009 in Libraries

UPDATE 2010-03-10: See "More granular identifier indexes for your Evergreen SRU / Z39.50 servers" for some recommended enhancements to the target parser and Evergreen's identifier index capabilities.

Laurentian University is part of the Ontario Council of University Libraries (OCUL) and uses the centrally hosted Ontario Scholars Portal SFX link resolver, so one of the things we needed when we migrated to Evergreen was a target parser for our link resolver. This is the target associated with the "Search the library catalogue" link, the last resort when the resolver fails to turn up any full-text resources for a given OpenURL. Hopefully it won't need to be invoked too often, as we have a very rich set of full-text electronic resources at Laurentian University.

The code

Here is a quick implementation of a target parser that generates search URLs based on ISSN, ISBN, journal title, or book title, checked in that order. Pretty impoverished from an OpenURL perspective, but it maintains the same level of functionality as our previous system. In TargetParser/Evergreen/Conifer.pm I created a target parser called Evergreen::Conifer that implements a subset of the Parsers::TargetParser API for SFX as follows:

    package Parsers::TargetParser::Evergreen::Conifer;
    use Parsers::TargetParser;
    use base qw(Parsers::TargetParser);
    use strict;

    sub getHolding {
      my ($this,$genRequestObj) = @_;

      my $objectType = $genRequestObj->{'objectType'};
      my $ISBN = $genRequestObj->{'ISBN'};
      my $eISBN = $genRequestObj->{'eISBN'};
      my $ISSN = $genRequestObj->{'ISSN'};
      my $eISSN = $genRequestObj->{'eISSN'};
      my $CODEN = $genRequestObj->{'CODEN'};
      my $bookTitle = $genRequestObj->{'bookTitle'};
      my $journalTitle = $genRequestObj->{'journalTitle'};

      # Canonical search results URL for simple searches:
      # http://laurentian.concat.ca/opac/en-CA/skin/lul/xml/rresult.xml?rt=keyword&tp=keyword&t=0895-2779&l=105&d=2&f=&av=
      my $svc = $this->{svc};
      my $egHost = $svc->parse_param('eg_host');
      my $egLocale = $svc->parse_param('eg_locale');
      my $egSkin = $svc->parse_param('eg_skin');
      my $egOrgUnit = $svc->parse_param('eg_org_unit');
      my $egDepth = $svc->parse_param('eg_depth');

      my $path = "http://${egHost}/opac/${egLocale}/skin/${egSkin}/xml/rresult.xml?l=${egOrgUnit}&d=${egDepth}";
      my $searchString = '&rt=keyword&tp=keyword&t=';

      if (defined($ISSN)) {
        if ($ISSN =~ m/x/i) {
          # Current indexer doesn't deal well with ISSNs containing an X, so break it up
          $ISSN =~ s/^(\d{4})-?(\d+)x/$1 -$2 x/i;
          $searchString .= $ISSN;
        } else {
          $searchString .= "\"$ISSN\"";
          # format 9999-9999 for MARC
        }
      }
      elsif (defined($ISBN)) {
        # Evergreen doesn't force ISBNs to be stripped of hyphens, so take whatever
        $searchString .= "\"$ISBN\"";
      }
      elsif (defined($journalTitle)) {
        # Restrict searches to title index, with bibliographic level = s
        $searchString .= "ti:${journalTitle}&bl=s";
      }
      elsif (defined($bookTitle)) {
        # Restrict searches to title index, with bibliographic level = m
        $searchString .= "ti:${bookTitle}&bl=m";
      }

      return ($path . $searchString);
    }

    1;

And here's the help that I added to the corresponding Conifer.hlp file:

General Information

Target - LOCAL_CATALOGUE_EVERGREEN_CONIFER

Service - getHolding

Parser - Evergreen::Conifer

Information needed in the Target Service:

In the PARSE_PARAM field, replace the following information:

eg_host = $$LOCAL_CATALOGUE_SERVER

eg_locale = Locale (en-US, en-CA, fr-CA, etc)

eg_skin = algoma, default, lul, nohin, uwin

eg_org_unit = 103, 1, etc

eg_depth = 0, 1, 2, 3, etc
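To see what these parameters produce without a running SFX instance, here is a minimal stand-alone sketch of the same URL-building logic. The function name build_search_url and the two hash references are hypothetical stand-ins for the SFX service and request objects; the sample values are taken from the canonical URL in the parser's comment. Like the parser itself, the sketch does not URL-escape the search terms.

```perl
use strict;
use warnings;

# Hypothetical stand-alone version of the URL logic in getHolding().
# $cfg mirrors the PARSE_PARAM keys; $req mirrors the request fields.
sub build_search_url {
    my ($cfg, $req) = @_;

    my $path = "http://$cfg->{eg_host}/opac/$cfg->{eg_locale}"
             . "/skin/$cfg->{eg_skin}/xml/rresult.xml"
             . "?l=$cfg->{eg_org_unit}&d=$cfg->{eg_depth}";
    my $search = '&rt=keyword&tp=keyword&t=';

    if (defined $req->{ISSN}) {
        my $issn = $req->{ISSN};
        if ($issn =~ m/x/i) {
            # Same workaround as the parser: split ISSNs that contain an X
            $issn =~ s/^(\d{4})-?(\d+)x/$1 -$2 x/i;
            $search .= $issn;
        } else {
            $search .= "\"$issn\"";
        }
    }
    elsif (defined $req->{ISBN}) {
        $search .= "\"$req->{ISBN}\"";
    }
    elsif (defined $req->{journalTitle}) {
        # Title index, bibliographic level = s (serial)
        $search .= "ti:$req->{journalTitle}&bl=s";
    }
    elsif (defined $req->{bookTitle}) {
        # Title index, bibliographic level = m (monograph)
        $search .= "ti:$req->{bookTitle}&bl=m";
    }
    return $path . $search;
}

# Sample configuration, taken from the canonical URL in the parser comment
my %cfg = (
    eg_host     => 'laurentian.concat.ca',
    eg_locale   => 'en-CA',
    eg_skin     => 'lul',
    eg_org_unit => 105,
    eg_depth    => 2,
);

print build_search_url(\%cfg, { ISSN => '0895-2779' }), "\n";
```

With this configuration, the ISSN 0895-2779 is appended to the query as a quoted phrase, while an ISSN ending in X goes through the splitting workaround instead.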

Findings and wishlists

While it's quite easy to set up Evergreen as a searchable resource, thanks to its straightforward URL syntax, one of the things that leaps out at me is that Evergreen, by default, has no identifier index for limiting searches by ISBN / ISSN / LCCN / OCLCnum. Ideally, such an identifier index would not use full-text indexing, so that we could search accurately for ISSNs that include an X. Right now we have to split ISSNs with an "x" into constituent parts and generate keyword searches on those parts, which results in false hits from across the database. An identifier index would also be useful for limiting Z39.50 searches.

I would also like to teach Evergreen about ISBN-10/ISBN-13 equivalence, to broaden the search while maintaining precision. And I would like to automatically normalize ISSN and ISBN formats so that I don't have to worry about whether a cataloguer entered hyphens or not - and the same for incoming search terms.
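As a rough illustration of what that normalization and equivalence could look like, here is a minimal sketch in plain Perl. It is not Evergreen code: the function names are hypothetical, and it assumes the desired ISSN form is the hyphenated 9999-9999 style used in MARC. The ISBN conversions use the standard check-digit rules (weights 10 down to 2, modulo 11, for ISBN-10; alternating weights 1 and 3, modulo 10, for ISBN-13). In practice a CPAN module such as Business::ISBN would probably be a better fit than hand-rolled code.

```perl
use strict;
use warnings;

# Strip hyphens and whitespace; uppercase a trailing x check digit.
sub normalize_identifier {
    my ($id) = @_;
    $id =~ s/[\s-]//g;
    return uc $id;
}

# Return an ISSN as 9999-9999, or undef if it is not eight characters
# of the expected shape. (The check digit itself is not validated.)
sub normalize_issn {
    my ($issn) = @_;
    my $n = normalize_identifier($issn);
    return undef unless $n =~ /^(\d{4})(\d{3}[\dX])$/;
    return "$1-$2";
}

# ISBN-10 -> ISBN-13: prefix 978, drop the old check digit, recompute.
sub isbn10_to_isbn13 {
    my ($isbn10) = @_;
    my $n = normalize_identifier($isbn10);
    return undef unless $n =~ /^\d{9}[\dX]$/;
    my @digits = split //, '978' . substr($n, 0, 9);
    my $sum = 0;
    for my $i (0 .. $#digits) {
        $sum += $digits[$i] * ($i % 2 ? 3 : 1);   # weights 1,3,1,3,...
    }
    my $check = (10 - $sum % 10) % 10;
    return join('', @digits) . $check;
}

# ISBN-13 -> ISBN-10: only possible for the 978 prefix.
sub isbn13_to_isbn10 {
    my ($isbn13) = @_;
    my $n = normalize_identifier($isbn13);
    return undef unless $n =~ /^978(\d{9})\d$/;
    my @digits = split //, $1;
    my $sum = 0;
    for my $i (0 .. $#digits) {
        $sum += $digits[$i] * (10 - $i);          # weights 10 down to 2
    }
    my $check = (11 - $sum % 11) % 11;
    return join('', @digits) . ($check == 10 ? 'X' : $check);
}

print normalize_issn('1234567x'), "\n";
print isbn10_to_isbn13('0-306-40615-2'), "\n";
print isbn13_to_isbn10('978-0-306-40615-7'), "\n";
```

A search front end could run incoming terms through functions like these and look up both ISBN forms, and an indexer could apply the same normalization to the values cataloguers entered.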

Finally, to support services like xISBN that search for multiple formats and editions of a given work by generating a shotgun blast of ISBNs for all known representations, I would love to teach Evergreen how to accept a list of identifiers as search input.

Don't ask me when these things will happen, though; if it requires work from me, it will probably be 2010 before any of it happens.