Classification scheme-aware call number sorting in Evergreen

Posted on Mon 09 August 2010 in Libraries

As a librarian who works at a library that primarily uses the Library of Congress classification scheme, I have been interested for a long time in teaching Evergreen to be aware of call number schemes other than Dewey. The problem, in a nutshell, is that Evergreen simply applies an alphabetical sort against the uppercased version of the call number when generating call number browser displays. Because the comparison is character by character, "215" sorts ahead of "22", resulting in LC call numbers that sort incorrectly, like:

K 215 .E53 W37 1997
K 22 .U748 v.18
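
To make the mis-sort concrete, here is a minimal Python sketch. It is not Evergreen's code: the padded_key helper and its fixed width of 4 are assumptions for illustration only, and a real LC normalizer has to handle decimals, Cutters, dates and volume numbers as well.

```python
import re

call_numbers = ["K 22 .U748 v.18", "K 215 .E53 W37 1997"]

# Plain alphabetical sort on the uppercased string, as described above.
# "K 215" lands before "K 22" because '1' < '2' at the fourth character.
naive_order = sorted(call_numbers, key=str.upper)


def padded_key(call_number: str, width: int = 4) -> str:
    """Toy sort key (hypothetical): zero-pad the first run of digits."""
    upper = call_number.upper()
    return re.sub(r"\d+", lambda m: m.group(0).zfill(width), upper, count=1)


# With the class number padded, 0022 sorts ahead of 0215 as a human expects.
scheme_aware_order = sorted(call_numbers, key=padded_key)

print(naive_order)
print(scheme_aware_order)
```

Padding only the first number is enough for this pair, but it is a toy rule rather than a usable LC sort key.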

When the subject recently came up on the open-ils-general mailing list, I decided to follow up with some code. So, as of this weekend, Evergreen trunk now has a generalized infrastructure for generating sort keys for call numbers. The broad strokes of the current implementation are:

The classification scheme is set at the level of the call number.
Classification schemes are defined in the asset.call_number_classification table with a pointer to a database function to call to generate a normalized sort key for the given call number.
Three classification schemes are available out of the box:
- Generic (the default) - a simple normalization approach that produces reasonable results in the absence of special rules for Cutters, etc.
- Dewey (DDC) - a normalization routine taken from the Koha C4::ClassSortRoutine::Dewey Perl module
- Library of Congress (LC) - a normalization routine that simply wraps Bill Dueber's excellent Library::CallNumber::LC Perl module
Adding more classification schemes is just a matter of adding another row to the asset.call_number_classification table, along with the appropriate sortkey-generating database function.
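
The table-plus-function design described above can be pictured as a lookup from scheme name to normalizer. The following is a minimal Python sketch, not Evergreen's actual database code: the dictionary stands in for asset.call_number_classification, and the function names (generic_sortkey, toy_lc_sortkey, sortkey), the toy normalization rules, and the fallback to Generic for unknown schemes are all assumptions made for illustration.

```python
import re
from typing import Callable, Dict


def generic_sortkey(label: str) -> str:
    """Toy generic normalizer (hypothetical): uppercase, collapse whitespace."""
    return " ".join(label.upper().split())


def toy_lc_sortkey(label: str) -> str:
    """Toy LC-style normalizer (hypothetical): also zero-pad the first number."""
    return re.sub(
        r"\d+", lambda m: m.group(0).zfill(4), generic_sortkey(label), count=1
    )


# Stand-in for the classification table: scheme name -> normalizer function.
CLASSIFICATIONS: Dict[str, Callable[[str], str]] = {
    "Generic": generic_sortkey,
    "LC": toy_lc_sortkey,
}


def sortkey(label: str, scheme: str = "Generic") -> str:
    """Look up the scheme's normalizer; unknown schemes fall back to Generic."""
    normalizer = CLASSIFICATIONS.get(scheme, generic_sortkey)
    return normalizer(label)


print(sortkey("k  22 .u748", "LC"))
print(sortkey("k  22 .u748"))
```

In this sketch, supporting another scheme means writing one more normalizer and adding one more entry to CLASSIFICATIONS, which mirrors the "one row plus one function" idea.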

Note that this is the first time, to my knowledge, that Koha code has been adopted directly by Evergreen. I included attribution for the copyright holders in both the Generic and Dewey normalization functions. I wrote the Generic implementation in Evergreen from scratch shortly after taking a look at Koha's approach, so in some corners my work would be considered a "derived work". Koha's Dewey normalization function was (somewhat surprisingly) the only open-source implementation that I could find for Dewey, so it made perfect sense to me to adopt that for use in Evergreen. Many thanks to Koha for their use of the GPL v2 or later licence!

There are still some limitations and low-hanging fruit that I hope to address in the near future:

Right now you can only manipulate classification schemes via SQL. The Holdings Maintenance dialogue needs to give cataloguers the ability to set the classification scheme for each call number, because I'm sure they don't want to drop down to the command line. This setting should probably be sticky during a given session, so that if they're processing a cart of government docs, they won't have to change the scheme from the default to CODOC for each item.
Speaking of defaults, each library needs to be able to define a default classification scheme - so your consortium can have a Dewey library and an LC library and a SUDOC library, and their preferences won't trample each other. This can just be a simple org-unit setting.
Following on Mike Rylander's advice, the asset.call_number_classification table should gain a new column that lists the field/subfield combinations used to find the appropriate call number (if any) for each scheme in a given bibliographic record. Then the Holdings Maintenance dialogue can offer the appropriate call number based on the classification scheme.
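
As a rough illustration of how the per-library default mentioned above might interact with a scheme set on an individual call number, here is a minimal Python sketch. Nothing in it exists in Evergreen: the library names, the library_defaults dictionary (standing in for an org-unit setting) and the effective_scheme function are hypothetical, and the precedence order is an assumption.

```python
from typing import Dict, Optional

# Fallback when neither the call number nor the library specifies a scheme.
DEFAULT_SCHEME = "Generic"

# Hypothetical per-library defaults, standing in for an org-unit setting.
library_defaults: Dict[str, str] = {
    "Dewey Branch": "Dewey",
    "LC Branch": "LC",
}


def effective_scheme(library: str, call_number_scheme: Optional[str] = None) -> str:
    """Assumed precedence: explicit call number scheme, library default, Generic."""
    if call_number_scheme is not None:
        return call_number_scheme
    return library_defaults.get(library, DEFAULT_SCHEME)


print(effective_scheme("LC Branch"))
print(effective_scheme("LC Branch", "Dewey"))
print(effective_scheme("Unconfigured Branch"))
```

Under these assumptions, a cataloguer's explicit choice always wins, and a library that has not configured anything still gets the Generic behaviour.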