Fun with Evergreen and SQL: representative record samples

Posted on Thu 04 March 2010 in Libraries

Let's pretend your national library asked you to submit a set of records with holdings representing all of the various formats in your library system. Let's also pretend that you're really lucky and you're running Evergreen. Here's what you would do to get one example of each combination of item type, item form, bibliographic level, literary form, cataloguing form, and video recording format into a scratch table for a given library (ID = 103) in your system:

CREATE TABLE scratchpad.osul_export (record BIGINT); INSERT INTO scratchpad.osul_export   SELECT record FROM (    SELECT DISTINCT ON (mrd.item_type, mrd.item_form, mrd.bib_level, mrd.lit_form, mrd.cat_form, mrd.vr_format)         mrd.record, mrd.item_type, mrd.item_form, mrd.bib_level, mrd.lit_form, mrd.cat_form, mrd.vr_format     FROM biblio.record_entry bre       INNER JOIN asset.call_number acn ON acn.record = bre.id       INNER JOIN asset.copy ac ON ac.call_number = acn.id      INNER JOIN metabib.rec_descriptor mrd ON mrd.record = bre.id     WHERE bre.deleted IS FALSE AND acn.deleted IS FALSE AND ac.deleted IS FALSE AND acn.owning_lib = 103    ORDER BY mrd.item_type, mrd.item_form, mrd.bib_level, mrd.lit_form, mrd.cat_form, mrd.vr_format  ) AS formats  ORDER BY record;

And then, because you were asked to provide a total of 2000 records for this representative sample, you might fill up the remaining 1800 records as follows:

INSERT INTO scratchpad.osul_export  SELECT bre.id   FROM biblio.record_entry bre    INNER JOIN asset.call_number acn ON acn.record = bre.id    INNER JOIN asset.copy ac ON ac.call_number = acn.id    INNER JOIN reporter.super_simple_record rsr ON rsr.id = bre.id  WHERE bre.deleted IS FALSE AND acn.deleted IS FALSE AND ac.deleted IS FALSE AND acn.owning_lib = 103    AND bre.id NOT IN (      SELECT record        FROM scratchpad.osul_export    ) AND substring(bre.id::text from (length(bre.id::text)) for 1)::int = 8    AND bre.id % 17 = 0  ORDER BY rsr.author DESC  LIMIT 1800;

... which, of course, gives you the records with a record ID ending in '8' and (to whittle it down further) records where record ID modulo 17 is 0 - and finally, just the first 1800 records ordered by author name in descending order.

All of this will give you 2000 record IDs in scratchpad.osul_export that you can then extract into a text file and feed into Evergreen's Open-ILS/src/support-scripts/marc_export script to dump the MARC records with holdings in the 852 field from your system. Beautiful, eh?