Belated Access 2006 notes: Saturday, Oct. 14th

Posted on Tue 31 October 2006 in Libraries

Final entry in publishing my own hastily jotted Access 2006 conference notes--primarily for my own purposes, but maybe it will help you indirectly find some real content relating to your field of interest at the official podcast/presentation Web site for Access 2006. Contents include:

consortial updates from ASIN, Quebec, COPPUL, and OCUL
Thom Hickey's updates on OCLC's Virtual International Authority File (VIAF) and WorldCat Identities
Clifford Lynch's keynote

Consortium update

ASIN, Slavko Manojlovich

ASIN Overview

17 Atlantic academic libraries
300 - 18,000 students
2 unilingual francophone sites
Sirsi, Ex Libris, and Innovative

Why our users hate us

choose format over subject
learn multiple database interfaces
citations presented in confusing formats

Addressing the problems

a la carte user authentication
EZProxy servers
SingleSearch federated search tool over 400 resources (including 100+ open access)
1Cate OpenURL resolvers
Relais ILL
RefWorks/RefShare

Principles

Click, don't type
when you have it, show it
when you don't have it, make it easy to get
focus on appropriate links rather than click counts
let the user determine the appropriate copy from the available formats

Stephen Sloan

Missing ingredient -- enabling subject choice for users, rather than format
working with SirsiDynix on a consortium version of EPS Rooms CMS
production version to be available in 1st quarter of 2007
Rooms is basically a portal environment, with different defaults/scoping for each subject (so that single search

Outstanding challenges

Federated search connectors based on screen scraping will break
Citations from certain resources cannot be linked to Resolver
Cookie pushing in a public environment
Implementation of the NISO Metasearch standard to improve federated searching

Recognizing our differences

Local customization of interfaces
Emulating local default search options--everyone uses EBSCO, but each site has configured different behaviour
Relying on local expertise at each site

COPPUL, Carmen Kazakoff-Lane

COPPUL Overview

ANTS: Using Open Source, Social Software (in the COPPUL consortium)
sharing and updating animated tutorials that were believed to be a better option than long information literacy tutorials
make it easy to locate and use these tutorials (central location and explicit copyright / reuse statement)
Make sharing easy and desirable through quality standards, help, and the allowance for local customization

How does it work?

Project is hosted at http://brandonu.ca/Library/coppul
ask each institution to take responsibility for a certain set of databases so that they can be updated along with the user interface
wiki enables institutions to update database list with status of development, whenever they create a tutorial, or add a new database to the list
rss feeds enable you to track which tutorials have been updated or created
tutorials are housed within a single institutional repository, licensed under CC licenses with options to the creators
Other organizations (like LU) are welcome to participate!

Guy Teasdale, Université Laval

Quebec Digital Infrastructure: The Year in Review

Main players

BAnQ - Bibliotheque et archives nationales du Quebec
CREPUQ - Conference of Rectors and Principals of Quebec Universities
Erudit
Museums
Quebec Gov.
SRC and other media

BAnQ

BNQ started in 1967
April 2005, opening of la Grande Bibliotheque of the BNQ
Jan 2006 - Merger of ANQ and BNQ; mandate to acquire and disseminate collections
October 2006 - Second meeting on digital national library
1996 - beginning of digitization activities
2003 - permanent digitization program
3.2 million pages of digital materials (newspapers, etc) currently in the collection; 62000 images

Meanwhile in the World

Dec 2004 - Google print project: 15 M ebooks by 2010
Jan 2005 - Jeanneney, head of the BNF, reacts in Le Monde with "Quand Google defie l'Europe", which results in the proposal to create a European Digital Library
2010 - European Digital Library expected to hold 6M books
Feb 2006 - Francophone network of digital libraries formed (including France and Quebec)

Meanwhile in Canada

Quebec is participating in Alouette Canada, hoping that nobody is reinventing the wheel

Erudit

18000 scholarly articles from 48 journals
150000 backfiles projected
Erudit schema adopted by www.persee.fr and www.cens.cnrs.fr = francophone interoperability
$3,000 annually to join

OCLC, Thom Hickey

Virtual International Authority File (VIAF)

Link national authority records
Build on their authority work
Move towards universal bibliographic control, while allowing local variations to exist
Deutsche Nationalbibliothek, LoC, and OCLC -- hoping for the BNF (French) national file
OCLC is responsible for the actual coding for the project

Matching variations

In the LCNAF and PND authority files:
- Same name, same person
- Same name, different people
- Different names, same person
- Missing person in one file
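
To make those four cases a bit more concrete, here is a minimal sketch of how two files might be compared on a normalized name key. This is my own illustration, not the VIAF code: the normalization rules, the function names (name_key, compare_files) and the sample records are all assumptions. Note that a shared key only yields a candidate pair; names alone cannot separate "same name, same person" from "same name, different people", and they will miss "different names, same person" entirely.

```python
import unicodedata
from collections import defaultdict


def name_key(name):
    """Reduce a name heading to a comparison key (hypothetical rules)."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_marks)
    return " ".join(cleaned.lower().split())


def compare_files(file_a, file_b):
    """Group (record_id, name) pairs from two files by shared name key."""
    by_key_a, by_key_b = defaultdict(list), defaultdict(list)
    for rec_id, name in file_a:
        by_key_a[name_key(name)].append(rec_id)
    for rec_id, name in file_b:
        by_key_b[name_key(name)].append(rec_id)
    shared = sorted(set(by_key_a) & set(by_key_b))
    return {
        # Same key in both files: candidates only, not confirmed matches.
        "candidates": {k: (by_key_a[k], by_key_b[k]) for k in shared},
        # Keys seen in one file only: a missing person, or a variant name.
        "only_in_a": sorted(set(by_key_a) - set(by_key_b)),
        "only_in_b": sorted(set(by_key_b) - set(by_key_a)),
    }


# Invented sample records, not real LCNAF or PND data.
file_a = [("a1", "Müller, Hans"), ("a2", "Smith, Jane")]
file_b = [("b1", "MULLER, HANS"), ("b2", "Muller, Hans."), ("b3", "Weber, Anna")]
result = compare_files(file_a, file_b)
```

Here the key "muller hans" pairs one record from the first file with two from the second, which is exactly the situation where additional attributes are needed to decide.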

Enhancing the authorities

Bibliographic record -> Derived authority -> Enhanced authority
Authority record -> Enhanced authority

Weaker attributes

Only one of birth/death dates
Subject area of works
Format
Language
Publisher
Partial title match

Even weaker attributes

Date of publication
Country
Role
Format
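
One way to picture how weaker and even weaker attributes could be combined is a simple weighted score. The talk gave no weights or thresholds that I noted, so everything numeric below is invented, the field names are hypothetical, and only a subset of the listed attributes is included (birth/death dates and format are left out). Treat it as a sketch of the idea, not as the VIAF matching logic.

```python
# Hypothetical weights: "weaker" attributes count more than "even weaker" ones.
WEIGHTS = {
    "subject": 3,           # weaker
    "language": 3,          # weaker
    "publisher": 3,         # weaker
    "publication_date": 1,  # even weaker
    "country": 1,           # even weaker
    "role": 1,              # even weaker
}


def match_score(rec_a, rec_b, weights=WEIGHTS):
    """Sum the weights of attributes present in both records and equal."""
    score = 0
    for attr, weight in weights.items():
        value_a = rec_a.get(attr)
        value_b = rec_b.get(attr)
        if value_a is not None and value_a == value_b:
            score += weight
    return score


# Invented records for two candidates that share a name key.
rec_a = {"language": "ger", "publisher": "Suhrkamp", "country": "de"}
rec_b = {"language": "ger", "publisher": "Fischer", "country": "de"}
score = match_score(rec_a, rec_b)  # language (3) + country (1)
```

A missing attribute contributes nothing rather than counting against the pair, which seems the safer default when one file is sparser than the other.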

Compute it

Standard approach:
- Generate keys and data
- Load information into a database
- Index it
- Extract fields needed
Map/reduce approach (adopted from Google):
- Split the database up
- Run parallel jobs against those pieces of the database
- Bring information together through map/reduce
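
The split / run in parallel / bring together steps can be sketched in a few lines of Python. This is an assumption-laden toy: worker threads stand in for the separate cluster nodes, the job (counting records per language) is something I made up, and the function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split(records, pieces):
    """Split the record list into roughly equal pieces."""
    return [records[i::pieces] for i in range(pieces)]


def count_languages(piece):
    """The job run against each piece: count records per language."""
    return Counter(rec["language"] for rec in piece)


def run_parallel(records, pieces=4):
    """Run the job on every piece, then bring the partial results together."""
    with ThreadPoolExecutor(max_workers=pieces) as pool:
        partials = list(pool.map(count_languages, split(records, pieces)))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Invented sample records.
records = [{"language": "eng"}, {"language": "ger"}, {"language": "eng"},
           {"language": "fre"}, {"language": "eng"}]
totals = run_parallel(records, pieces=2)
```

Threads will not speed up CPU-bound Python work; they are used here only so the sketch runs anywhere without a cluster.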

Map/Reduce

Map
- Read in source file (e.g. MARC21)
- Write out key + data
Reduce
- Read in array of data for each unique key
- Write out key + data
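
The map and reduce contract described above fits in a short, single-process sketch. It is not OCLC's implementation: plain dicts stand in for MARC21 records, the choice of key (lowercased name) and data (title) is my assumption, and the function names are hypothetical.

```python
from collections import defaultdict


def map_record(record):
    """Map: read one source record, write out (key, data) pairs."""
    yield record["name"].lower(), record["title"]


def reduce_titles(key, values):
    """Reduce: read the array of data for one key, write out key + data."""
    return key, sorted(set(values))


def map_reduce(records, mapper, reducer):
    """Group mapped data by key, then call the reducer once per unique key."""
    grouped = defaultdict(list)
    for record in records:
        for key, data in mapper(record):
            grouped[key].append(data)
    return dict(reducer(key, values) for key, values in grouped.items())


# Toy records standing in for parsed bibliographic data.
records = [
    {"name": "Austen, Jane", "title": "Emma"},
    {"name": "AUSTEN, JANE", "title": "Persuasion"},
    {"name": "Austen, Jane", "title": "Emma"},
    {"name": "Shelley, Mary", "title": "Frankenstein"},
]
output = map_reduce(records, map_record, reduce_titles)
```

The grouping step in the middle is the part a real framework distributes across machines; the mapper and reducer themselves never need to know how many nodes are involved.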

Map/Reduce implementation

Written in Python
Uses ssh and XML-RPC for control and communication
Map/Reduce seems to add around 10% overhead
Earlier implementation ran on a 48 CPU cluster
Current VIAF cluster is a 12 CPU cluster on 4 nodes
Running Linux and 64-bit Python (no need to worry about 2GB memory limit)

VIAF matching code

17 modules
1,100 lines of code
600 lines of configuration