Access 2006 notes: October 12

Posted on Thu 19 October 2006 in misc

My continuing summaries from Access 2006. Thursday, October 12th, was the first "normal" day of the conference, featuring the following presentations:

Open access, open source, content deals: who pays?

Leslie Weir, VP/President-Elect of CARL

Open access mandates from research funders such as CERN, and near mandates from research funders such as SSHRC, are encouraging

University of Ottawa:

  • disappointing experience with BioMed Central which touts itself as an open access journal (following the "author pays" model)
  • UofO had subsidized their researchers until this year, when BioMed demanded a 15x increase in their funding
  • but researchers are now campaigning for a return of the subsidies

Open source modules as an alternative to the traditional commercial ILS

  • Libraries have been early adopters of technology, starting with mainframes, through PCs and Internet access
  • ILS reflects the need for integrated information
  • While our OPACs were originally cutting-edge, they've been relatively unchanged since their inception compared to rich catalogues offered by commercial retailers
  • Ex Libris touts Primo as a way for libraries to unlock the potential of their information
  • McMaster announced on September 6th plans to adapt the same search and browse technologies underlying Home Depot's catalog system (Endeca) to provide faceted search and browse

Role of Scholar's Portal in OCUL vs. individual OPACs:

  • Should we implement a shared ILS?
  • If so, should it be a vendor's ILS, a collaborative open source ILS, or a new layer of discovery tools over top of a traditional ILS?

Content deals for consortial arrangements:

  • Canadian libraries invest heavily in consortial access to e-resources
  • OCUL invests approximately $20 million annually for 20 university members
  • Problem is how to divide the costs:
    • OCUL in 2002/2003 tried to adopt a five-year plan for a "user-pay" model based on the amount of use of each product
    • Now in the fourth year of the plan, many institutions were experiencing unsustainable year-over-year increases of 20% (the capped maximum), while others were experiencing significant decreases
    • Budgetary certainty has quickly become a stronger requirement for the formula, while continuing to refine the balance of ability to pay vs. use
  • CRKN began with a grant for $59 million for traditional 'hard' sciences
  • On November 4th CRKN is going before an international panel to propose a $48 million project to move into social sciences and humanities
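
The tension in the "user-pay" formula can be seen with a little arithmetic. The following is only a rough sketch, assuming costs are split in proportion to use and each member's increase is limited to 20% over the previous year. The function name, member names, and all figures are hypothetical, not OCUL's actual formula or data.

```python
def allocate_costs(total_cost, usage, previous_share, cap=0.20):
    """Split total_cost in proportion to usage, capping each member's increase.

    Returns (shares, shortfall). The shortfall is whatever remains uncovered
    once the cap is applied; how to recover it is a policy decision that this
    sketch deliberately leaves open.
    """
    total_use = sum(usage.values())
    shares = {}
    for member, use in usage.items():
        raw = total_cost * use / total_use            # pure usage-based share
        ceiling = previous_share[member] * (1 + cap)  # most this member can be charged
        shares[member] = min(raw, ceiling)
    shortfall = total_cost - sum(shares.values())
    return shares, shortfall


# Hypothetical figures for three members, not real consortium data.
usage = {"A": 600, "B": 300, "C": 100}
previous = {"A": 400.0, "B": 400.0, "C": 200.0}
shares, shortfall = allocate_costs(1000.0, usage, previous)
```

In this made-up case, member A's usage would justify a 50% jump, so the cap holds it to a 20% increase, members B and C see decreases, and the capped amount has to be found somewhere else. That is the budgetary-certainty problem in miniature.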

How to deal with the administrator who questions the value of library services in the age of Google?

  • OCLC has determined that the vast majority of the information that is available on the Internet is not freely available
  • Libraries have a role in providing the keys to that information
  • Accepting that not all roads lead through the library and integrating with other services like Google Scholar will help lead users to the information they need
  • Google has raised the bar; it's no longer okay to tell our users "These are the 22 interfaces you need to learn to find the information you need"

Our Ontario: Yours to Recover

Art Rhyno, Walter Lewis

Knowledge Ontario

  • Knowledge Ontario grew out of the Ontario Digital Library and is now constituted by six projects; Our Ontario (search.ourontario.ca) reflects community content
  • Hope to use Lucene as the "digital hammer" to search community content (pulling from newspaper records, historical society data, etc)
  • Excellent at merging indexes, so potential for libraries to use their regular workstations to index in parallel
  • Excellent separation between indexing/searching layer and presentation layer
  • OurOntario includes digitized books (searchable via additional metadata), fonds (indexing EAD), images from Images Canada (stripped-down DC), TEI documents
  • AlouetteCanada is a layer on top of the same Lucene backend, using Solr to present facets
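
The point about merging indexes is easier to picture with a toy example. This is a minimal sketch in plain Python, not Lucene's API: each workstation builds a small inverted index over its own batch of records, and the partial indexes are then combined. The function names and sample records are made up for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Build a toy inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def merge_indexes(*indexes):
    """Merge partial indexes by taking the union of each term's postings."""
    merged = defaultdict(set)
    for index in indexes:
        for term, postings in index.items():
            merged[term] |= postings
    return merged


# Two hypothetical workstations each index their own batch of records.
batch_a = {"n1": "Essex harbour photograph", "n2": "Windsor ferry schedule"}
batch_b = {"h1": "Windsor historical society minutes"}
merged = merge_indexes(build_index(batch_a), build_index(batch_b))
```

Because the batches are indexed independently, the indexing step can run in parallel on separate machines, with only the merge done centrally.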

Great Lakes Images

  • Great Lakes Images presents pictures with Google Map integration to show where the pictures were taken
  • Includes images donated from individuals, questions raised by KnowledgeOntario staff, and comments from users that actually correct or enhance the metadata for the records
  • Museums etc. often don't make the high-res version of the image available for free, and that's fine with KnowledgeOntario

Local stories

  • Full community newspapers that are ignored by most aggregators or services
  • Essex Free Press includes PDF directly from the production system, with fully searchable text that is extracted from the PDF

Improving the Catalogue Interface using Endeca

Tito Sierra, NCSU Libraries

  • NC State was the first library to use Endeca as a front end to the library system
  • Endeca is a software company in Cambridge, MA that is a software provider for Home Depot, Indigo, and others
  • McMaster has just announced that they will be using Endeca
  • Project details: http://www.lib.ncsu.edu/endeca

Motivation:

  • Improve the user experience of the library catalogue while keeping the backend system the same
  • Exploit the existing authority infrastructure possible within the MARC record format

Why Endeca?

  • Relevant result ranking (previous system provided "most recently added to the catalog" approach)
  • Faceted browsing to narrow search results, or to start with a browse:
    • Facets include language, LC classification, subject genre, subject access, author, most popular (based on circulation data)
    • You can deselect a previously chosen facet at any time
  • Performance and speed
  • Search "comforts":
    • Spell correction (auto-correct)
    • "Did you mean..." (prompts user)
    • Stemming
  • Sort options (publication date is most popular)
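
For readers who haven't used a faceted catalogue, the narrowing and deselecting behaviour described above can be sketched in a few lines. This is an illustration in plain Python, not how Endeca implements it. The record fields and function names are assumptions made up for the example.

```python
from collections import Counter


def facet_counts(records, field):
    """Count how many records fall under each value of a facet field."""
    return Counter(r[field] for r in records if field in r)


def apply_facets(records, selected):
    """Keep only the records that match every currently selected facet value."""
    return [r for r in records if all(r.get(f) == v for f, v in selected.items())]


# Hypothetical catalogue records.
records = [
    {"title": "Intro to Botany", "language": "English", "format": "Book"},
    {"title": "Botanique", "language": "French", "format": "Book"},
    {"title": "Plant Life", "language": "English", "format": "DVD"},
]

selected = {"language": "English"}
narrowed = apply_facets(records, selected)   # narrow by one facet
counts = facet_counts(narrowed, "format")    # counts shown beside remaining facets

# Deselecting a facet is just removing it from the selection and re-filtering.
selected.pop("language")
widened = apply_facets(records, selected)
```

Keeping the selection as a simple mapping is what makes it cheap to drop any facet at any time, regardless of the order in which facets were chosen.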

Relevance ranking controlled by order of ranking modules:

  • Original query match
  • Phrase match
  • Field match (tiered -- multiple submodules)
  • Number of fields matched
  • Weighted frequency
  • Publication date descending
  • Circulation date descending
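
One way to think about "order of ranking modules" is as a tiered sort: the first module decides the order, and each later module only breaks ties left by the ones before it. The sketch below assumes that interpretation and uses three simplified stand-in modules. The scoring functions and sample records are hypothetical, not Endeca's actual modules.

```python
def phrase_match(record, query):
    """1 if the whole query appears as a phrase in the title, else 0."""
    return int(query.lower() in record["title"].lower())


def fields_matched(record, query):
    """Count how many text fields contain at least one query term."""
    terms = query.lower().split()
    return sum(
        any(t in record[field].lower() for t in terms)
        for field in ("title", "subject")
    )


def pub_year(record, query):
    """Newer publications rank higher when earlier modules tie."""
    return record["year"]


# Order matters: earlier modules dominate, later ones only break ties.
RANKING_MODULES = [phrase_match, fields_matched, pub_year]


def rank(records, query):
    def key(record):
        return tuple(module(record, query) for module in RANKING_MODULES)
    return sorted(records, key=key, reverse=True)


records = [
    {"title": "Garden Design", "subject": "Landscape", "year": 2001},
    {"title": "Design of Gardens", "subject": "Garden design", "year": 2005},
    {"title": "Garden Design Basics", "subject": "Gardening", "year": 1998},
]
ranked = rank(records, "garden design")
```

Here the newest record ends up last because it fails the phrase match, which shows why tuning the module order is the hard part: reordering the list changes the results without changing any individual score.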

Interface was built from scratch on top of the Endeca search software and indexing:

  • Benefit is that NCSU has local control over the UI to enhance it based on user feedback
  • URL is persistent -- so you can copy and paste the URL resulting from a search and share it with others

Features not supported:

  • Work level aggregations / roll-up (no ability to show the item rather than the expression)
  • Customization
  • Folksonomies
  • Recommendations

Technical overview:

  • Co-exists with SirsiDynix Unicorn ILS and Web2 online catalog
  • Still uses Web2 authority search
  • Export MARC records
  • Endeca parses the MARC records on a nightly basis
  • Resources: 5 IT staff, 1 cataloging librarian, 1 reference librarian
  • Timeline: License in spring 2005, production in summer 2006

Implementation challenges:

  • Some feedback suggests too many facets have been exposed
  • Relevance ranking is hard
  • Faceted navigation across logical groupings (like item type of audio-visual for CDs, tapes, etc)

Usage statistics:

  • Most popular facets:
    • Subject topic
    • LC classification
    • Format
  • Keyword search (default) is most popular by far

Usability test compared 5 users on old system with 5 on new system

  • New catalog was proven to be easier and faster on 9 out of 10 tasks

Reflections

  • Post-launch, new interface required backend data cleanup as many interesting variations were exposed by the facets
  • Search and discovery were "broken" with the old interface

Lightning talks:

John Durno, UVic

Building a backup catalogue in a week and a half

  • Facing a four-day Voyager upgrade & outage
  • Considered WorldCat, but it only allows 16 simultaneous users
  • Zebra indexed 1 million records in 40 minutes on a reasonable development box (1GB RAM)
  • Basic PHP interface through PHP/Yaz to talk to Zebra via Z39.50
  • Major challenge was documentation of indexdata.dk technology, e.g. building queries

Godmar Back and Annette Bailey, Virginia Tech

MAJAX or "Look Ma, no server"

  • OpenURL services provide a link to check catalog for searches that don't work
  • Desire to integrate live OPAC information (MARC records, holdings records) into web pages in your organization's domain
  • JavaScript library provides an AJAX interface to III Millennium catalog
  • Does not require any changes to Millennium installation

Jeremy Frumkin, OSU

LibraryFind

  • attempt to build a federated search tool that is not slow
  • the approach: cache search results
  • version 1 built with PHP & MySQL, will be thrown away with version 2 (RoR)
  • layout deliberately modeled on the Google interface; user testing shows that the AdWords-style column gets ignored (no surprise there)
  • does describe related terms
  • linked titles are available in full text; there is also a link to an AJAX-y link resolver that no users clicked
  • show/hide affordance also wasn't used
  • Jeremy and Mr. T (including full photoshopping of Jeremy with a Mr. T haircut)
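
The "cache search results" approach can be sketched roughly as follows. The notes don't record how LibraryFind actually implements its cache, so everything here (class name, time-to-live, the fake target) is an assumption for illustration only.

```python
import time


class CachedFederatedSearch:
    """Query several targets, reusing recent results instead of re-fetching."""

    def __init__(self, targets, ttl_seconds=300):
        self.targets = targets  # target name -> callable(query) -> list of results
        self.ttl = ttl_seconds
        self.cache = {}         # (target name, query) -> (timestamp, results)

    def search(self, query):
        results = []
        now = time.monotonic()
        for name, fetch in self.targets.items():
            key = (name, query)
            cached = self.cache.get(key)
            # Only hit the (slow) remote target on a cache miss or stale entry.
            if cached is None or now - cached[0] > self.ttl:
                cached = (now, fetch(query))
                self.cache[key] = cached
            results.extend(cached[1])
        return results


calls = []


def slow_catalogue(query):
    """Stand-in for a slow remote target; records each real request."""
    calls.append(query)
    return [f"catalogue hit for {query}"]


searcher = CachedFederatedSearch({"catalogue": slow_catalogue})
first = searcher.search("salmon")
second = searcher.search("salmon")  # served from the cache
```

The trade-off is freshness: a longer time-to-live means faster repeat searches but a greater chance of showing stale results from a target that has since changed.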

David Fiander, UWO

How I spent my summer as an Evergreen developer

  • Georgia Public Library consortium decided that it would be cheaper to devote 6 people to developing their own ILS rather than continue to pay a vendor
  • System did slow down the day it launched due to the immense load
  • Development team works entirely online (email and chatrooms), with sporadic meetings
  • David wrote the code that enables the self-checkout terminals as a separately available open source module
  • Koha would like to offer self-checkout, David will probably adapt the module for them

Terry Reese, OSU

Intro to MarcEdit

  • Written in C#, works well on Windows, reasonably well under Mono on Linux and Mac
  • Converts between MARC and various XML formats
  • Converts between MARC-8 and Unicode
  • Includes a Z39.50 client
  • Designed to be programmable rather than forcing you to use the GUI
  • Possible cataloging tool replacement for SmartPort / WorkFlows?

Mark Jordan, SFU

"Shameless self-promotion"

  • Mark wrote a book, "Putting Content Online: A Practical Guide for Libraries" (Chandos Publishing)
  • Wanted to make it fully open access, in the belief that making it available would encourage sales of more copies
  • Two sample chapters are online

Walter Lewis, Windsor

What do we do with content that's too big to fit on the screen?

  • Frequently occurs with maps, high-res photos
  • First option is to provide a small version of the photo, but that doesn't give proper treatment to the real image
  • The imaging engine behind it is MrSID (proprietary, LizardTech), which gives a Google Maps-like ability to drag the image around
  • MrSID is geographically aware, so it knows which way is North on a map
  • You can index X-Y offsets that would make it possible to deliver historical map pointering
  • JPEG2000 standard includes some x-y axis capabilities

Towards a Canadian Digital Information Strategy

Susan Haigh, LAC

More information -- http://collectionscanada.ca/cdis

Why the need for a national strategy?

  • current efforts are fragmented, uncoordinated, under-resourced
  • incapable of preserving what we create
  • inconsistent across the provinces
  • weak in comparison to other countries

Why now?

  • enough accrued experience
  • volume of digital content is rising sharply
  • methodology is becoming clearer, storage costs are coming down

Who or what can it serve?

  • research, elearning, community building, e-democracy, public accountability

Canada's Scientific Infostructure (Csi): Universal, seamless and permanent access to information for Canadian research and innovation

Lucie Molgat, Director Csi Program, CISTI

What is CISTI?

  • Canada's national science library
  • Has experience providing seamless access for NRC users across Canada
  • Publisher of 15 journals

Csi vision

  • National information and infrastructure network to provide seamless and permanent access to full-text digital content

Csi components:

  • Infrastructure, repository, and services and tools
  • Using an SOA to enable partner projects
  • Csi's repository currently has 5.7 million articles licensed from providers, will have 7 million by spring 2007

AlouetteCanada

Brian Bell, AlouetteCanada

  • http://alouettecanada.ca
  • Single-line summary: provide views over top of a comprehensive collection of regional and local digital repositories
  • Addendum: variety of methods to encourage more content, including making a Web interface available for small regional organizations, Digimobiles for sending out trainers of the trainers and technology to sites that need assistance, and harvesting / normalization of existing repositories

CIDL

Bill Maes, CIDL -- http://www.collectionscanada.ca/cidl/

  • "We are bereft of content"
  • YouTube didn't need ten years of vision statements and meetings about infrastructure to get up and running
  • CIDL protocol from 10 years ago is fairly applicable today, but needs to include more than just libraries
  • One of the major problems so far has been a lack of content
  • Need more products like OurRoots with actual content
  • CIDL encourages AlouetteCanada to be inclusive -- to get everybody in the game -- but don't wait for another year to go by; DO IT NOW.
  • While CARL has been an important part of startup support, AlouetteCanada needs to distance itself from CARL to encourage more partners to join in the effort
  • But another goal is to get Canadians to actually view this content, and providing access to the content is only part of the problem; getting people interested at even a fraction of the rate at which Canadians have been downloading YouTube videos is critical.

Progress and Prospects for European Digital Libraries (TEL – European initiative)

Ron Davies

MACS (mapping subject headings across languages)

  • English (LC), French (Rameau), and German (SWD)
  • Currently have 30,000 mappings in total, 3,000 mappings across all three languages

BS 8723 revises UK/ISO standards for thesauri

  • Includes guidance on mapping between different structured vocabularies and ontologies
  • Includes advice on standardizing electronic exchange