The Legend of Zero

ORIGINALLY PUBLISHED ON February 17, 2016 at http://blogs.brandeis.edu/libsys/2016/02/17/the-legend-of-zero/

The scene: A library on a stormy night. Four undergrads huddle in the info commons, working on a project.

Sophia: Let’s search the catalog for stuff to back up our idea.
Raj: What if we don’t find anything? Our idea is pretty obscure.
Sophia scoffs: Of course we’ll find SOMETHING.
Ben: I heard there’s a place, in the darkest reaches of the web, where you can search but you won’t find anything.
Greg: That’s just an old librarian’s tale. It’s not true. That couldn’t happen.

Suddenly the lights go out!
A beam of light pierces the darkness. The students look up and gasp.
A disembodied bespectacled face hovers, illuminated by a flashlight. It is the old Gen-X librarian!
She cackles: Well, ACK-tually, it’s true. In the olden days it wasn’t uncommon to search the catalog and find ZERO HITS!
(more gasping from the students; lightning; thunder)

And it really was true. “No results” or “Zero Hits”, or however we labeled it, wasn’t that uncommon. It happened frequently enough that we coded our OPACs to keep statistics on it, and people put quite a bit of thought into how to solve the Zero Results Problem.

Enter the Next Gen Catalog.

One of the key selling points of so-called Next Gen Catalogs was the promise of never seeing zero results. A searcher would always get some results – and the more results the better.

That promise has never been realized, though. Typos occur, some searchers aren’t great at spelling, and if you’re searching for something obscure enough you might not get any results back. But in general, yes, you do get a big chunk of hits that you can explore further to find what you need. The prevailing theory seems to be that if you return enough results the searcher can use facets to winnow the results and this will inevitably lead to something useful.

What if, though, this embarrassment of riches we are presenting to searchers isn’t wholly helpful?

  • A deluge of hits can be overwhelming.
  • The hits we get may be irrelevant because we’re not choosing the right terms to search on.
  • The art of known-item-searching has suffered. Sometimes the searcher knows exactly what they seek and has trouble locating it in a pile of 10,000 results.

How do we solve this new problem we’ve introduced?

How do we serve:

  • The explorer who benefits from the serendipitous discovery of material they didn’t know to look for
  • The known-item searcher who just needs the thing they need to appear in the first couple of hits
  • The bad typist
  • The experienced searcher who doesn’t yet know the vocab of their new quest
  • The novice searcher who has no idea where or how to begin

All with the same system?

I don’t have any mind-blowing paradigm-shifting ideas, but I did have occasion to stop by the exhibits at ALA Midwinter in Boston to visit a few people I know from my misspent youth as a vendor.

I saw a couple of interesting features of EBSCO’s EDS platform and asked a few probing questions.

Attempts to address both zero results and irrelevant results

Did You Mean?
EDS brings up something as a “Results may also be available for” suggestion nearly every time a search would result in 0 hits, and sometimes when a search does bring up hits. E.g. it suggests I might mean golfers when I search for goobers, even though I find plenty of hits about peanuts. The current version of this feature in the field performs poorly when given a misspelling like guestss. EBSCO has a newer version being deployed soon, I believe, so it will be interesting to see if it handles these types of typos better. Unsurprisingly I was able to stump it when putting in cat-like-typing gibberish. “ghksajhaekjdsdklk” stumps even Google.
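
To make the idea concrete, here is a minimal sketch of a Did You Mean that only suggests terms the index actually contains. It is a generic illustration, not how EDS works: the vocabulary list, the function name and the similarity cutoff are all assumptions, and the matching comes from Python's standard difflib module.

```python
import difflib

# Hypothetical vocabulary; a real system would draw terms from its own index.
INDEXED_TERMS = ["guests", "golfers", "goobers", "peanuts", "libraries"]


def did_you_mean(query, vocabulary=INDEXED_TERMS, limit=3, cutoff=0.75):
    """Return indexed terms that closely resemble the query, best match first."""
    normalized = query.strip().lower()
    if normalized in vocabulary:
        return []  # exact match: nothing to suggest
    # get_close_matches ranks candidates by similarity ratio and drops
    # anything that scores below the cutoff.
    return difflib.get_close_matches(normalized, vocabulary, n=limit, cutoff=cutoff)


if __name__ == "__main__":
    for term in ("guestss", "goobers", "ghksajhaekjdsdklk"):
        print(term, "->", did_you_mean(term))
```

With this toy vocabulary, a doubled letter like guestss is close enough to suggest guests, while cat-like-typing gibberish falls below the cutoff and gets no suggestion at all. Raising or lowering the cutoff trades missed typos against silly suggestions.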

Auto-complete
This is probably the feature I most wish Primo had. (I understand some form of it is available, but only for Primo SaaS customers.) EDS has two levels of “auto-complete” – their Popular Terms suggestions and their Publications suggestions.
Popular Terms are culled from previous searches by all EBSCO customers, and can change from day to day. If other people are searching for something it will appear in the list of Popular Terms. E.g. typing in obama might bring up a list of suggestions starting with the string obama – e.g. obamacare and obama, barack. When I first saw the auto-complete feature I hoped it was drawing directly from the metadata, so that a searcher wouldn’t be prompted to search for something they’d get no hits for (Infor’s Iguana product does auto-complete this way), but no such luck. Since EDS is searching such a big database, I am not sure how often this would be an issue for searchers, but I think it gives a false promise that something exists when it appears in auto-complete.
Publications – if a searcher enters a string which exactly matches the title of a publication, that publication will appear in the suggestions area. Seeing this was a real Hallelujah moment! It solves one of my colleague’s biggest frustrations with Primo (and next-gen catalogs in general) – we call it the Times of London problem. In our Primo catalog it is difficult to locate the resource you want when you search for Times of London. In EDS, however, if you begin typing times o… Times of London appears in the Publications suggestion. Click the suggestion and the publication we want is the first hit. Time magazine is similarly easy to get to. Nature is still a bit elusive, since it has so many permutations, but still easier to locate than in Primo. I’m almost afraid to publish this blog post lest my colleague find out the answer to one of her biggest frustrations is out there and I can’t deliver it (yet).
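
For comparison, here is a rough sketch of the metadata-driven auto-complete I was hoping for, where every suggestion is a title that really exists in the index. It is not EDS's or Iguana's implementation; the title list and function names are hypothetical, and the lookup is a simple prefix match over a sorted list.

```python
import bisect

# Hypothetical sample of titles taken from the catalog's own metadata.
TITLES = ["Nature", "Time", "Times of London", "Times Literary Supplement"]


def build_index(titles):
    """Sort (lowercased title, original title) pairs for prefix lookup."""
    return sorted((title.lower(), title) for title in titles)


def suggest(prefix, index, limit=5):
    """Return up to `limit` indexed titles that start with the prefix."""
    key = prefix.strip().lower()
    if not key:
        return []
    # Jump to the first entry that sorts at or after the prefix.
    start = bisect.bisect_left(index, (key,))
    matches = []
    for lowered, original in index[start:]:
        if not lowered.startswith(key):
            break  # sorted order: no later entry can match either
        matches.append(original)
        if len(matches) == limit:
            break
    return matches


if __name__ == "__main__":
    index = build_index(TITLES)
    print(suggest("times o", index))
    print(suggest("time", index))
```

Because suggestions come straight from the indexed titles, the searcher is never prompted toward something that returns zero hits. The trade-off is that it knows nothing about what other people are searching for, which is what Popular Terms provides.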

Attempts to address insufficient or irrelevant results

Placards – this brilliant little feature of EDS was what really piqued the interest of the systems team. Placards are context-sensitive boxes/areas that appear when a search meets defined criteria. E.g. if someone searches for “library hours”, a placard can display the library’s hours, or a link to the library homepage, or… you get the idea. You can write code to link to an external subject guide, based on a searcher’s keywords. My favorite placard I’ve seen so far (I got the impression it comes standard with EDS but I haven’t actually fact-checked that so don’t quote me) appears when your search is an exact match to a publication indexed in EDS. A search box appears allowing you to search immediately within that publication. The next logical step in my mind is to make sure that type of search box also appears if someone searches for JSTOR, which happens a lot. A lot a lot!
Primo could be tricked into doing something similar, using tools like JavaScript, IF it’s code you can embed in the footer.html. It just looks like a lot less work and kludgery (it’s a word, I swear) in EDS, and theoretically less prone to breaking during every upgrade.
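
Stripped of any vendor specifics, the placard idea boils down to a list of rules checked against the query. The sketch below is only an illustration under my own assumptions: it is not the EDS configuration format or a Primo customization, and the rule patterns, messages and URL are placeholders.

```python
import re
from dataclasses import dataclass


@dataclass
class Placard:
    pattern: str  # regular expression that must match the whole query
    message: str  # text shown in the placard


# Hypothetical rules; the messages and URL are placeholders.
PLACARDS = [
    Placard(r"(library )?hours", "Library hours: see https://library.example.edu/hours"),
    Placard(r"jstor", "Looking for JSTOR? Search within it directly."),
]


def placards_for(query, rules=PLACARDS):
    """Return the messages of every placard whose pattern matches the query."""
    # Lowercase and collapse whitespace so "Library  Hours" still matches.
    normalized = " ".join(query.lower().split())
    return [rule.message for rule in rules if re.fullmatch(rule.pattern, normalized)]


if __name__ == "__main__":
    for query in ("Library  Hours", "JSTOR", "history of jstor"):
        print(query, "->", placards_for(query))
```

Matching the whole query, rather than any substring, keeps a placard from popping up on searches that merely mention a trigger word, such as history of jstor.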

These 3 features ideally would be available in any discovery system. Vendors have been flirting with the known-item-searching flaw in post-OPAC systems for years. I think EBSCO is moving in the right direction to solve some aspects of the problem in EDS. I’d really like to see all vendors acknowledge the problem and work to solve it.

I’m very interested to see what comes down the pike for these and other features in Primo, EDS and other discovery systems.

I’ll conclude my post-ALA musings with my wish list for ALL discovery systems:

  • Some form of Did You Mean that guides searchers toward choosing good search terms. We know they don’t really want to ask us for help. So how do we help them help themselves?
  • Customizability/context sensitivity of automated assistance
    • choice of auto-complete and Did You Mean source(s) – local indexes, recent/popular searches by others, one-offs defined by the library (libguides, webpages, local subjects, local authors, ??)
    • Interaction of auto-complete and placard-type features with user profile and preferences (e.g. demographics, field of study, enrolled courses in an LMS)
  • Fuzzy search, stemming, query expansion, tolerant search
  • And my biggest wish of all (probably a good topic for a future post) TRUE MODULARITY. Portability of all the hard work we put into our discovery system and/or ILS. If we’ve put 6 months of sweat into getting a discovery system to behave in ways that are useful to our community, we need to be able to take that work with us if we change ILSes.