Once upon a time, we all looked up addresses and phone numbers in a phone book. We probably all still have them.
You can still use print to find a wide variety of information, but nowadays, you most likely look things up online more often. What happens between you typing your search query and the results appearing on the screen?
Online searching gives you two options: a search engine or a database (such as an online library catalog). We probably all use search engines more often. In fact, apart from library web sites, most of us will never encounter an online database. As a result, database searching is less intuitive. But databases will find information that is not available online.
I have explained the differences between using a search engine and using a database in previous posts. But what do you get back when you perform a search? Where does the information come from?
When you look something up in print, you know without giving it much thought that someone wrote or compiled it. It’s hard to miss the names of the author, editor, publisher, or other people responsible. We can more easily forget the human effort behind online search results.
What does a search engine do?
Type a keyword or phrase in your browser’s search bar. Almost instantly, ten or so links appear on the screen. Lots more pages of links follow. The top of the screen will tell you that your search returned thousands or millions of search results. What determines what you see first?
Search engines rely on algorithms. Google defines an algorithm as “the computer processes and formulas that take your questions and turn them into answers.”
Algorithms combine numerous clues within a web page. They enable the computer to make a realistic guess about what you want.
Google’s algorithm takes more than 200 factors into account. Each has a different weight. The math required to program the algorithm makes most people’s heads spin.
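To make the idea of weighted factors a little more concrete, here is a toy sketch in Python. It is not Google's formula, which is not public in this form. The three factor names, their weights, and the two sample pages are all invented for illustration. The only point is that each clue gets multiplied by its weight and the results are added up.

```python
# Hypothetical factors and weights, invented for illustration only.
WEIGHTS = {
    "keyword_in_title": 3.0,
    "keyword_in_body": 1.0,
    "inbound_links": 2.0,
}

def score(page_signals: dict) -> float:
    """Combine a page's signals into one number with a weighted sum."""
    return sum(WEIGHTS[name] * page_signals.get(name, 0.0) for name in WEIGHTS)

# Made-up measurements for two made-up pages.
pages = {
    "page_a": {"keyword_in_title": 1, "keyword_in_body": 4, "inbound_links": 2},
    "page_b": {"keyword_in_title": 0, "keyword_in_body": 9, "inbound_links": 0},
}

# Highest score first.
ranked = sorted(pages, key=lambda name: score(pages[name]), reverse=True)
print(ranked)  # ['page_a', 'page_b']
```

Notice that page_b uses the keyword more often, yet page_a ranks first because the other factors carry more weight.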
The algorithm begins with the keyword you typed in the search bar. Primitive search engines treated it like a string of characters. If they found the search term several times on a page, they concluded that the page was relevant. Nowadays, they can recognize synonyms and multiple phrases with similar meanings.
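The difference between the primitive approach and the newer one can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the sample sentence and the tiny synonym table are made up, and real search engines recognize related meanings in far more sophisticated ways.

```python
import re

def count_term(text: str, term: str) -> int:
    """Primitive relevance: count whole-word matches of one term, ignoring case."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return words.count(term.lower())

# Hypothetical synonym table, invented for this example.
SYNONYMS = {"car": {"car", "automobile"}}

def count_with_synonyms(text: str, term: str) -> int:
    """Count the term plus any synonyms listed for it."""
    variants = SYNONYMS.get(term.lower(), {term.lower()})
    return sum(count_term(text, variant) for variant in variants)

page = "Cheap car rentals. Rent a car today. Our automobile fleet is new."
print(count_term(page, "car"))           # 2
print(count_with_synonyms(page, "car"))  # 3
```

The first function sees only the literal string. The second also counts "automobile" as a match for "car."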
So with a search engine, people write the algorithms. People constantly analyze searchers’ keystrokes to assess how well the algorithm works. People constantly monitor ways webmasters try to game the system to appear at the top of the results. But all the human involvement amounts to nothing more than sophisticated guesswork.
Something called a spider visits (crawls) every public server it can reach and gathers bits of information from every page. It places these bits in an index. The search engine itself doesn’t look at the pages when you search. It applies the algorithm to the index to make its guess about what you want. Then it gives you its millions of links.
From there, you follow a link to the actual page you want to read.
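The kind of index described above can be sketched as a lookup table from words to the pages that contain them. This is a minimal illustration, not how any particular search engine is built. The three "crawled" pages and their addresses are made up, and a real index stores much more than single words.

```python
import re
from collections import defaultdict

# Stand-ins for pages a spider has already visited; addresses and text are invented.
PAGES = {
    "example.com/cats": "Cats sleep most of the day",
    "example.com/dogs": "Dogs sleep less than cats",
    "example.com/fish": "Fish never close their eyes",
}

def build_index(pages):
    """Map each word to the set of page addresses where it appears."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word in the query."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(PAGES)
print(sorted(search(index, "cats sleep")))
# ['example.com/cats', 'example.com/dogs']
```

The search function never opens a page. It consults only the index, and it hands back addresses, which is why you still have to follow a link to read anything.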
What does a database do?
Online library catalogs and other databases may at first look the same as the familiar search engine. But if you enter your keyword search the same way you’d use a search engine, you might not get any results at all. Yet if you know how to use it, the database will return much more precise results.
When you look at a specific item in your search results, you will see a description of a book, article, or some other item, but not the item itself.
My mother was a librarian. She did not work outside the home after I was born, but she volunteered as librarian at our church and whatever elementary school I or my siblings attended. I often saw her sitting at the kitchen table with a stack of books and a typewriter making catalog cards.
It embarrasses me now to admit it, but it never occurred to me until after I moved to Chicago to work on my doctoral dissertation that someone actually wrote the descriptions typed on the catalog cards. (Or nowadays, on the screen.)
Each description requires original research. Most of the time, the librarian performs the research using five basic sources:
- the title page of the book (or other designated “chief source of information” for other media)
- the cataloging rules
- the classification scheme
- the subject authority file
- the name authority file
Occasionally a cataloger must consult any number of other resources to verify some of the required information.
A cataloging description presents some of the information in two different ways. First, it directly transcribes the title, author, etc. from the item. Second, it uses controlled vocabulary from the name and subject authority files. You can easily distinguish the controlled vocabulary, which appears as a hotlink, from transcription, which does not.
From the name authority file, the cataloger supplies a unique form of name to bring together everything by or about a person. It doesn’t matter how many ways the name might actually appear on a title page. Appearing with or without a middle initial is just one possible variant.
It doesn’t matter if a person’s name changes. Elizabeth Barrett, for example, published poetry before she married Robert Browning. Magic in the background of your search will bring all her work before and after she became Elizabeth Barrett Browning under one heading.
Governments, businesses, and other entities all have names. Titles are names, too, by the way. These names can show more variations than personal names. Controlled vocabulary puts all variants together.
It also doesn’t matter how many different people have the same name. John Adams the President, John Adams the modern composer, and your high school classmate John Adams who records folk songs each have separate authorized headings.
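Here is a rough Python sketch of how an authorized heading does this work. Everything in it is sample data invented for illustration: the titles are placeholders, and the headings are simplified stand-ins, not the forms found in any real authority file. Each record keeps the name as transcribed from the item alongside the heading the cataloger chose.

```python
# Sample records, invented for illustration. "name_on_item" is the transcription;
# "heading" is the controlled form a cataloger would pick from the authority file.
RECORDS = [
    {"title": "Sample early poems", "name_on_item": "Elizabeth Barrett",
     "heading": "Browning, Elizabeth Barrett"},
    {"title": "Sample later poems", "name_on_item": "Elizabeth Barrett Browning",
     "heading": "Browning, Elizabeth Barrett"},
    {"title": "Sample political essays", "name_on_item": "John Adams",
     "heading": "Adams, John (president)"},
    {"title": "Sample opera score", "name_on_item": "John Adams",
     "heading": "Adams, John (composer)"},
]

def works_under(records, heading):
    """Return the titles of every record filed under one authorized heading."""
    return [r["title"] for r in records if r["heading"] == heading]

def works_with_name_on_item(records, name):
    """Return the titles whose transcribed name matches exactly."""
    return [r["title"] for r in records if r["name_on_item"] == name]

print(works_under(RECORDS, "Browning, Elizabeth Barrett"))
# ['Sample early poems', 'Sample later poems']
print(works_with_name_on_item(RECORDS, "John Adams"))
# ['Sample political essays', 'Sample opera score']
```

Searching by heading gathers both forms of the poet's name under one entry. Searching by the transcribed name alone mixes the two John Adamses together, which is exactly the confusion separate headings prevent.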
An online library catalog or other database puts all these pieces of description into “fields.” It provides labels for them like title, author, personal name, subject, ISBN, etc. You can search each field separately or combine them.
You can do a keyword search in a database. That may be the easiest way to find that first item that exactly suits your needs. Then you can use the links to find everything in the database associated with a name or subject.
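That workflow can be sketched with a toy catalog in Python. The three records, the field names, and the matching rule (a simple case-insensitive "contains" test) are all assumptions made for this example. Real catalogs have many more fields and much stricter rules.

```python
# A made-up catalog; every record has the same labeled fields.
CATALOG = [
    {"title": "Sample Guide to Bees", "author": "Doe, Jane", "subject": "Beekeeping"},
    {"title": "Honey at Home", "author": "Roe, Richard", "subject": "Beekeeping"},
    {"title": "Sample Guide to Knots", "author": "Doe, Jane", "subject": "Knots and splices"},
]

def field_search(catalog, **criteria):
    """Return records in which every named field contains the given text."""
    return [
        record for record in catalog
        if all(text.lower() in record[field].lower() for field, text in criteria.items())
    ]

def keyword_search(catalog, keyword):
    """Return records in which any field contains the keyword."""
    kw = keyword.lower()
    return [r for r in catalog if any(kw in value.lower() for value in r.values())]

# Step 1: a keyword search finds one good item.
first_hit = keyword_search(CATALOG, "honey")[0]
print(first_hit["title"])  # Honey at Home

# Step 2: follow its subject "link" to find everything on the same subject.
related = field_search(CATALOG, subject=first_hit["subject"])
print([r["title"] for r in related])  # ['Sample Guide to Bees', 'Honey at Home']

# Fields can also be combined.
print([r["title"] for r in field_search(CATALOG, author="Doe", subject="Knots")])
# ['Sample Guide to Knots']
```

The keyword search finds only the record that mentions honey. The subject field then leads to a second book that never uses that word at all.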
So a search engine uses a complex mathematical formula to look at a gigantic index. Aside from the formula, it works without human intervention. It returns a list of actual pages, which you may or may not consider relevant.
A database comprises fields of information entered by a human being who has carefully examined an item. Your search will not give you the item (unless it happens to be available online).
Instead, you will see a detailed and accurate description. You use the description to determine how well the item will meet your needs. And then you can use it to find descriptions of other equally relevant items.