September/October 1998

Copyright 1998 EDUCAUSE. From Educom Review, Volume 33, Number 5, p. 22-29. Permission to copy or disseminate all or part of this material is granted provided that the copies are not made or distributed for commercial advantage, the EDUCAUSE copyright and its date appear, and notice is given that copying is by permission of EDUCAUSE. To disseminate otherwise, or to republish, requires written permission. For further information, contact Jim Roche at EDUCAUSE, 4840 Pearl East Circle, Suite 302E, Boulder, CO 80301 USA; 303-939-0308; e-mail: jroche@educause.edu





Surfing with a Purpose
Process and strategy put to the test on the Internet


by Keith Gresham

"Since we have no choice but to be swept along by [this] vast technological surge, we might as well learn to surf."
--Michael Soule, in Conservation for the 21st Century, David Western and Mary C. Pearl, New York: Oxford University Press, 1989.

Searching the Internet, I've come to conclude, is a hybrid skill, part art and part science. For the searcher perched high upon Soule's metaphorical wave, success or failure depends largely upon a combination of finesse, experience, technique, aptitude, inventiveness, experimentation and good luck, all joined to an ability to proceed in a clear, methodical direction. It is, simply, all about learning to surf with a purpose.

Academic librarians who teach database searching and information retrieval have long focused their students' attention on strategies that emphasize processes rather than tool-based specifics. Individual search tools may come and go, students are taught, but the basic processes or concepts of information retrieval remain constant over time. The ability to understand and apply these processes is particularly critical when the search tools in question are Web-based search engines and related Internet tools.


CONCEPTUALIZING THE PROCESS: THE STRATEGY

The general disorganization of the Internet makes it far more likely that users of search engines or other Internet search tools will settle for what they can get rather than what they need. To improve the odds of locating what is needed, a clear, definitive plan of attack is required. Having a plan, or a search strategy, makes all the difference in whether an Internet search ends in success or frustration. The foundation of a successful search strategy is built upon an understanding and application of the online search process. This process consists of six basic steps:

1. Determine the type of information you need.
The Internet, and particularly the Web, provides access to vast amounts of information from a variety of sources. Much of this information is useful, but much more is of questionable value or accuracy. Also recognize that much of the news and information published by newspapers, magazines and research journals is not made freely available on the Web. Try to determine what type of information (news articles, government reports, industry statistics, or professional papers, for example) best meets your need, and then try to pinpoint those specific businesses, organizations, or governmental bodies that might best and most reliably provide the type of information you hope to obtain.

2. Create a list of potential search terms.
Search terms are those specific words or phrases that best describe the major concepts of your topic. These terms may also include particular keywords that describe the type of information you identified as needing in the first step of the process. Identify both synonyms and broader and narrower terms for each concept. The full-text nature of Internet sources and the lack of controlled vocabulary to describe these sources require that you possess the widest possible array of potential search terms.

3. Choose specific search tools that will retrieve the type of information you want.

As with many complex tasks, success in searching the Internet depends upon selecting the right tool for the job. Using a tool like AltaVista to locate a list of museums within a specific state is like using pliers to remove a screw from wood. You may eventually succeed in the task, but with substantially more effort and frustration than was probably necessary. When searching for information on the Internet, there are three main types of Web-based search tools to consider.

General search engines are search tools that retrieve a wide variety of Internet sources that happen to contain your search words anywhere in the full text of the source. Searches using these tools typically retrieve many thousands of sources, not all of them relevant to the topic. AltaVista and HotBot are two of the most frequently used general search engines.

Specialty search engines are similar to general search engines but will only retrieve specific types of Internet sources. Some specialty search engines only retrieve news articles freely available on the Web, others search only medical or business-oriented sites, while still others search for relevant Usenet newsgroup messages and Web-based discussion forums. NewsBot, DejaNews and ForumOne are all examples of specialty search engines.

Subject directories are search tools that organize selected Internet sites into broad-based hierarchical subject categories. Search results vary depending upon whether or not your search terms appear in the category headings or the title and description of a particular Internet site. Subject directories generally do not search the full text of Internet sources. Yahoo!, frequently referred to incorrectly as a search engine, is probably the best known Internet subject directory.

4. Construct a search statement and conduct your search.
A search statement, the words you actually type into the computer when searching, consists of selected search terms for each topic concept identified earlier. Depending on the search tool you have chosen, you may have an option to search by individual key words, by exact phrases, or by Boolean search expressions. Be as specific as possible in constructing the statement, and take advantage of any advanced features that allow limiting by date, language or type of site.

5. Evaluate your results.
Even the most carefully constructed search statement will often retrieve irrelevant results or result lists with thousands of hits. If you don't locate useful information within the first 15 retrieved sites, you should consider moving on to the next step of the process. If the search tool does retrieve information relevant to your topic, ask yourself if the information meets your specific needs, whether the information seems current enough to be useful, and whether the producer of the information or Web site is a reliable source.

6. Revise your search statement if necessary (and it probably will be).

Construct new search statements using different combinations of search words or phrases. Each new combination will retrieve slightly different results. Add or remove relevant limiting options to see if any new results emerge. If after five revisions you still aren't finding useful information, consider switching to a different search tool and repeating the search process.
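
Steps 2, 4 and 6 lend themselves to a small illustration. The Python sketch below is not tied to any particular search engine. It assumes a common Boolean convention: synonyms for one concept are joined with OR, separate concepts are joined with AND, and multiword phrases are enclosed in quotation marks. The function names and the sample term lists are hypothetical.

```python
from itertools import product


def quote(term):
    """Wrap multiword phrases in quotation marks so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def build_statement(concepts):
    """Join synonyms with OR inside each concept, then join concepts with AND."""
    groups = []
    for synonyms in concepts:
        joined = " OR ".join(quote(t) for t in synonyms)
        # Parentheses are needed only when a concept has more than one term.
        groups.append(f"({joined})" if len(synonyms) > 1 else joined)
    return " AND ".join(groups)


def revisions(concepts):
    """Yield one narrower statement per combination of single terms (step 6)."""
    for combo in product(*concepts):
        yield " AND ".join(quote(t) for t in combo)


# Hypothetical term lists for two concepts, of the kind gathered in step 2.
concepts = [
    ["workplace injuries", "occupational injuries"],
    ["statistics", "data"],
]

print(build_statement(concepts))
for statement in revisions(concepts):
    print(statement)
```

The broad statement casts the widest net. The revisions trade breadth for precision, one combination at a time, which mirrors the advice to try different combinations of words and phrases before switching tools.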



THEORY INTO PRACTICE: 5 TEST CASES

Developing strategic processes and conceptual frameworks is all well and good in the abstract, but will they float when caught in the shifting tides that characterize the Web? Using a general research topic supplied by the editor of Educom Review, I set out to discover the answer. What follows are five focused test cases that demonstrate how the search process, carried out using specific Internet search tools, can retrieve recent information on work-related or profession-related accidents and injuries.


Test Case One

1. Type of information: Extremely recent news articles and Web sites about on-the-job injuries and accidents.

2. Potential search terms: Employment, workplace, work-related, injuries, accidents, strains, stresses.

3. Search tools:
AltaVista (http://www.altavista.digital.com), HotBot (http://www.hotbot.com), Northern Light (http://www.northernlight.com). These three general search engines are reported to possess the largest databases of all the available Web-based search tools. As a result, retrieval lists tend to be quite large unless specific limiting techniques are used. A full range of basic and advanced searching options is available with all three tools. In addition to searching for information on Web pages, Northern Light also retrieves full-text newspaper and magazine articles from its database and makes them available for online purchase.

4. Search statements:
"workplace injuries" limited to sites created between June 1 and June 25, 1998

5. Evaluate results:
AltaVista, long the general search engine with the widest name recognition, retrieved 10 items with this search statement. With the exception of a news article from PCWorld Online on pending legislation to reduce the incidence of carpal tunnel syndrome in the nation's workforce, the retrieved sites were all advertising a specific product or service. Four of the retrieved items were press releases and advertisements (disguised as news stories) from a workers' compensation insurance company, and three others were from law firms specializing in personal injury lawsuits. The two remaining items were advertisements from a manufacturer of non-skid safety coatings and a company selling a product that uses electric current to treat repetitive stress injuries.

HotBot, considered by many searchers to contain the largest database and be the most user-friendly of the search engines, retrieved 466 hits. Although the limit-by-date feature did not work properly, the first 10 hits to appear on the retrieval list were largely relevant. The list included the National Coalition on Ergonomics (an anti-ergonomics regulation lobbying group), the Occupational Safety and Health Administration, the Bureau of Labor Statistics, two foreign government labor regulatory agencies, two law firms, and the non-skid safety coating manufacturer.

Northern Light, a relative newcomer to the search engine game, retrieved 20 citations and abstracts to news articles from its fee-based file. The articles, ranging in price from $1-$4 for the full text (citations and abstracts are free), were from such sources as the Detroit News, the Berkeley Journal of Employment and Labor Law, the St. Petersburg Times, the Greensboro News & Record, and the Business Wire news service. Removing the limit-by-date request in the Northern Light search expanded the results to also include traditional, freely available Web pages.

6. Revise search:
"workplace injuries" AND "repetitive motion"
To reduce the large number of results in HotBot to a more manageable size, the search was revised to include search terms on a specific occupational injury. This revision cut the retrieval size down to 25 hits and revealed the existence of CTDNews, an online newsletter about workplace repetitive stress injuries. Other results included articles from the San Francisco Chronicle, the Business Journal of Sacramento, the Washington Post, and the Associated Press wire service.
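
The effect of a date limit, and of adding a second phrase with AND, can be sketched in a few lines of Python. The page records below are invented for illustration and are not the results actually retrieved. The matching rule, a case-insensitive substring test, is an assumption rather than a description of how HotBot or AltaVista works.

```python
from datetime import date

# Hypothetical page records: (title, date created, text). Not real search results.
PAGES = [
    ("Page A", date(1998, 6, 3), "Workplace injuries linked to repetitive motion."),
    ("Page B", date(1998, 6, 20), "New rules aim to reduce workplace injuries."),
    ("Page C", date(1998, 2, 11), "Repetitive motion and workplace injuries."),
]


def search(pages, phrases, start=None, end=None):
    """Return titles of pages containing every phrase, optionally within a date range."""
    hits = []
    for title, created, text in pages:
        if start and created < start:
            continue
        if end and created > end:
            continue
        lowered = text.lower()
        # AND logic: every phrase must appear somewhere in the text.
        if all(p.lower() in lowered for p in phrases):
            hits.append(title)
    return hits


june = dict(start=date(1998, 6, 1), end=date(1998, 6, 25))
first = search(PAGES, ["workplace injuries"], **june)
revised = search(PAGES, ["workplace injuries", "repetitive motion"], **june)
print(first)    # pages matching the original statement
print(revised)  # the added phrase narrows the list
```

Each added AND phrase can only shrink the list or leave it unchanged, which is why this kind of revision is a dependable way to cut a large retrieval set down to size.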


Test Case Two

1. Type of information: News articles written in the past week.

2. Potential search terms: Workplace accidents, workplace injuries, occupational injuries.

3. Search tools:
NewsBot (http://www.newsbot.com), NewsIndex (http://www.newsindex.com), TotalNews (http://www.totalnews.com). These specialty search engines are similar to general search engines, but they search specifically for news articles made freely available on the Web by online newspapers, broadcast news sites, and Web-based news services.

4. Search statement:
"workplace injuries" OR "occupational injuries"

5. Evaluate results:
When limited to articles from the most recent seven days, NewsBot returned a single article from Reuters news service about a multimillion-dollar lawsuit against Digital Equipment Corporation over workplace injuries allegedly caused by the use of its computer keyboards.

With its less sophisticated search engine, NewsIndex retrieved more articles than NewsBot, but many of these were irrelevant to the topic. Useful articles that did emerge included a Los Angeles Times News Service article on workplace violence in post offices, an Associated Press wire story on high rates of hand and arm injuries in the telecommunications industry, and a U.S. Newswire press release about a recent nominee to the federal Occupational Safety and Health Review Commission.

The TotalNews search engine does not allow the use of Boolean connectors with phrases, so each phrase in the search statement was tried individually. Neither phrase produced any results.

6. Revise search:
"workplace accidents"
Entering the revised statement in TotalNews proved successful and retrieved an Associated Press story about on-the-job accident rates for substance abusers and a Reuters story on the causes of workplace accidents in France.


Test Case Three

1. Type of information: Government statistics on work-related accidents and injuries.

2. Potential search terms: Statistics, data, workplace, occupational, accidents, injuries.

3. Search tools:
AltaVista (http://www.altavista.digital.com), HotBot (http://www.hotbot.com), Northern Light (http://www.northernlight.com)

4. Search statement:
occupational AND injuries AND statistics

5. Evaluate results:
With this basic search statement, all three search engines generally returned far too many results (8,000+) to be very useful. However, at the top of HotBot's retrieval list was the Bureau of Labor Statistics' Safety and Health Statistics Home Page, a wealth of statistical data compiled by the federal government. This site included numerous statistical tables and data sets on annual fatal and non-fatal occupational injuries and accidents broken down by specific industry and by selected characteristics.

6. Revise search:
To narrow the search to just those sites created or produced by a governmental body, the search statement in all three search engines was revised to limit search results to those whose addresses end in the domain ".gov". In all three cases, the revised search narrowed the retrieval list down to under 2,000 hits. The results toward the top of the lists from both HotBot and Northern Light revealed many relevant sites, including major workplace statistics sites from OSHA, the Centers for Disease Control and Prevention, and the states of Florida, Alaska and Wisconsin. AltaVista's results were less relevant.
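
Each search engine has its own way of expressing a domain limit, so the exact syntax is not reproduced here. The underlying test is simple, though, and can be sketched in Python by applying it to a list of addresses after retrieval. The addresses are illustrative only, and the .com entry is a made-up placeholder.

```python
from urllib.parse import urlparse


def is_gov(url):
    """True when the URL's host name ends in the .gov top-level domain."""
    host = urlparse(url).hostname or ""
    return host.lower().endswith(".gov")


# Illustrative addresses only; the .com entry is a made-up placeholder.
urls = [
    "http://www.bls.gov/",
    "http://www.osha.gov/",
    "http://www.example.com/safety.gov.html",
]

print([u for u in urls if is_gov(u)])
```

Testing the host name rather than the whole address matters: the third entry contains ".gov" in its file name but is not a government site, and it is correctly excluded.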


Test Case Four

1. Type of information: Reports and working papers from occupational safety organizations.

2. Potential search terms: Worker, workplace, safety, accidents, organizations, associations.

3. Search tools:
Yahoo! (http://www.yahoo.com), LookSmart (http://www.looksmart.com). With their topical category arrangements, Internet subject directories like Yahoo! and LookSmart can be extremely useful for quickly locating the Web sites of research, professional and nonprofit organizations and associations involved in similar types of work. These Web sites, in turn, frequently provide quick access to the reports, working papers, essays and press releases of the particular organizations.

4. Search statement:
workplace safety organizations

5. Evaluate results:
Entering the search statement in Yahoo! revealed the topical category "Health:Workplace:Organizations". Included in this category were links to the useful home pages of the Occupational Safety and Health Administration, the American Industrial Hygiene Association, and the Institution of Occupational Safety and Health, a leading occupational health organization in Europe.

In LookSmart, the search statement retrieved many of the same sites under the category "Workplace Issues:Health & Safety". Links to the home pages of some different organizations did emerge, however, including the National Institute for Occupational Safety and Health.

6. Revise search:
Much like browsing the book shelves of a library, the hierarchical arrangement of subject categories in Yahoo! and LookSmart permits the easy discovery of other topically related categories. Moving up and down these topical chains in Yahoo! led to the category of "Health:Medicine:Occupational", which contained links to the Web sites of such relevant organizations as the Occupational Injury Prevention Rehabilitation Society and the Occupational and Industrial Orthopaedic Center, a research center affiliated with the NYU Medical Center.

Navigating through the hierarchy in Yahoo! also revealed the category of "Health:Workplace:Indices". Listed in this category were useful subject guides that provided links to all of the major Web sites on the topic. Included in the list were the Directory of Internet Sites in Occupational and Environmental Health, Occupational Safety and Health Resources Net, and the Institute of Occupational Safety Engineering's OSHWeb.
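
The "moving up and down" described above can be pictured as simple operations on category paths. The Python sketch below uses only the three Yahoo! categories named in this test case, written in the same colon-separated form. The real directory contains many more categories, and the function names are hypothetical.

```python
# Category paths quoted in this test case, using Yahoo!'s colon-separated form.
CATEGORIES = [
    "Health:Workplace:Organizations",
    "Health:Workplace:Indices",
    "Health:Medicine:Occupational",
]


def parent(path):
    """Return the path one level up the hierarchy ('' at the top)."""
    return path.rpartition(":")[0]


def siblings(path, categories):
    """Categories that share a parent with the given path, excluding itself."""
    return [c for c in categories if c != path and parent(c) == parent(path)]


def under(prefix, categories):
    """All categories that sit anywhere beneath the given prefix."""
    return [c for c in categories if c.startswith(prefix + ":")]


print(siblings("Health:Workplace:Organizations", CATEGORIES))
print(under("Health", CATEGORIES))
```

Stepping up one level and looking sideways is what surfaced the "Indices" category. Stepping up two levels and back down is what surfaced "Health:Medicine:Occupational".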


Test Case Five

1. Type of information: Newsgroup and discussion forum messages.

2. Potential search terms: Carpal tunnel syndrome, cts, repetitive stress injuries.

3. Search tools:
DejaNews (http://www.dejanews.com), ForumOne (http://www.forumone.com). These specialty search engines locate messages posted on thousands of Usenet newsgroups and Web-based discussion forums. DejaNews searches Usenet newsgroups, and ForumOne searches discussion forums.

4. Search statement:
"carpal tunnel syndrome" OR cts

5. Evaluate results:
ForumOne retrieved 16 discussion forum messages, all of which were relevant to the topic. Discussion forums included Cafe Utne, Sympatico, iVillage.com, and ParentsPlace.com. DejaNews retrieved 130,000 messages from such Usenet newsgroups as misc.health.therapy.occupational, alt.support.chronic-pain, and alt.guitar. To reduce the number of messages to a more manageable size, add search terms or limit the search by date.

6. Revise search:
("carpal tunnel syndrome" OR cts) AND prevention
In DejaNews, this revised search statement reduced the retrieval set to a list of 200 messages sorted by date and revealed the existence of an additional relevant newsgroup, sci.med.occupational. Although many of the newly retrieved messages were relevant to the topic, some irrelevant messages still remained on the list.
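
The parentheses in the revised statement control the order in which the Boolean operators are applied: the OR is resolved first, and its result is then combined with "prevention" by AND. The Python sketch below shows that evaluation order on a nested expression. The tuple representation, the substring matching rule and the sample messages are all assumptions made for illustration. They do not describe how DejaNews works, and the messages are not actual newsgroup postings.

```python
def matches(expr, text):
    """Evaluate a nested Boolean expression against a piece of text.

    An expression is either a string (a word or phrase to find) or a tuple
    of the form ("AND", e1, e2, ...) or ("OR", e1, e2, ...).
    """
    if isinstance(expr, str):
        return expr.lower() in text.lower()
    operator, *operands = expr
    results = (matches(e, text) for e in operands)
    if operator == "AND":
        return all(results)
    if operator == "OR":
        return any(results)
    raise ValueError(f"unknown operator: {operator}")


# ("carpal tunnel syndrome" OR cts) AND prevention
revised = ("AND", ("OR", "carpal tunnel syndrome", "cts"), "prevention")

# Invented message texts, not actual newsgroup postings.
messages = [
    "Tips on prevention of carpal tunnel syndrome for typists",
    "My CTS flared up again after a long gig",
    "Prevention of back strain when lifting",
]

for m in messages:
    print(matches(revised, m), "-", m)
```

The substring test is deliberately crude: it would also find "cts" inside a longer word such as "facts". Short abbreviations are one plausible reason that some irrelevant messages can survive even a well-constructed statement.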



EXPANDING SEARCH HORIZONS

As the test cases demonstrate, search engines and subject directories can be used to locate otherwise difficult-to-obtain information with minimal frustration. All that is needed is a strategic plan and a willingness to experiment with various search tools using a variety of techniques. The Internet, of course, is not the only game in town, or even necessarily the best. Many college and university libraries subscribe to commercial Web-based bibliographic, full-text and numeric databases that provide access to far different sources of information than can be found using AltaVista, HotBot or Yahoo!. Databases available through online systems such as FirstSearch, Dialog, Infotrac SearchBank and Ovid should not be overlooked as potential sources of valuable information and can also be searched with minimal difficulty if a well-designed, process-oriented research strategy is followed.

Keith Gresham is assistant professor and instruction librarian at the University of Colorado at Boulder Libraries. gresham@spot.colorado.edu

