Surfing with a Purpose
Process and strategy put to the test on the Internet
by Keith Gresham
"Since we have no choice but to be swept along by [this] vast technological
surge, we might as well learn to surf."
--Michael Soule, in Conservation for the 21st Century, David Western and
Mary C. Pearl, New York: Oxford University Press, 1989.
Searching the Internet, I've concluded, is a hybrid skill, part art and part
science. For the searcher perched high upon Soule's metaphorical wave, success or
failure depends largely upon a combination of finesse, experience, technique,
aptitude, inventiveness, experimentation and good luck, together with
an ability to proceed in a clear, methodical direction. It is, simply,
all about learning to surf with a purpose.
Academic librarians who teach database searching and information retrieval have
long focused their students' attention on strategies that emphasize processes
rather than tool-based specifics. Individual search tools may come and
go, students are taught, but the basic processes or concepts of information
retrieval remain constant over time. The ability to understand and apply
these processes is particularly critical when the search tools in question
are Web-based search engines and related Internet tools.
CONCEPTUALIZING THE PROCESS: THE STRATEGY
The general disorganization of the Internet makes it far more likely
that users of search engines or other Internet search tools will settle
for what they can get rather than what they need. To improve the odds
of locating what is needed, a clear, definitive plan of attack is required.
Having a plan, or a search strategy, makes all the difference in whether
an Internet search ends in success or frustration. The foundation of a
successful search strategy is an understanding and application
of the online search process. This process consists of six basic steps:
1. Determine the type of information you need.
The Internet, and particularly the Web, provides access to vast amounts
of information from a variety of sources. Much of this information is
useful, but much more is of questionable value or accuracy. Also recognize
that much of the news and information published by newspapers, magazines
and research journals is not made freely available on the Web. Try to
determine what type of information (news articles, government reports,
industry statistics, or professional papers, for example) best meets your
need, and then try to pinpoint those specific businesses, organizations,
or governmental bodies that might best and most reliably provide the type
of information you hope to obtain.
2. Create a list of potential search terms.
Search terms are those specific words or phrases that best describe the
major concepts of your topic. These terms may also include particular
keywords that describe the type of information you identified
in the first step of the process. Identify synonyms as well as broader and
narrower terms for each concept. The full-text nature of Internet sources
and the lack of controlled vocabulary to describe these sources require
that you possess the widest possible array of potential search terms.
3. Choose specific search tools that will retrieve the type of information you want.
As with many complex tasks, success in searching the Internet depends
upon selecting the right tool for the right job. Using a tool like AltaVista
to locate a list of museums within a specific state is like using pliers
to remove a screw from wood. You may eventually succeed in the task, but
with substantially more effort and frustration than was probably necessary.
When searching for information on the Internet, there are three main types
of Web-based search tools to consider.
General search engines are search tools that retrieve a wide variety of
Internet sources that happen to contain your search words anywhere in the
full text of the source. Searches using these tools typically retrieve many
thousands of sources, not all of them relevant to the topic. AltaVista and
HotBot are two of the most frequently used general search engines.
Specialty search engines are similar to general search engines but will only
retrieve specific types of Internet sources. Some specialty search engines
only retrieve news articles freely available on the Web, others search only
medical or business-oriented sites, while still others search for relevant
Usenet newsgroup messages and Web-based discussion forums. NewsBot, DejaNews
and ForumOne are all examples of specialty search engines.
Subject directories are search tools that organize selected Internet sites
into broad-based hierarchical subject categories. Search results vary
depending upon whether or not your search terms appear in the category
headings or the title and description of a particular Internet site. Subject
directories generally do not search the full text of Internet sources.
Yahoo!, frequently referred to incorrectly as a search engine, is probably
the best known Internet subject directory.
4. Construct a search statement and conduct your search.
A search statement, the words you actually type into the computer
when searching, consists of selected search terms for each topic concept
identified earlier. Depending on the search tool you have chosen, you
may have an option to search by individual keywords, by exact phrases,
or by Boolean search expressions. Be as specific as possible in constructing
the statement, and take advantage of any advanced features that allow
limiting by date, language or type of site.
5. Evaluate your results.
Even the most carefully constructed search statement will often retrieve
irrelevant results or result lists with thousands of hits. If you don't
locate useful information within the first 15 retrieved sites, you should
consider moving on to the next step of the process. If the search tool
does retrieve information relevant to your topic, ask yourself whether the
information meets your specific needs, whether the information seems current
enough to be useful, and whether the producer of the information or Web
site is a reliable source.
6. Revise your search statement if necessary (and it probably will be).
Construct new search statements using different combinations of search
words or phrases. Each new combination will retrieve slightly different
results. Add or remove relevant limiting options to see if any new results
emerge. If after five revisions you still aren't finding useful information,
consider switching to a different search tool and repeating the search
process.
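For readers who like to see a process written down as a procedure, the six steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how any search engine works: the function names are invented for the example, and search and is_useful stand in for whatever search tool and relevance judgment the searcher supplies. The limits of 15 results and five revisions come from steps 5 and 6.

```python
from itertools import islice, product

MAX_RESULTS_TO_CHECK = 15  # step 5: look at the first 15 retrieved sites
MAX_REVISIONS = 5          # step 6: after five revisions, switch tools


def quote(term):
    """Wrap multi-word terms in quotes so they are searched as exact phrases."""
    return f'"{term}"' if " " in term else term


def candidate_statements(concepts):
    """Yield one AND-ed statement per combination, taking one term per concept."""
    for combo in product(*concepts):
        yield " AND ".join(quote(term) for term in combo)


def search_with_revisions(concepts, search, is_useful):
    """Try the initial statement plus up to MAX_REVISIONS revised statements.

    `search` and `is_useful` are hypothetical callables supplied by the caller:
    `search(statement)` returns a list of results, and `is_useful(result)`
    returns True for a result judged relevant and reliable.
    """
    attempts = islice(candidate_statements(concepts), 1 + MAX_REVISIONS)
    for statement in attempts:
        first_results = search(statement)[:MAX_RESULTS_TO_CHECK]
        hits = [r for r in first_results if is_useful(r)]
        if hits:
            return statement, hits
    return None, []  # nothing useful: time to try a different search tool


# Step 2: each inner list holds the synonyms for one concept.
concepts = [
    ["workplace injuries", "occupational injuries"],
    ["statistics", "data"],
]
print(next(candidate_statements(concepts)))
# prints: "workplace injuries" AND statistics
```

Each concept contributes one term to every statement, so the list of synonyms gathered in step 2 directly determines how many revisions are available in step 6.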
THEORY INTO PRACTICE: 5 TEST CASES
Strategic processes and conceptual frameworks are all well and good in
the abstract, but will they float when caught in the shifting tides that
characterize the Web? Using a general research topic supplied by the editor
of Educom Review, I embarked on a course of action to find out. What follows
are five focused test cases that demonstrate how the search process, carried
out using specific Internet search tools, can retrieve recent information
on work-related or profession-related accidents and injuries.
Test Case One
1. Type of information:
Extremely recent news articles and Web sites about on-the-job injuries
and accidents.
2. Potential search terms:
Employment, workplace, work-related, injuries, accidents, strains, stresses.
3. Search tools:
AltaVista (http://www.altavista.digital.com)
HotBot (http://www.hotbot.com)
Northern Light (http://www.northernlight.com)
These three general search engines are reported to possess the largest
databases of all the available Web-based search tools. As a result, retrieval
lists tend to be quite large unless specific limiting techniques are used.
A full range of basic and advanced searching options is available with
all three tools. In addition to searching for information on Web pages,
Northern Light also retrieves full-text newspaper and magazine articles
from its database and makes them available for online purchase.
4. Search statement:
"workplace injuries" limited to sites created between June 1 and June
25, 1998
5. Evaluate results:
AltaVista, long the general search engine with the widest name recognition,
retrieved
10 items with this search statement. With the exception of a news article
from PCWorld Online on pending legislation to reduce the incidence of
carpal tunnel syndrome in the nation's workforce, the retrieved sites
were all advertising a specific product or service. Four of the retrieved
items were press releases and advertisements (disguised as news stories)
from a workers' compensation insurance company, and three others were from
law firms specializing in personal injury lawsuits. The two remaining
items were advertisements from a manufacturer of non-skid safety coatings
and a company selling a product that uses electric current to treat repetitive
stress injuries.
HotBot, considered by many searchers to contain the largest database and
be the most user-friendly of the search engines, retrieved 466 hits. Although
the limit-by-date feature did not work properly, the first 10 hits to
appear on the retrieval list were largely relevant. The list included
the National Coalition on Ergonomics (an anti-ergonomics regulation lobbying
group), the Occupational Safety and Health Administration, the Bureau
of Labor Statistics, two foreign government labor regulatory agencies,
two law firms, and the non-skid safety coating manufacturer.
Northern Light, a relative newcomer to the search engine game,
retrieved 20 citations and abstracts to news articles from its fee-based
file. The articles, ranging in price from $1-$4 for the full text (citations
and abstracts are free), were from such sources as the Detroit News,
the Berkeley Journal of Employment and Labor Law, the St. Petersburg
Times, the Greensboro News & Record, and the Business Wire
news service. Removing the limit-by-date request in the Northern Light
search expanded the results to also include traditional, freely available
Web pages.
6. Revise search:
"workplace injuries" AND "repetitive motion"
To reduce the large number of results in HotBot to a more manageable size,
the search was revised to include search terms on a specific occupational
injury. This revision cut the retrieval size down to 25 hits and revealed
the existence of CTDNews, an online newsletter about workplace repetitive
stress injuries. Other results included articles from the San Francisco
Chronicle, the Business Journal of Sacramento, the Washington
Post, and the Associated Press wire service.
Test Case Two
1. Type of information:
News articles written in the past week.
2. Potential search terms:
Workplace accidents, workplace injuries, occupational injuries.
3. Search tools:
NewsBot (http://www.newsbot.com)
NewsIndex (http://www.newsindex.com)
TotalNews (http://www.totalnews.com)
These specialty search engines are similar to general search engines,
but they search specifically for news articles made freely available on
the Web by online newspapers, broadcast news sites, and Web-based news
services.
4. Search statement:
"workplace injuries" OR "occupational injuries"
5. Evaluate results:
When limited to articles from the most recent seven days, NewsBot returned
a single article from Reuters news service about a multimillion-dollar
lawsuit against Digital Equipment Corporation over workplace injuries
allegedly caused by the use of its computer keyboards.
With its less sophisticated search engine, NewsIndex retrieved more articles than
NewsBot, but many of these were irrelevant to the topic. Useful articles
that did emerge included a Los Angeles Times News Service article on workplace
violence in post offices, an Associated Press wire story on high rates
of hand and arm injuries in the telecommunications industry, and a U.S.
Newswire press release about a recent nominee to the federal Occupational
Safety and Health Review Commission.
The TotalNews search engine does not allow the use of Boolean connectors with phrases,
so each phrase in the search statement was tried individually. Neither
phrase produced any results.
6. Revise search:
"workplace accidents"
Entering the revised statement in TotalNews
proved successful and retrieved an Associated Press story about on-the-job
accident rates for substance abusers and a Reuters story on the causes
of workplace accidents in France.
Test Case Three
1. Type of information:
Government statistics on work-related accidents and injuries.
2. Potential search terms:
Statistics, data, workplace, occupational, accidents, injuries.
3. Search tools:
AltaVista (http://www.altavista.digital.com)
HotBot (http://www.hotbot.com)
Northern Light (http://www.northernlight.com)
4. Search statement:
occupational AND injuries AND statistics
5. Evaluate results:
With this basic search statement, all three search engines generally returned
far too many results (8,000+) to be very useful. However, at the top of
HotBot's retrieval list was the Bureau of Labor Statistics' Safety and
Health Statistics Home Page, a wealth of statistical data compiled by
the federal government. This site included numerous statistical tables
and data sets on annual fatal and non-fatal occupational injuries and
accidents broken down by specific industry and by selected characteristics.
6. Revise search:
To narrow the search to just those sites created or produced by a governmental
body, the search statement in all three search engines was revised to
limit search results to those whose addresses end in the domain ".gov".
In all three cases, the revised search narrowed the retrieval list down
to under 2,000 hits. The results toward the top of the lists from both
HotBot and Northern Light revealed many relevant sites, including major
workplace statistics sites from OSHA, the Centers for Disease Control
and Prevention, and the states of Florida, Alaska and Wisconsin. The results
from AltaVista were less relevant.
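The same domain limit can also be applied by hand to any list of result addresses. The short Python sketch below is one way to do it, assuming the results are available as plain URLs; the addresses shown are made-up placeholders, not sites retrieved in this test case, and in_domain is a name chosen for the example.

```python
from urllib.parse import urlsplit


def in_domain(url, suffix=".gov"):
    """Return True when the URL's host name ends with the given domain suffix."""
    host = urlsplit(url).hostname or ""
    return host.endswith(suffix)


# Hypothetical result addresses, used only to illustrate the filter.
results = [
    "http://stats.example.gov/injuries/tables.html",
    "http://www.lawfirm.example.com/workplace-injuries.html",
    "http://labor.example.gov/safety/",
]

government_only = [url for url in results if in_domain(url)]
print(government_only)
```

Checking the host name rather than the whole address keeps a commercial page from slipping through simply because ".gov" appears somewhere in its path.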
Test Case Four
1. Type of information:
Reports and working papers from occupational safety organizations.
2. Potential search terms:
Worker, workplace, safety, accidents, organizations, associations.
3. Search tools:
Yahoo! (http://www.yahoo.com)
LookSmart (http://www.looksmart.com)
With their topical category arrangements, Internet subject directories
like Yahoo! and LookSmart can be extremely useful for quickly locating
the Web sites of research, professional and nonprofit organizations and
associations involved in similar types of work. These Web sites, in turn,
frequently provide quick access to the reports, working papers, essays
and press releases of the particular organizations.
4. Search statement:
workplace safety organizations
5. Evaluate results:
Entering the search statement in Yahoo! revealed the topical category "Health:Workplace:Organizations".
Included in this category were links to the useful home pages of the Occupational
Safety and Health Administration, the American Industrial Hygiene Association,
and the Institution of Occupational Safety and Health, a leading occupational
health organization in Europe.
In LookSmart, the search statement retrieved many of the same sites under the category
"Workplace Issues:Health & Safety". Links to the home pages of some different
organizations did emerge, however, including the National Institute for
Occupational Safety and Health.
6. Revise search:
Much like browsing the book shelves of a library, the hierarchical arrangement
of subject categories in Yahoo! and LookSmart permits the easy discovery
of other topically related categories. Moving up and down these topical
chains in Yahoo! led to the category of "Health:Medicine:Occupational",
which contained links to the Web sites of such relevant organizations
as the Occupational Injury Prevention Rehabilitation Society and the Occupational
and Industrial Orthopaedic Center, a research center affiliated with the
NYU Medical Center.
Navigating through the hierarchy in Yahoo! also revealed the category of "Health:Workplace:Indices".
Listed in this category were useful subject guides that provided links
to all of the major Web sites on the topic. Included in the list were
the Directory of Internet Sites in Occupational and Environmental Health,
Occupational Safety and Health Resources Net, and the Institute of Occupational
Safety Engineering's OSHWeb.
Test Case Five
1. Type of information:
Newsgroup and discussion forum messages.
2. Potential search terms:
Carpal tunnel syndrome, cts, repetitive stress injuries.
3. Search tools:
DejaNews (http://www.dejanews.com)
ForumOne (http://www.forumone.com)
These specialty search engines locate messages posted on thousands of
Usenet newsgroups and Web-based discussion forums. DejaNews searches Usenet
newsgroups, and ForumOne searches discussion forums.
4. Search statement:
"carpal tunnel syndrome" OR cts
5. Evaluate results:
ForumOne retrieved 16 discussion forum messages, all of which were relevant
to the topic. Discussion forums included Cafe Utne, Sympatico, iVillage.com,
and ParentsPlace.com. DejaNews retrieved 130,000 messages from such Usenet
newsgroups as misc.health.therapy.occupational, alt.support.chronic-pain,
and alt.guitar. To reduce a list of this size to something more manageable,
add search terms or limit the search by date.
6. Revise search:
("carpal tunnel syndrome" OR cts) AND prevention
In DejaNews, this revised search statement reduced the retrieval set to
a list of 200 messages sorted by date and revealed the existence of an
additional relevant newsgroup, sci.med.occupational. Although many of
the newly retrieved messages were relevant to the topic, some irrelevant
messages still remained on the list.
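The effect of the parentheses in the revised statement can be seen in a small Python sketch. It is an illustration only: the three messages are invented for the example, the function names are assumptions, and a real search engine matches terms against its own index rather than raw text in this way.

```python
import re


def contains(text, term):
    """Case-insensitive test for a whole word or exact phrase in a message."""
    pattern = rf"\b{re.escape(term)}\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def original(text):
    # "carpal tunnel syndrome" OR cts
    return contains(text, "carpal tunnel syndrome") or contains(text, "cts")


def revised(text):
    # ("carpal tunnel syndrome" OR cts) AND prevention
    return original(text) and contains(text, "prevention")


# Hypothetical newsgroup subject lines, invented for this example.
messages = [
    "Exercises for the prevention of carpal tunnel syndrome",
    "My CTS flared up after a long practice session",
    "Prevention tips for lower back strain",
]

print([m for m in messages if original(m)])  # first two messages match
print([m for m in messages if revised(m)])   # only the first message matches
```

Because the OR is evaluated inside the parentheses first, the added term narrows both alternatives at once, and a message that mentions prevention alone is not retrieved.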
EXPANDING SEARCH HORIZONS
As the test cases demonstrate, search engines and subject directories
can be used to locate otherwise difficult-to-obtain information with minimal
frustration. All that is needed is a strategic plan and a willingness
to experiment with various search tools using a variety of techniques.
The Internet, of course, is not the only, or even necessarily the best, game
in town. Many college and university libraries subscribe to commercial
Web-based bibliographic, full-text and numeric databases that provide
access to far different sources of information than can be found using
AltaVista, HotBot or Yahoo!. Databases available through online systems
such as FirstSearch, Dialog, Infotrac SearchBank and Ovid should not be
overlooked as potential sources of valuable information and can also be
searched with minimal difficulty if a well-designed, process-oriented
research strategy is followed.
Keith Gresham is assistant professor and instruction librarian at the University
of Colorado at Boulder Libraries. gresham@spot.colorado.edu