There are good reasons to believe that the unparalleled flowering and
growth of the World Wide Web may ultimately prove to be a curse rather
than a blessing.
In January 1995, according to the Lycos crawler database
(http://lycos-tmp1.psc.edu), more than 2 million WWW documents were
published online. In early August 1995, Lycos had to keep track of 5.07
million Web pages. By February 1997, two years after the first count, Lycos had to keep
track of more than 34 million URLs. As of Tuesday, February 25, 1997, at
16:03 (Pacific Time), there were an estimated 2,188,545 World Wide Web
sites on the Internet. Certainly, information technology professionals are
delighted that since the early 1990s they have been able to inspect and
work with each other's hypertext files whenever and wherever they want.
Similarly, it is exciting to know that the realm of WWW resources is
growing exponentially and that every week powerful technological
innovations are devised and implemented. Nevertheless, two elementary
questions remain unanswered.
The first concerns the ratio of total volume of networked information
(measured in megabytes) to information useful to scholars - or to anyone,
for that matter. Is the ratio around 1:1? 100:1? 1,000:1? or perhaps even
greater? The other question regards long-term trends and prospects for the
quality and reliability of WWW-based information resources. Are we, with
the passage of time, being blessed with an ever-reaching, ever-faster, and
over-arching information matrix of true reliability, or are we being cursed
with tomorrow's multimedia mediocrity?
These questions cannot be overlooked or taken lightly, for they directly
affect our own future and electronic well-being. And, unfortunately, the
networked future looks far from rosy.
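As a rough check on the claim of exponential growth, the Lycos counts quoted
above can be turned into an implied doubling time. The sketch below is
illustrative only. It assumes that "documents", "pages" and "URLs" are
comparable units, takes each "more than" figure at its stated lower bound,
and measures the intervals in whole months (7 and 18). The function name
doubling_time_months is hypothetical and is not part of any Lycos tool.

```python
import math


def doubling_time_months(count_start, count_end, months_elapsed):
    """Months needed for a count to double, assuming steady exponential growth."""
    growth_factor = count_end / count_start
    return months_elapsed * math.log(2) / math.log(growth_factor)


# Lycos figures quoted above, in millions. "More than" figures are taken at
# their stated lower bound, and the month gaps are assumptions (7 and 18).
jan_1995_documents = 2.0
aug_1995_pages = 5.07
feb_1997_urls = 34.0

first_interval = doubling_time_months(jan_1995_documents, aug_1995_pages, 7)
second_interval = doubling_time_months(aug_1995_pages, feb_1997_urls, 18)

print(f"Jan 1995 to Aug 1995: doubling roughly every {first_interval:.1f} months")
print(f"Aug 1995 to Feb 1997: doubling roughly every {second_interval:.1f} months")
```

Under these assumptions the tracked total doubles roughly every five to seven
months in both intervals, which is at least consistent with sustained
exponential growth, though three data points cannot establish it.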
The Sins and Turmoil of the Web
The Web is the global sum of the uncoordinated activities of several
hundreds of thousands of people who deal with the system as they please. It
is a nebulous, ever-changing multitude of computer sites that house
continually changing chunks of multimedia information (numeric values,
text, graphic images, sound tracks, video clips and data-input forms), all
arranged in a bewildering variety of shapes and sizes. This information is
displayed on millions of pages (files) wired together by multiple hypertext
links. If the WWW were compared to a library, the "books" on its shelves
would keep changing their relative locations as well as their sizes and
names. Individual "pages" in those publications would be shuffled
ceaselessly. Finally, much of the data on those pages would be revised,
updated, extended, shortened or even deleted without warning almost daily.
Thus, the Web's chief structural feature is its permanent state of flux,
its fundamental inability to offer "Internauts" either a sense of constancy
of information or any real stability of location for its electronic
repositories. A state of flux is intrinsic to the nature of the Web, so
that any attempt to curb it is bound to inflict on the Web phenomenon a
violent shock, loss of vitality, and finally a rapid death; or, more
likely, the regulatory attempt itself will result in a dismal failure. In
other words, the WWW is very unlike the traditional world of books,
research journals, and microfilm, and it cannot now be made to emulate them.
Any complications and problems arising from the dynamic and near-chaotic
state of the Web are further compounded by the behavior of people and
institutions that manage sites on the system. I refer here to
organizational problems such as the abysmal and wasteful replication of
effort by different parties claiming to be the Internet's main site for a
given field of specialization; the lust and carelessness bordering on
promiscuity with which maintainers of Web pages establish links to other
related (and frequently unrelated) sites and pages; and the labyrinthine
circularity of links, forcing readers to jump for minutes on end from site
to site in search of a server that publishes its own data instead of
pointing to other catalogues. Other Web sins include chronic lack of
communication and cooperation between maintainers of sites specializing in
similar subject areas, as well as passivity and lack of feedback from
readers who use a site.
However, the greatest sin of all is an absurd fascination with
technological issues at the expense of any serious thought as to the raison
d'être of the WWW, namely information itself. Our greatest folly seems to
be our willingness to cultivate this global communication system, open to
all and sundry, without first ensuring that we have enough useful and
trustworthy, accurate and timely information to circulate across it.
It is strange indeed that reams upon reams of electronic pages are created
to deal with secondary issues such as SGML and HTML style sheets,
telecommunication and interface standards, delivery of the latest browser
and server software, and so on and so forth without anyone's ever bothering
to ask, What is the Web to be used for? How do we define and judge the
quality of electronic information? What are the minimal standards to be
observed? How might these standards differ from those developed for other
forms of publication?
If the World Wide Web, which originated as a communication tool for
scholars and researchers working at the cutting edge of the sciences, is to
have a valid future, these issues urgently need to be tackled. We must
investigate notions of information quality and make them applicable to
material published on the Web.
For instance, would the idea of accuracy remain equally important across
such a wide range of documents as news bulletins, dictionaries, medieval
wheat prices, mantissae in a table of logarithms, phone numbers,
photographs of Elvis Presley, photos taken by the Hubble Space Telescope,
maps of major historic battles, maps of airports, drawings of flowers and
shock-absorbers, and sound files of classical music and of the Nobel Prize
acceptance speech by the Dalai Lama?
And what about timeliness of information and the related issue of file
update frequency? Are news files to be updated every 12 hours? or 6 hours?
or every hour? Maybe it should happen continuously. Are online history
handbooks to be modified and updated every month, every year, or every
decade? Or should they be revised each time, without fail, that another
politically correct linguistic twist becomes fashionable? Also, how often
should Mendeleev's periodic table be updated?
Clearly, a mighty task awaits site managers, one that must be handled in
close cooperation with librarians, scholars, publishers and philosophers.
However, the analysis and successful resolution of all these methodological
issues will be only part of a greater battle. Web- and Net-related
standards still need to be drafted and circulated on the Net so that they
can be seen and eventually put into practice by the people involved in
shaping the Web.
Difficulties in Reforming the Web
The major battle, if and when it is fought, will be about the minimal
content standards for the Web. It is going to be a bloody, uphill struggle
against hundreds of thousands of people who love publishing online simply
because publishing is now feasible and inexpensive. The present body of the
WWW is determined largely by the developers' hunger for recognition and
applause from their peers. And who are these developers? My observations
suggest that they include primarily university undergraduates who gain
access to an account (and promptly generate personal home pages); online
advertising agencies (who mount business catalogues and leaflets);
programmers (who publish pages with links to Web software, Perl scripts,
and graphics converters or who construct new generations of Web crawlers
and information harvesters); a handful of adventurous journalists and
academics (who, as an experiment, mount a couple of documents derived from
their work or try to run an online database or electronic journal); and
finally, cataloguers and librarians (who compile registers of links).
One thing seems clear: Those with access to (and copyright in) ample and
high-quality factual and/or scholarly materials are in the minority. Hence,
sites will inevitably vie with each other for the status of being the Web's
biggest (in terms of the catalogued hypertext links and the size of their
logos), or most technically advanced (in terms of the speed and capacity of
their search engines, interactivity, CGI scripts, and gateways to other
software systems), or most colorful and dazzling (in terms of visual
effects and virtual-reality technologies).
We can now see how this self-referential loop is formed: Good data is not
readily forthcoming, hence the preoccupation with hypertext and multimedia
techniques, and the "cool" appearance of pages. This motivates the WWW
culture to revolve around the bigger and better "containers" for
information and not around the information itself. Thus, the World Wide Web
Consortium (http://www.w3.org/hypertext/WWW), major Web sites, and software
houses regularly ignore the question of Web content and fuss exclusively
about new tools and applications:
So you've been wandering the Web for a while now, and you're ready to start
contributing to the great flow of information on the Internet. The first
thing to do is learn about HTML [. . .] to bring documents to your screen.
Next, you'll want to learn about adding online forms, graphics, sound, and
video - taking advantage of the interactive and multimedia capabilities of
the Web. And once you understand the basics and start coding pages like
mad, you'll want to find some development tools to make your work easier.
(Excerpt from the Creating Net Sites page by Netscape Communications.)
With such a simplistic view of the Web, it's no wonder the so-called "great
flow of information on the Internet" in 1995 proved to be mainly a tide of
colorful snippets, advertising leaflets, cybermalls, tele-cafes, personal
home pages, tedious corporate mission statements, and sporadic pages of
unattributed and unreferenced data culled from paper sources. Whether this
flood of cyber junk will ever be halted remains to be seen.
The World Wide Web is extremely large and unruly, and thus it is very
difficult to influence or reengineer. The system's vastness and its
individualistic, polycentric, nonhierarchical mode of operation are
simultaneously the source of its tremendous vitality and its weakness. Bad
solutions and erroneous practices cannot be imposed arbitrarily on the WWW.
Yet, by the same token, good ideas are not necessarily embraced by the
population of Web maintainers, because the overall system has by now grown
so large and so multilayered that in the ongoing rush of new sites and new
pages, any solution - sensible or not - is simply invisible.
Impending Emergence of the MMM?
The WWW system has reached a crossroads. Since its inception in 1991, it
has evolved rapidly from a tool for congenial information-sharing among
CERN's high-energy-particle physicists to a channel of communication for
anyone with access to the Internet. Do we really need to link up with pages
about someone's goldfish or a Coke-vending machine?
Web-based information, tracked by dozens of Web crawlers and harvesters,
continues to grow exponentially without much thought for guidelines,
safeguards and standards concerning the quality, precision,
trustworthiness, durability, currency and authorship of this information.
The situation is untenable. Unless serious and energetic remedial steps are
taken at once by managers of the most prestigious and resourceful Web
sites, and by as many of the organizations dealing with Web and Internet
standards as possible, the system currently known as the WWW may come to be
known as the MMM (multimedia mediocrity).
Our concerted actions will wholly determine whether this most dynamic and
most promising part of the Internet will be seen in the not-too-distant
future as (with apologies to William Shakespeare) "a useless shadow, a poor
MMM that struts and frets his hour upon the Net, and then is heard no more,
a tale told by an idiot, full of digitized sound and multimedia fury,
signifying nothing."
During the time required to read this article, another 120 Web sites have
joined the Net. (See the Internet Statistics - Estimated site.)
T. Matthew Ciolek, formerly a behavioral scientist, works at the Coombs
Computing Unit, Research Schools of Social Sciences and Pacific and Asian
Studies, Australian National University, Canberra, Australia.
He is the architect and administrator of the Coombspapers
(ftp://coombs.anu.edu.au/coombspapers), the world's oldest and largest
social sciences and humanities FTP site. He also runs the Coombs-web Social
Sciences Server (http://coombs.anu.edu.au) as a platform for seven
specialist World Wide Web virtual libraries (social sciences, Aboriginal,
Asian, Buddhist, demography and population, Pacific, and Tibetan studies).
He can be found on the WWW at http://coombs.