Rethinking the Integrity of the Scholarly Record in the Networked Information Age

By Clifford A. Lynch

Sequence: Volume 29, Number 2


Release Date: March/April 1994

There has been a great deal of discussion about the integrity of digital
information at the level of individual objects--texts, images, digital
video, computer programs--due in part to the great ease with which such
objects can be modified and redistributed in the network environment.
The ability, for example, to create extremely realistic digital images
of events that have not actually occurred is deeply disturbing when
these images are thought of simply as a technological extension of the
tradition of photography.

This author takes a broader perspective. As the transition from
print sources and the framework and traditions of print publication to
electronic information proceeds, extensive changes are taking place in
the nature of the scholarly and cultural record of our society and our
intellectual discourse. They go far beyond questions about the integrity
of the individual elements such as writings, images, sound recordings,
videos, and data files that compose those records. The changes are
systemic and often subtle.

Libraries and the Changing Record of Culture and Scholarship
Libraries have played a historical and vital role in preserving,
organizing, and providing access to the scholarly and cultural record.
Indeed, such a body of information loses much of its meaning if it is
not preserved and organized and if society as a whole does not have a
reasonable level of access to it. But a number of factors are calling
that historical role of libraries into doubt.

Electronic information is very seldom sold. Rather, it is licensed
to specific organizations for particular uses. Libraries are thus being
forced to acquire electronic information for use only by specific
communities of patrons. In addition, the information cannot be shared
among libraries through the interlibrary loan system in the way that
printed works historically have been shared, thus substantially reducing
access to the body of electronic information. Further, since much
electronic information is licensed only for a limited term, the ability
of the libraries to preserve the information is threatened by concern
about continued funds for licenses and even by the ongoing willingness
of rightsholders to offer the material for license at reasonable rates.

Much of the critical information is becoming elusive. For a variety
of reasons (including costs, unfamiliarity, and copyright concerns),
libraries have not done much to collect and preserve the contents of the
broadcast media that have become an essential part of our cultural and
historical record, at least when compared to the library community's
role in preserving print publication. Much of that material is broadcast
for one-time use by the public. And the sheer volume of broadcast
material seems overwhelming.

Even in the print environment, and in the electronic products that
extend printed publications, there are a number of trends that make
it difficult for libraries to continue to collect, preserve, and
organize the records of our society. Many of our media are steadily
moving toward narrowcasting and micropublishing. In essence, these are
the creation and distribution of large numbers of varying forms of works
targeted for ever-more-narrowly-defined communities. The trend is
manifested in specialized issues of magazines and regional versions of
newspapers. In many cases, these variant issues are poorly labeled,
making it difficult to determine where an article actually was published
and who read it. Libraries cannot collect all of the versions of these
works, even if they can identify them. In the extreme case, newspapers
can be viewed as databases of articles that are repackaged endlessly for
the needs of different readers; reconstructing who knew what when is
becoming a very complex problem. In this world, capturing the record
itself involves ensuring access to the source database (which is likely
to be licensed at very high cost, if at all, to the library community
and presented as a database that can be searched over the network on a
pay-per-view basis, which creates additional problems to be discussed
shortly), plus perhaps capture of the publication packaging rules, which
are likely to be viewed as highly proprietary.

Multimedia visionaries now speak of personal newspapers (or their
electronic analogs), which can be seen as representing the ultimate
breakdown of the cultural record and indeed of cultural coherence. No
two people will share the same view of events that are taking place in
our society from day to day, and libraries will not be able to capture
the shared cultural experience. This is information transformed into
pure commodity rather than reflecting shared experience and knowledge.
We do not know how to accommodate such a transformation in the
institutions that manage our record of discourse, experience, history,
and knowledge.

The networked environment raises other troubling problems. One
often-cited benefit of the transition to electronic information accessed
through networks is that one needs only one copy of the information,
which everyone can access, rather than redundantly storing large numbers
of printed copies. The advantage of the print distribution model,
however, is that it is highly distributed and highly robust. It is
unlikely, once a work
is distributed in printed form and acquired by libraries throughout the
country, that the contents of the printed work can be either lost or
altered (revised). But today some publishers are migrating from print
distribution to publisher-controlled pay-per-view network database
servers; other publishers are partnering with service bureaus such as
the Online Computer Library Center to establish similar models but with
the database stored at the service bureau. We have not yet addressed
what happens when these files fail to produce sufficient revenue to
justify maintaining them on the network (though some commercial service
bureaus, such as DIALOG Information Services, have removed some files
that serve the humanities, because such files failed to pay their way).

We have also yet to encounter the electronic analog of the burning
of the great library at Alexandria (either due to natural disaster and
inept off-site backup procedures or out of malice or cold, commercial
calculation), which was so devastating precisely because in a
pre-printing-press world there was such centralization of information
at a single site. In a post-printing-press world, we run the danger of
returning to the vulnerabilities inherent in such centralization. And it
is not only publishers (both commercial and nonprofit) who are moving to
centralized storage sites: government at all levels is also
exploiting the potential for low-cost distribution of information
through computer networks. Though this has the advantage of making
government information much more accessible to the public in the near
term, it also places the information at risk in the longer term. Because
it is stored in a single place, still under government control rather
than widely distributed through mechanisms such as the Federal
Depository Library Program for printed works, it is subject to revision
or removal as government policies shift. Indeed, as recent experience
with the federal Department of Justice's JURIS system has suggested,
essential bodies of information could suddenly vanish, at least as far
as reasonable access by the research and education communities is
concerned.

The benefits of centralization and network access are very real.
But it seems prudent not to carry them too far; a reasonable number of
independently controlled archive sites for certain material are
necessary to provide confidence in the continued integrity and
accessibility of our scholarly, historical, and cultural record. In
cases when key components of the record represent valuable intellectual
property that is owned by publishers, however, we have yet to establish
and reach consensus on the compacts of responsible behavior necessary to
ensure acceptance of a conversion of the print publication base to
electronic format.

Convenience of Electronic Information Resources
The research libraries that have provided their patrons with electronic
information resources--either primary material in electronic formats or
secondary information such as online catalogs and abstracting and
indexing databases that provide access to primarily print holdings--have
observed a phenomenon that is occasionally amusing but at root deeply
disturbing: The electronic information resources are too successful and
too convenient. Users view them as defining the totality of available
information. If books do not appear in the online catalog, then the
library may as well not own them, since they will not be used; if a
journal isn't covered by the appropriate abstracting and indexing
databases, then articles published in that journal may as well not
exist.

The creators of these secondary information resources now wield
enormous power to define the literature of a discipline. Traditionally,
for example, reviewers have been able to influence which works receive
attention, but the diversity of reviews and reviewers provided some
safeguards. In the world of expensive-to-produce and
expensive-to-acquire abstracting and indexing databases, however, there
is often only
a single tool that defines the literature of a discipline. Even in cases
when multiple competing databases exist, the costs to license those
databases are such that a given library may license access to only one
for its user community.

Ironically, most of these electronic abstracting and indexing
databases provide coverage of the literature starting only in the 1970s
or, at best, the late 1960s. To a large part of the user community, this
means that the journal literature before that time (as opposed to the
monographic literature, which has been retrospectively cataloged by
libraries) might as well not exist.

In the area of primary literature, an even more interesting and
volatile situation exists, in which ambitious publishers have real
opportunities to establish their publications as definitive. The primary
literature currently available in electronic form is a patchwork: some
publishers have been willing to license their works to libraries at
reasonable prices; others have demanded very high prices or have refused
to license the material under any terms. As the electronic primary
literature reaches a critical mass, the natural tendency of library
patrons is to use the best of what is available and to ignore even very
high quality materials that are available only in printed form. In that
connection, I find it particularly interesting that the University of
California has been unable, to date, to secure acceptable licenses to
any of the major national newspapers of record in electronic form,
although there is a large variety of other news providers eager to do
business with the university. If our experience is typical, there is a
real possibility that those national print newspapers will be displaced
as an authoritative record by other information providers in the 1990s.
Similar observations can be made about scholarly journals and other
publications, some of which are moving aggressively into electronic
distribution while others remain unavailable in electronic formats.

Conclusions
The scholarly, historical, and cultural record is of central importance
to the research and education community and, indeed, to our entire
society. Massive shifts are taking place in the content that composes
this record and in the ways that content is managed. Questions about the
content and coherence of the record are complex, and the implications of
the trends that are developing, as information becomes an increasingly
customized electronic commodity, are still unclear. These problems will
make fertile ground for scholars and librarians, as well as for
policymakers, in the coming years. It is also important to recognize
that the continued management of the integrity of the record is too
vital to be entirely sacrificed in the name of economic efficiencies
made possible by the networked information environment. In that sense,
the role of libraries as trustees for society as a whole in preserving,
organizing, and providing access to the record needs to be maintained as
we enter the networked information environment, even though the
specifics of how that role is fulfilled will undoubtedly change.

Clifford A. Lynch is director of Library Automation at the University of
California Office of the President.



