An Introduction to Digital Projects for
Libraries, Museums and Archives

Trevor Jones, Project Coordinator, Illinois Digitization Institute

Originally Published as a Technical Insert for the Illinois Heritage Association
(http://illinoisheritage.prairienet.org/)


May, 2001

We live in an increasingly digital world. Hundreds of libraries, museums and archives have recently launched projects designed to digitize their collections and place them on the web. According to digital expert Stephen E. Ostrow, this trend is both “auspicious and ominous” for cultural heritage institutions. The potential of digital projects to present information in new and important ways seems limitless. Currently, however, digitization remains plagued by confusing standards, changing technologies, and doubts about the long-term viability of digital files. This technical insert is designed to provide an overview of digital project planning.

What is Digitization?

Cornell University Library defines digital images as: “electronic snapshots taken of a scene or scanned from documents, such as photographs, manuscripts, printed texts, and artwork. The digital image is sampled and mapped as a grid of dots or picture elements (pixels). Each pixel is assigned a tonal value (black, white, shades of gray or color), which is represented in binary code (zeros and ones). The binary digits ("bits") for each pixel are stored in a sequence by a computer and often reduced to a mathematical representation (compressed). The bits are then interpreted and read by the computer to produce an analog version for display or printing.” (http://www.library.cornell.edu/preservation/tutorial/intro/intro-01.html)

In summary, digitization converts materials from formats that can be read by people (analog) to a format that can be read only by machines (digital). Flatbed scanning, digital cameras, planetary cameras, and a number of other devices can be used to digitize cultural heritage materials.
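
The pixel grid and binary code described in the Cornell definition can be made concrete with a little arithmetic. The following Python sketch is illustrative only: the function names, the 8 x 10 inch photograph, and the 300 dots-per-inch resolution are assumptions chosen for the example, not recommendations from this insert. It estimates how large a scan is before compression and shows the binary code stored for a few sample pixels.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the raw size of a scan before any compression is applied."""
    width_px = round(width_in * dpi)    # pixels across
    height_px = round(height_in * dpi)  # pixels down
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8        # round up to whole bytes


def pixel_to_bits(tonal_value, bits_per_pixel=8):
    """Show the binary code stored for a single pixel's tonal value."""
    return format(tonal_value, f"0{bits_per_pixel}b")


if __name__ == "__main__":
    # A hypothetical 8 x 10 inch photograph scanned at 300 dots per inch.
    for label, depth in [("bitonal", 1), ("grayscale", 8), ("color", 24)]:
        size = uncompressed_size_bytes(8, 10, 300, depth)
        print(f"{label:9s} {depth:2d} bits/pixel: {size:,} bytes")

    # A 2 x 2 grayscale grid: black, mid-gray, light gray, white.
    grid = [[0, 128], [192, 255]]
    for row in grid:
        print(" ".join(pixel_to_bits(value) for value in row))
```

Under these assumptions the 24-bit color scan comes to 21,600,000 bytes before compression, which is one reason storage and long-term maintenance belong in the project budget.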

Why Digitize? -- Access and Preservation:

The main reasons to digitize are to enhance access and improve preservation. By digitizing their collections, cultural heritage institutions can make information accessible that was previously available only to a select group of researchers. Digital projects allow users to search collections rapidly and comprehensively from anywhere at any time. Examples of digital projects include the Making of America website, where entire books can be searched for specific words (http://www.hti.umich.edu/m/moa.new/), and the Library of Congress’s American Memory page, where you can listen to recordings made by Thomas Edison (http://memory.loc.gov/ammem/edhtml/edsndhm.html). Digitization can even make the invisible visible. The Nebraska State Historical Society has scanned glass plate negatives and enhanced them in order to see inside the formerly dark doors of pioneers’ sod houses (http://www.nebraskahistory.org/lib-arch/research/photos/digital/index.htm). To see an extensive list of digital projects, visit the Vassar College Libraries website at: (http://library.vassar.edu/research/guides/general/imageprojects.html).

Preservation:

Digitization can also help preserve precious materials. Making high-quality digital images available electronically can reduce wear and tear on fragile items. This does not mean, however, that digital copies should be seen as a replacement for the original piece. Digital files are not permanent and should be maintained and periodically transferred to new formats. Even after digitization, original documents and artifacts must still be cared for. Preservation remains a secondary benefit of digital projects. If preserving a collection is deemed a higher priority than increasing access to it, a better use of resources would be to purchase acid-free folders, encapsulate fragile documents, or otherwise improve storage conditions.

In summary:

The Benefits of Digital Access for Collections:

The Preservation Benefits for Collections:

The Downside of Digitization:

Despite everything that digitization can accomplish, there are also some very good reasons to stay out of the digital realm. First, not every collection is worth digitizing. The idea of entire libraries or museums being completely online is a long way off, and many experts say that it will never happen. Successful digital projects are the result of careful evaluation of collections, and the digitization of only those items that will provide the greatest benefit to the user.

Digital projects are also expensive. At this point no institution has managed to make digitization projects cost-effective, and most attempts to recoup the costs of digital imaging through user fees have failed. Costs for digitization continue even after a project’s conclusion, as all digital files require maintenance to ensure that they will be readable in the future. The Research Libraries Group has posted some excellent and eye-opening information about the long-term costs of digitization in its DigiNews newsletter. It is available at: http://www.oclc.org/programs/publications/newsletters/diginews.htm.

Despite the high costs of digital conversion, it is a near certainty that digitization’s importance will increase exponentially in the future. Patrons are already requesting high-quality digital images from cultural heritage institutions, and this demand will only increase as time goes on. An excellent source on the pros and cons of digitization is Why Digitize? by Abby Smith, published by the Council on Library and Information Resources (CLIR), February 1999. It is (of course) available online at (http://www.clir.org/pubs/reports/pub80-smith/pub80.html#preservation).

Planning for Digitization:

The success of digital projects hinges not on expensive technology, but rather on sound project planning. Perhaps because digitization is relatively new, institutions too often concentrate on technological aspects before deciding on a project’s goals. Technology should never drive digital projects. Goals should be determined first, and only then should the appropriate technology be selected in order to meet a project’s objectives.

It is best to ask a series of questions before starting a digital project. What will be gained by digitizing? Could the same ends be reached with a book, exhibit, pamphlet, presentation or video? How will the project fit into the institution’s goals? What benefits will be realized? How will it be determined if the project was successful?

A pilot digitization project should start with a manageable collection. Focusing on items with consistent or standard formats (photographs of all one size or type, documents from one collection, etc.) provides the best chance of success.

Setting Goals:

Good goal setting is important for any new initiative, and digitization is no exception. Simply having a goal that states “we want to make our materials more accessible on the web” is not specific enough. Consider the following: Who will access this collection? What will they be looking for? How will they use it? How many people will use it? How will it be advertised? How will increased use of this material benefit the user and the institution?

Contacting current and potential users is an excellent way to determine the answers to these questions. Consider sending out a survey to the project’s intended audience in order to learn how they are currently using the material, and how they might use it differently if it were digitized. It is also helpful to contact other institutions that have digitized similar collections and learn from their successes and failures.

Who Owns It? --- Copyright:

Copyright is a complex issue that strongly affects the selection of materials for digitization. Many cultural heritage institutions have chosen to avoid the complexities of copyright law by digitizing materials that have passed into the public domain and are no longer covered by copyright restrictions. Consulting information on what is covered by copyright is an essential step in the selection process. Some excellent (and brief) materials on copyright include “Copyright Term and the Public Domain in the United States” by Peter Hirtle (http://www.copyright.cornell.edu/training/Hirtle_Public_Domain.htm) as well as a set of guidelines from Lolly Gasaway of the University of North Carolina at Chapel Hill, which can be viewed at (http://www.unc.edu/~unclng/public-d.htm). For a more detailed discussion of copyright restrictions, take the superb Copyright Crash Course from the University of Texas at (http://www.utsystem.edu/ogc/intellectualproperty/cprtindx.htm).

Who Does the Work -- In-house or Outsourcing?

Deciding whether to digitize in-house or hire an outside company to do the work is a major decision, and largely depends on the materials to be digitized. Outside vendors are generally very efficient at digitizing materials of similar format and size. Most vendors, however, work primarily for large corporations, and they may not be familiar with archival standards or with how to handle rare and fragile objects. Remember that even when a project is outsourced, good project management remains crucial. For a list of vendors, as well as information on how to create a formal Request for Proposal, visit the Colorado Digitization Project at: http://www.cdpheritage.org/digital/scanning/vendors.html.

Hiring and Training Staff:

Digital projects require new skills. Project planning should allow time to teach current staff new technologies. Even if an outside vendor completes a project, or new staff are hired specifically to work on a digital project, permanent staff should at least learn the basic theories and practices of digitization. Institutions often hire short-term staff for digitization projects, which can result in the loss of digital expertise when the project ends.

Budget:

Budgets for digitization projects should include the following categories:

Analysis of past digital projects has provided some useful information on budget development. Stephen Puglia from the National Archives and Records Administration has determined that digitization costs typically break down as follows:

Puglia’s entire article is available at: http://www.oclc.org/programs/publications/newsletters/diginews.htm.

Metadata:

Metadata is typically defined as “data about data.” Librarians, archivists and museum professionals are all familiar with this concept under different names. Libraries and archives have traditionally maintained a variety of cataloging data about books and documents, and museums keep detailed catalog worksheets describing three-dimensional artifacts. Sometimes the information about a piece is more significant than the actual artifact being described. This is certainly the case with digital projects. Good metadata makes it possible to catalog and effectively present digital information to the public. Metadata typically describes how the image was digitized, its format, ownership and copyright information, and much more. A wide variety of metadata schemes currently exist, but as yet no single metadata standard has gained worldwide acceptance. The metadata standard to be used should be chosen before materials are digitized.
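
As a rough illustration of the kinds of information listed above (how the image was digitized, its format, and its ownership and copyright status), the following Python sketch defines a simple metadata record and checks it for blank fields. The field names, the sample values, and the missing_fields helper are hypothetical and do not follow any particular metadata standard; an actual project would map them to whichever scheme it selects.

```python
from dataclasses import dataclass, asdict


@dataclass
class ImageMetadata:
    """Hypothetical record; field names are illustrative, not a standard."""
    identifier: str       # local catalog or accession number
    title: str            # short description of the item
    capture_device: str   # how the image was digitized
    resolution_dpi: int   # scanning resolution
    file_format: str      # format of the stored file
    owner: str            # institution holding the original
    rights: str           # copyright or public-domain statement


def missing_fields(record):
    """Return the names of fields left blank, in declaration order."""
    return [name for name, value in asdict(record).items()
            if value in ("", None)]


if __name__ == "__main__":
    # Sample values are invented for the example.
    record = ImageMetadata(
        identifier="photo-0001",
        title="Main Street, looking north",
        capture_device="flatbed scanner",
        resolution_dpi=300,
        file_format="TIFF",
        owner="Example County Historical Society",
        rights="",  # left blank on purpose to show the check
    )
    print("Fields still to complete:", missing_fields(record))
```

Running a check like this before scanning begins is one way to confirm that every item will carry the information the chosen scheme requires.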

Evaluation:

Evaluation is an oft-neglected aspect of digitization projects. Project evaluations should move beyond easily quantifiable figures and attempt to determine a program’s impact on the user. Many digital projects are judged by the number of items they digitize, but this is really one of the least useful measures of a project’s success. Digitizing 500, 1,000 or even 100,000 images means nothing if they are low quality, hard to locate in a database, or not interesting to the public. Surveying users to learn how they are using digital materials provides a more effective evaluation tool. At the bare minimum, projects should be formally evaluated on whether they have met their goals. The Institute of Museum and Library Services (IMLS) has been promoting a technique called Outcome Based Evaluation, which stresses ways to assess the impact of digital projects. More information about project evaluation methods can be found at the IMLS website at: http://www.imls.gov/applicants/obe.shtm.

Technical Information:

This insert has outlined some of the steps involved in planning a digital project, but has overlooked most of the technical aspects. A good digital project depends more on planning than on technical expertise, but digitization also requires an understanding of technical issues. Howard Besser has written an excellent short guide to the “Procedures and Practices for Scanning” for the Canadian Heritage Information Network (CHIN), at (http://sunsite.Berkeley.EDU/Imaging/Databases/Scanning). For much more in-depth technical information and training for digitization, take Cornell University Library’s Digital Imaging Tutorial at: http://www.library.cornell.edu/preservation/tutorial/.