Humanities Image Database Project

Sponsored by the Advanced Information Technology Group in cooperation with the Digital Services and Development Unit, both programs at the University of Illinois Library.

Project Summary

About Canto

Cumulus is an image database program that allows you to collect all the digitized images you use in your research and instruction into a central location. Cumulus allows you to attach information, or metadata, to your images to facilitate organization, sorting, and retrieval. For more information about the software, see the Canto Cumulus Web Site.

There are several reasons for choosing Canto Cumulus:

A quick description of the Cumulus Environment

Cumulus is a client/server application. The server, located in 452 Grainger Engineering Library, is available 24 hours a day, and you can connect to it from any Mac that has the Cumulus Client software installed. When you open an image database, called a catalog, you are actually "borrowing" it from the server. This frees your computer to store other things.

Cumulus Display

When you connect to a catalog, the database immediately opens to the thumbnail display of your image collection. A thumbnail is a smaller representation of an image.

You may display your collection in any one of four ways by selecting either the text version or one of three thumbnail sizes. To do this, simply click your mouse on the pull down bar located at the top of the gray work area on the left of the Cumulus interface.

If you choose not to see this gray work area, click the small box icon just above the up arrow on the vertical scroll bar.

You can also sort your images based on the image name, type, color mode, resolution, or modification date. A button for each sort order is placed in the gray work area.

Image Information

Clicking twice on any image will open the Information screen for the image.

Cumulus automatically enters the following unmodifiable fields:
  • file name
  • file location
  • image editing program
  • image type
  • file size
  • color mode
  • resolution
  • height
  • width
  • modification date
  • catalog date

Cumulus can automatically catalog the following modifiable fields (using Catalog Options):
  • status
  • image name
  • user
  • notes
  • categories

From the information display you can:

About The Categories

Canto Cumulus may be unlike any database you've ever worked with before. Most databases have independent fields for each type of information, such as the field "ARTIST" for the artist's name. Cumulus has a much freer design and uses a Category field to hold this information, along with any other data you want to apply to the image. The terms in the Category field are retrieved from the user-constructed Categories Window, a list of terms arranged hierarchically according to their relationships and available for use with all images in the catalog. The Categories Window is capable of storing an almost infinite list of hierarchical terms in the form of an index, similar to what you might find at the back of a book. So, instead of a field named "ARTIST," you would have a Category "ARTIST" with artists' names located hierarchically below it.

The major difference for you, the user, is increased authority over, and responsibility for, the structure of your hierarchical list. Yes, this does mean a bit more effort, but in the long run you are not constrained by a limited number of fields. For example, if at any point you want to start cataloging the artist's nationality, a traditional database would require you to rework the layout, whereas Cumulus only requires that you add the category terms to the Categories List. It is as simple as that.

Also, the Categories List is universal to all images in your catalog, that is, the list is available and, by design, usable for each image. So another great advantage to Cumulus' architecture is that the terms in your Categories List can be used over and over; there is no need to constantly retype a term.
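
The difference between fixed fields and a hierarchical Categories List can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of how Cumulus stores its data; the Category class and the sample terms are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One term in a hierarchical category list (illustrative only)."""
    name: str
    children: list["Category"] = field(default_factory=list)

    def add(self, name: str) -> "Category":
        """Create a child term, keep siblings sorted by name, and return it."""
        child = Category(name)
        self.children.append(child)
        self.children.sort(key=lambda c: c.name.lower())
        return child

    def outline(self, depth: int = 0) -> list[str]:
        """Return the tree as indented lines, like a book index."""
        lines = ["  " * depth + self.name]
        for child in self.children:
            lines.extend(child.outline(depth + 1))
        return lines


root = Category("Categories")
artist = root.add("ARTIST")
artist.add("Monet, Claude")
artist.add("Cassatt, Mary")

# Starting to record nationality later needs no new field or layout,
# only a new header and terms on the same list.
nationality = root.add("NATIONALITY")
nationality.add("French")

print("\n".join(root.outline()))
```

Because every term lives on one shared list, the same term can be assigned to any number of images without being retyped.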

The Categories List is a powerful tool in Cumulus for both the database editor and the end user. Many functions can be accomplished using the Categories List.

Related Categories -- The newest version of Cumulus allows you to add Related Categories to your Categories List. Related categories work like aliases in the Mac Finder: you create a related category from an existing category and move it to another location in the hierarchy. Related categories help searchers locate like images and expand their searches, much as a "see also" reference does in a book index.

To create a related category:

  1. Select a category from which you want to make a related category.
  2. Choose "Make Related Category" from the Categories menu. The related Category appears under the original.
  3. Drag the Related Category to its new location in the Category Structure. You may rename the related category without affecting the relationship to the original Category.
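
The alias behavior described in the steps above can be modeled roughly as follows. This is a sketch under stated assumptions: the Term class, the make_related function, and the sample terms are all hypothetical, and Cumulus itself may represent related categories differently.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    """A category term; `original` is set only on related categories."""
    name: str
    original: Optional["Term"] = None
    children: list["Term"] = field(default_factory=list)

    def resolve(self) -> "Term":
        """Follow the alias back to the category it was made from."""
        return self.original.resolve() if self.original else self


def make_related(source: Term, new_parent: Term) -> Term:
    """Create an alias of `source` and file it under `new_parent`."""
    alias = Term(source.name, original=source.resolve())
    new_parent.children.append(alias)
    return alias


subject = Term("SUBJECT")
medium = Term("MEDIUM")
posters = Term("posters")
medium.children.append(posters)

see_also = make_related(posters, subject)
see_also.name = "advertising posters"  # renaming keeps the link intact
print(see_also.resolve().name)
```

The alias stores a reference to the original rather than a copy of its name, which is why renaming it does not break the relationship.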

Locating Categories -- The newest version of Cumulus allows you to easily locate a Category in your list by selecting "Show Categories Containing" from the Categories pull-down menu. Simply type a word or portion of a word, and Cumulus will automatically select the matching categories for you.

Searching with Categories -- You can easily locate images from the Categories Menu by clicking twice on a Category. For more information on searching using Categories, see Searching for Images.

For related information see Setting Catalog Options, Assigning Categories, Deleting Categories, and, for related information on indexing, the section Indexing Your Images.

Some Guidelines Before Starting Your Database

Opening A Catalog

Once the Cumulus Client has been loaded onto your computer, you are ready to connect to a database on the network server. Here's what you need to do:

  1. Make sure that you have requested that both a database and a password be created for you. This should already have been done, but if it hasn't, contact the administrator. You won't be able to proceed without this step.
  2. Click twice on the Cumulus Client icon to open it.
  3. Under the File menu, select Share Catalog.
  4. Select the AppleTalk zone "Grainger Eng Library."
  5. Select the "Canto Cumulus" Macintosh.
  6. Select your database.
  7. Enter your password to be able to make changes to the database.

Setting Catalog Options

The Catalog Options feature allows you to assign universal information to a group of images that you are cataloging in your database. You can assign a status (see Assigning Status) and common Categories (See Assigning Categories), and type in any common notes.

The Catalog Options feature is universal to all Cumulus databases linked to the Cumulus Network Server, so the information entered here is not permanent. Please do not remove the source and copyright text from the notes field.

This is a very important step that you must take every time you add images to your database.

  1. Catalog Options is located under the Record menu.
  2. A menu will open. Highlight "General" if it is not already highlighted.
  3. Click the mouse button on "Files" -- If you are cataloging several image file folders, and there are one or more folders that you do not want to catalog, click on "Add File." You will be asked to show Cumulus the Folder to be omitted.
  4. Click on "File Filters." -- You should only be using GIFs and JPEGs, and perhaps TIFFs. Therefore, you may opt to turn off all other file filters. Otherwise, you may ignore this.
  5. Click on "Thumbnails." This is where you can control how your thumbnails appear.
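
As a rough model of what these settings accomplish, the sketch below filters a batch of files by format and folder, then stamps each accepted file with the same status, categories, and notes. All names and values here (FILE_FILTERS, OMITTED_FOLDERS, COMMON, the file paths) are hypothetical stand-ins, not Cumulus settings.

```python
from pathlib import PurePosixPath

# Hypothetical stand-ins for the Catalog Options settings described above.
FILE_FILTERS = {".gif", ".jpg", ".jpeg", ".tif", ".tiff"}  # formats left switched on
OMITTED_FOLDERS = {"scratch"}                               # folders excluded under "Files"
COMMON = {
    "status": "Cataloged",
    "categories": ["presidents"],
    "notes": "Source and copyright: (your text here)",
}


def passes_filters(path: str) -> bool:
    """True if the file is in an allowed format and not in an omitted folder."""
    p = PurePosixPath(path)
    in_omitted_folder = any(part in OMITTED_FOLDERS for part in p.parts[:-1])
    return p.suffix.lower() in FILE_FILTERS and not in_omitted_folder


def catalog(paths: list[str]) -> list[dict]:
    """Build one record per accepted file, each carrying the common values."""
    records = []
    for path in paths:
        if passes_filters(path):
            record = {"file": path, **COMMON}
            record["categories"] = list(COMMON["categories"])  # per-record copy
            records.append(record)
    return records


records = catalog([
    "slides/lincoln.JPG",
    "slides/notes.txt",
    "slides/scratch/test.gif",
])
print([r["file"] for r in records])
```

Setting the common values once, before cataloging, is what saves you from editing each image individually afterward.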

Assigning Status

Currently, we have only two status settings: Cataloged and Processed. You should tag an image "Cataloged" if you have entered it into the database but have made no further changes. When you have finished adding information to the image, you may select "Processed."

If you would like to add a status name to the status list, please contact the administrator.

Cataloging Images

Before you add images to your database, it is very important that you set your catalog options.

There are two ways to catalog images: one is simple, and the other is just a little more complex.

The Simple Way

  1. Locate the folder of images on your machine or on disk.
  2. Select the folder so that it is highlighted.
  3. With your mouse button depressed, drag it onto your Cumulus work screen.

Just a Little More Complex

  1. Under the Record pull down menu, select Catalog.
  2. A window will appear asking you to find the file or folder you wish to catalog. Locate the file or folder.
  3. If you are cataloging a folder, do not open the folder, just select it. When the folder or file is selected click on catalog, or just hit enter.

The latest version of Cumulus automatically copies the file structure from your Finder into your category list. To turn this default setting off, go into Catalog Options, select "General," and click in the box next to "Include Folder Names as Categories."
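
The effect of the "Include Folder Names as Categories" setting can be pictured with a short sketch. The function names and the sample path are hypothetical; this shows the general idea of deriving categories from folder names, not Cumulus's actual behavior.

```python
from pathlib import PurePosixPath


def folder_categories(file_path: str, root: str) -> list[str]:
    """Return the folder names between `root` and the file, outermost first."""
    relative = PurePosixPath(file_path).relative_to(root)
    return list(relative.parts[:-1])


def catalog_file(file_path: str, root: str, include_folder_names: bool = True) -> dict:
    """Build a record; folder names become categories only when the option is on."""
    categories = folder_categories(file_path, root) if include_folder_names else []
    return {"name": PurePosixPath(file_path).name, "categories": categories}


print(catalog_file("images/architecture/bridges/brooklyn.jpg", "images"))
print(catalog_file("images/architecture/bridges/brooklyn.jpg", "images",
                   include_folder_names=False))
```

If your folders are already organized by topic, leaving the option on gives you a starting hierarchy for free; if they are not, turning it off keeps stray folder names out of your category list.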

Assigning Categories

If you are comfortable in a Mac environment, you'll have no trouble constructing a Category list and assigning the terms. Adding terms to your Category list is very similar to Mac file management.

  1. In an opened database, open your Category list. Do this by selecting Show Categories, under the Category menu. Category headings (e.g., geographic location, medium, content) should have already been determined and added. If not, contact the administrator.
  2. To add a Category, select New Category, also under the Categories Menu.

    Because we want to facilitate searching and encourage a common structure among databases, we are asking that you use a controlled vocabulary. For more information, please read Indexing Your Images.
  3. When you select New Category, a box with the word "Category" will appear on your Category list. If it is highlighted, simply type your Category term, and press enter. The term you just typed will then be placed on the list alphabetically.

Hint: If your category list is long, place the new category in the hierarchical structure of your choice before typing the new word.

  4. The box around your Category should disappear, but the term will remain highlighted. Just drag the term on top of the image to which you want to assign the term, or, if you have the information box open on the image, drag the term into the Category list.

Note: The latest version of Cumulus makes this task a little tricky. You can now search for images using your category list by clicking and holding down your mouse button on a term. Therefore, if you are adding terms to images, you must first select the image or images, then, in your category menu, quickly select the term and drag it onto the image or one of the images. There don't seem to be any problems when you are in the information screen of individual images.

  5. You can also assign categories to several images when you are cataloging images. See Setting Catalog Options.

Deleting Categories

Categories are easily deleted by selecting the word to be deleted in your opened Category list, and then selecting Delete Category under the Category menu.

HOWEVER, if you are adding to a database that several people are working on, you must first be sure that no one else is using the Category you want to delete. To do this, perform a search of the database using the Category term you want to delete (see How to Search for Images).

Please also be careful when deleting Categories that list sub-Categories beneath them -- ALL SUB CATEGORIES WILL ALSO BE DELETED. You must first drag the sub Categories from beneath the term you want to delete.
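
The two cautions above amount to one check: before deleting a term, find every image that uses the term or anything filed beneath it. A minimal sketch, with hypothetical sample data and function names:

```python
# Hypothetical sample data: parent term -> sub-categories, and image -> assigned terms.
CHILDREN = {
    "SUBJECT": ["presidents", "bridges"],
    "presidents": ["Lincoln, Abraham"],
}
ASSIGNED = {
    "lincoln.jpg": ["Lincoln, Abraham"],
    "brooklyn.jpg": ["bridges"],
}


def subtree(term: str) -> set[str]:
    """The term plus every sub-category that would be deleted along with it."""
    doomed = {term}
    for child in CHILDREN.get(term, []):
        doomed |= subtree(child)
    return doomed


def images_affected(term: str) -> list[str]:
    """Images that would lose a category if `term` were deleted."""
    doomed = subtree(term)
    return sorted(name for name, cats in ASSIGNED.items() if doomed & set(cats))


print(images_affected("presidents"))
print(images_affected("SUBJECT"))
```

An empty result means the term and its sub-categories are unused and can be removed without affecting anyone's work.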

How to Search for Images

There are several ways to search for images in the latest version of Canto Cumulus. For a complete, detailed description, consult your manual. For a bare-bones, get-what-you-need version, read on.

Searching from your Category Window List (Select "Show Categories" from the Categories pull down menu.)

To search for images matching a single category:

To search for images matching more than one category, hold down the Shift key as you select the terms, then select "Find Matching Records."

The Search Compass

In the top right corner is a small circle with a cross. This is the search compass, and clicking on it opens the search preferences. Why do you need to know this? This is where you select the direction of your search. By default, Cumulus searches for all terms that fall hierarchically below the term you select. If you want to search above the term or find related categories, you can select those options in the preferences box.
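
Search direction can be illustrated with a small hierarchy. The sketch below assumes that searching "below" matches the selected term and everything filed beneath it, and that searching "above" matches the term and its parents; the data and function names are hypothetical, and the program's exact behavior may differ.

```python
# Hypothetical sample data: child term -> parent term, and image -> assigned terms.
PARENT_OF = {
    "presidents": "SUBJECT",
    "Lincoln, Abraham": "presidents",
    "bridges": "SUBJECT",
}
IMAGES = {
    "lincoln_portrait.jpg": ["Lincoln, Abraham"],
    "oval_office.jpg": ["presidents"],
    "brooklyn.jpg": ["bridges"],
}


def terms_below(term: str) -> set[str]:
    """The term itself plus everything filed beneath it."""
    found = {term}
    for child, parent in PARENT_OF.items():
        if parent == term:
            found |= terms_below(child)
    return found


def terms_above(term: str) -> set[str]:
    """The term itself plus each of its parents, up to the top header."""
    found = {term}
    while term in PARENT_OF:
        term = PARENT_OF[term]
        found.add(term)
    return found


def search(term: str, direction: str = "below") -> list[str]:
    """Return images assigned any term in the chosen direction."""
    wanted = terms_below(term) if direction == "below" else terms_above(term)
    return sorted(name for name, cats in IMAGES.items() if wanted & set(cats))


print(search("presidents"))
print(search("Lincoln, Abraham", direction="above"))
```

Searching below is a sensible default because an image tagged with a narrow term is usually also an example of the broader one.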

Find Window Search

If you want to search in more than one field, this is where you need to conduct your search. Select "Find" from the Find pull-down menu.

The Find Window contains the search conditions (the field, the operation, and the text box). You may change the contents of all three boxes.

To perform a simple search, one that has only one condition, follow these steps:

  1. In the search condition section of the Find Window, select the criterion, or field name, in which you want to search. This is the leftmost box of the three. To do this, press on the arrow next to the box, and click on the field of your choice.
  2. Choose your operator. Each field comes with a unique set of operations (e.g., "is" and "is not"), that allow you to better control your search. A summary of these operations is included in the manual. To choose your operator, simply pull down the menu using the arrow to the right of the box.
  3. Select or type in the value. Some fields will allow you to select from the pull down menu, or, if you are looking for a category, you may type or drag and drop the term in the box.
  4. Click on find.
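
A search condition is simply a field, an operation, and a value. The sketch below models a single-condition search over a list of records; the records and the find function are hypothetical, and only the two operations named above are included.

```python
import operator

# Only "is" and "is not" are named in the text; real fields offer more operations.
OPERATIONS = {"is": operator.eq, "is not": operator.ne}

# Hypothetical sample records.
RECORDS = [
    {"name": "lincoln.jpg", "status": "Cataloged", "color mode": "RGB"},
    {"name": "brooklyn.jpg", "status": "Processed", "color mode": "Grayscale"},
]


def find(records: list[dict], field_name: str, operation: str, value: str) -> list[dict]:
    """Return the records whose field satisfies the chosen operation."""
    test = OPERATIONS[operation]
    return [r for r in records if test(r.get(field_name), value)]


print([r["name"] for r in find(RECORDS, "status", "is", "Processed")])
print([r["name"] for r in find(RECORDS, "color mode", "is not", "RGB")])
```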

Hint: When you want to find like images, a good way to use the Find Window is to select a criterion and an operation, and then drag a record into the value field. Cumulus will automatically enter the required information.

You may also perform compound searches, or searches that consist of more than one condition. The find window includes condition buttons that allow you to add or delete search conditions, located in the lower left corner. From left to right, they are "Insert" which places a search condition above the first condition; "Append," which places a search condition below your original condition; and "Delete," which deletes the selected condition.

Boolean Searching

Cumulus is capable of Boolean searching (AND or OR). When you search in the Find Window mode, and are doing a complex search, you may select AND to find all images that match both of your search categories. If you use OR, Cumulus will locate all images matching either of the search categories.

When you are searching from the Category window using more than one category, to do an AND search, select your category terms (by holding down the Shift key as you select) and choose "Find Matching Records" from the Categories pull-down menu. To do an OR search, select the categories in the same way, and click twice on one of the selected terms.
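
The difference between AND and OR comes down to set logic: AND requires every selected term to be assigned to the image, while OR requires at least one. A minimal sketch with hypothetical images and terms:

```python
# Hypothetical sample data: image -> assigned category terms.
IMAGES = {
    "lincoln_at_bridge.jpg": ["presidents", "bridges"],
    "oval_office.jpg": ["presidents"],
    "brooklyn.jpg": ["bridges"],
    "lion.jpg": ["animals"],
}


def find_matching(selected: list[str], mode: str = "AND") -> list[str]:
    """Return image names matching all (AND) or any (OR) of the selected terms."""
    wanted = set(selected)
    hits = []
    for name, categories in IMAGES.items():
        assigned = set(categories)
        if mode == "AND" and wanted <= assigned:   # every selected term is assigned
            hits.append(name)
        elif mode == "OR" and wanted & assigned:   # at least one selected term is assigned
            hits.append(name)
    return sorted(hits)


print(find_matching(["presidents", "bridges"], "AND"))
print(find_matching(["presidents", "bridges"], "OR"))
```

An AND search narrows the results and an OR search widens them, so OR is the better choice when you are not sure which term the cataloger used.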

When to Contact the Administrator

Contact me anytime via email, if you have questions, need to add a status term, or need a new password or database.

Indexing Your Images

Now we come to the critical part of cataloging your images: the classification of each image into an index. This activity is key if you want your searcher to have a meaningful and successful searching experience.

Catalogers indexing text documents have the actual text from which to pull significant terms. Images, however, pose a problem: often there is no text to which you can refer. As an expert in your field, you know your image collection better than anyone else, and that is why we are asking you to become a cataloger. Because this activity is so important, there are a number of rules and guidelines that must be followed.

In a nutshell, here is the process, but please also refer to the sections associated with each step:

  1. Identify the purpose of your database. Think about why your images should be indexed (process for establishing an image database link coming soon).
  2. Determine what category headers you will be using.
  3. Explore controlled vocabulary and the online thesauri: the Art & Architecture Thesaurus and the Thesaurus for Graphic Materials.
  4. If you want to automate your efforts, identify groups of like images. If you catalog these sets, you will be able to apply common category terms to the group. You can do this when you set your cataloging options in Cumulus.
  5. Label the cataloged images with attributes and other terminology: change the name of the image, and add notes and categories.
  6. Periodically, check for the existence of your terms in your chosen online thesaurus or thesauri.
  7. Add text to the notes field, where appropriate. This is free text and natural vocabulary is acceptable. It is preferable that you make note of ownership and/or copyright here. Also, this window is capable of storing up to 8 single spaced pages of text, if necessary.
  8. Move on to the next cataloged image.

Category Headers

Before you begin assigning categories to your images, it will be to your advantage to have established Category headers on your category list. These headers are really field titles that will help you group -- or pigeon-hole -- related categories. For example, you may want to classify the work the image represents into a major category (e.g., painting or drawing). You would then place the category "CLASSIFICATION" into your category list, so that other users would know where to find and place classification terms. Or, if you have a large collection of photographs, you may want to have the general header of "SUBJECT." Then, if a number of the photos are of presidents, you can group the images by adding the term "presidents" to your category list and placing it hierarchically beneath the "SUBJECT" header. Not only does this help you index the images, but it will also help the searcher, who will be able to select the term "presidents" and know that it will be the subject of the photos.

A couple of important points to remember

To facilitate consistency among databases, we are asking you to consider using a standard set of categories. The Categories for the Description of Works of Art, produced by the Art Information Task Force, offer a wide variety of categories to choose from, including a small set of core categories (those they feel should be included in all image databases). For a printed version of this information, please contact the administrator.

Using these standard categories will not only help multiple catalogers and the end user, but it will also be beneficial in the event that databases need to be combined. However, we realize that the terms may not fully meet your needs and that you may need to add your own category headers to your database.

Why it is important to have a controlled vocabulary

Classification involves the conceptual and/or physical grouping together of items. In creating an index, two basic approaches can be used, either separately or together. Index terms may be "natural language" terms such as those appearing in the original text, or "controlled language," in which the vocabulary would strictly adhere to a predetermined thesaurus.

A thesaurus is a vocabulary of controlled indexing language, formally organized so that relationships between concepts are made explicit. We will be using two thesauri: the Art and Architecture Thesaurus and the Library of Congress's Thesaurus for Graphic Materials. Some of you may only use one.

There are several advantages to using a controlled vocabulary:

Here is an example of how a controlled vocabulary works: one user might want to tag an image with the term "racing, horse," while another might want to apply the term "harness racing" to a similar image. If the two consult a controlled vocabulary, they would discover that the broader term is "horse racing." Not only does this supply consistency, but it also means that the end user, the searcher, will be able to retrieve these like items.

To ensure that the greatest population of users will be able to access and locate images, and to ensure that there is consistency across all our Cumulus Databases, we ask that you use a controlled vocabulary as much as possible in the Category list. You may use natural language freely in the Notes Field.

The Art and Architecture Thesaurus

The Art and Architecture Thesaurus (AAT) provides terminology for art and architecture of the western world from antiquity to the present. The AAT is used to index objects, text and images, by incorporating highly specific concepts as well as the more general categories in which those concepts belong.

The AAT focuses strictly on the human created environment, furnishings, equipment, artwork, and text. The AAT provides supporting terminology for the physical attributes, persons, and concepts that relate to the creation and appreciation of art and architecture. It does not provide vocabulary for the broader range of people, events and activities.

The Thesaurus for Graphic Materials

The Thesaurus for Graphic Materials (TGM) provides a controlled vocabulary for describing a broad range of subjects depicted in such materials, including activities, objects, types of people, events and places.

The TGM is divided into two sections. TGM I covers subject terms, excluding proper names, and serves as a source of terminology for topics represented in graphic materials; it also includes indexing guidelines. TGM II focuses on the genre and physical characteristics of the image (e.g., poster, daguerreotype, print).

TGM is arranged alphabetically and includes both "postable" terms (authorized for indexing) and "non-postable" terms (cross references pointing to authorized indexing terms). Each authorized term lists its related unauthorized terms (UF), broader terms (BT), narrower terms (NT), and related terms (RT).

Used For (UF):
A term that is not authorized for indexing. UF terms are listed primarily for editorial purposes, but they may help searchers by clarifying the scope or meaning of a term.
 
Use (USE):
A cross reference that points catalogers and searchers to an authorized term. USE references may be made from synonyms, near synonyms, antonyms, inverted phrases, or other closely related terms or phrases.
 
Broader term (BT):
An authorized term which indicates the more general class to which a term belongs. Everything that is true of a term is also true of its broader term(s).

Narrower term (NT):
An authorized term that is narrower in scope and a member of the general class represented by the broader term under which it is listed.

Related term (RT):
An authorized term that is closely related to the term under which it is listed, but the relationship is not a hierarchical one.
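
These relationships can be modeled as a small data structure. The sketch below is illustrative only: the Entry class and its field names are hypothetical, and the sample entries are invented around the horse racing example rather than copied from the TGM.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    term: str
    use: Optional[str] = None                           # USE: set only on non-postable terms
    used_for: list[str] = field(default_factory=list)   # UF
    broader: list[str] = field(default_factory=list)    # BT
    narrower: list[str] = field(default_factory=list)   # NT
    related: list[str] = field(default_factory=list)    # RT

    @property
    def postable(self) -> bool:
        """A term is authorized for indexing if it has no USE reference."""
        return self.use is None


# Invented sample entries, not copied from the TGM.
THESAURUS = {e.term: e for e in [
    Entry("horse racing", used_for=["racing, horse"], broader=["racing"],
          narrower=["harness racing"], related=["racetracks"]),
    Entry("racing, horse", use="horse racing"),
    Entry("harness racing", broader=["horse racing"]),
    Entry("racing", narrower=["horse racing"]),
    Entry("racetracks", related=["horse racing"]),
]}


def authorized(term: str) -> str:
    """Follow a USE reference, if any, to the postable term."""
    entry = THESAURUS[term]
    return term if entry.postable else authorized(entry.use)


def broader_chain(term: str) -> list[str]:
    """Walk the first BT link upward from a term until there is none."""
    chain = []
    entry = THESAURUS[authorized(term)]
    while entry.broader:
        chain.append(entry.broader[0])
        entry = THESAURUS[entry.broader[0]]
    return chain


print(authorized("racing, horse"))
print(broader_chain("harness racing"))
```

Looking up a term this way before adding it to your category list tells you both whether the term is authorized and where it belongs in the hierarchy.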

The TGM also includes notes about the terms:

Public note (PN): defines a term, explains its scope, or helps a user understand the structure of the thesaurus.
Cataloger's note (CN): clarifies how to use a term or when to use it in conjunction with another term (double indexing).
History note (HN): records the fact that a change has taken place in a term or the status of a term since the publication of the first edition.

For more information, refer to the Library of Congress's page for the Thesaurus for Graphic Materials.

The Concept of Sameness

Part of a cataloger's job is to search for "sameness" among a collection and to find the most generally understood and objective terminology to apply to an image. Each of your images may have a unique meaning, but there may also be some common attributes among the image collection. In fact, you may already have like items arranged together in your computer file structure. Try to look for the "sameness" among your image collection, and group your collection accordingly.

Also, for any single concept you should only have one term (this is not to say you should only have one term per image). Two similar images representing the same concept should be represented by the same term.

Layne states three reasons why grouping of images is so important:

  1. Comparing images that share one or more characteristics is essential to the research process.
  2. A searcher may not be able to verbalize all of the criteria needed to identify the precise image.
  3. A searcher may have highly specific criteria that can be more efficiently identified by visually scanning the collection, rather than searching completely with textual descriptors.

The Attributes of Images

In her 1994 article, Some Issues in the Indexing of Images, Sara Shatford Layne provides a valuable account of what image indexing should accomplish and how to accomplish it. According to Layne, image indexing should provide access not only to individual images, based on the attributes of each image in the collection, but also to useful groupings of images.

Layne offers four categories of image attributes: "biographical", subject, exemplified, and relationship.

Biographical attributes include information about the creator of the image, the date of creation and the title. Additionally, this category may contain current owner or location, or current value, depending on the image collection. This attribute would also include any copyright information.

Subject attributes are less objective, and a bit more difficult. Layne makes the distinction that an image can be both of and about something, and can be either generic or specific.

An image is always of something. When you state what an image is of, you are identifying what you see in the image. More than likely, this information is very concrete and objective. For example, you may be cataloging an image of a man and a lion.

When you are identifying what an image is about, you are more likely to be abstract and subjective. The same image of the man and lion may also be about pride. Be careful when you do this, and avoid reading too much into an image.

An image is often both generic and specific. For example, an image is generically of a "bridge", but specifically it is of the "Brooklyn Bridge". Several generic terms may apply to a single image. Ideally, access should be provided at all possible generic identities as well as to the specific identity of a person, object, or event.

Exemplified attributes of an image refer to the object characteristics of the work. Distinguished from the subject of the image, you are classifying what the object in the image is, not what the image is of. You are making the distinction that a painting, by classification, is different from a photograph, which is different from an etching, etc.

Relationship attributes identify the relationship among other images, objects, or textual works.

In addition to Layne's four attributes, a fifth attribute is File Information. This is the specific information about the image file (name, location, size, resolution, etc.). This information is automatically indexed by Cumulus.
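
To see how the five kinds of attributes fit together, here is one way a single image record might be laid out. The class names, field names, and sample values are hypothetical; they follow the attribute groups described above rather than any Cumulus data format.

```python
from dataclasses import dataclass, field


@dataclass
class Subject:
    of_generic: list[str] = field(default_factory=list)   # what is seen, in general terms
    of_specific: list[str] = field(default_factory=list)  # a named person, object, or event
    about: list[str] = field(default_factory=list)        # abstract and more subjective


@dataclass
class ImageRecord:
    biographical: dict[str, str]   # creator, date, title, owner, copyright
    subject: Subject
    exemplified: list[str]         # what kind of object the work is
    relationship: list[str]        # links to other images, objects, or texts
    file_info: dict[str, str]      # indexed automatically by the program

    def subject_terms(self) -> list[str]:
        """All subject access points: generic first, then specific, then 'about'."""
        s = self.subject
        return s.of_generic + s.of_specific + s.about


record = ImageRecord(
    biographical={"creator": "(unknown)", "title": "(untitled)"},
    subject=Subject(of_generic=["bridge"], of_specific=["Brooklyn Bridge"]),
    exemplified=["photograph"],
    relationship=[],
    file_info={"name": "brooklyn.jpg"},
)
print(record.subject_terms())
```

Keeping the generic and specific terms side by side means a searcher can reach the image by asking for any bridge or for this particular bridge.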

Using the Online Thesauri

Both the AAT and TGM are searchable from this page, using the buttons on the left side of the screen. If you are looking for terms to describe the work, not the content of the work, refer to the Art & Architecture Thesaurus. If you are looking for terms to help you describe the content of the image, or the "aboutness" of the image, then use the Thesaurus for Graphic Materials.

Sometimes you will want to make a term compound, that is, use more than one term in the thesaurus to construct a "new term." For example, you may want to construct the term "photography--digital." This is acceptable, as long as the hierarchical structure doesn't already make the term explicit. In other words, if you have the term "photography" as a superior term, you may insert "digital" as an inferior term.
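
The rule for compound terms can be stated as a small function: join the two terms only when the main term is not already a superior term in the hierarchy. The function name and the "MEDIUM" header are hypothetical.

```python
def term_to_add(superior_terms: list[str], main: str, qualifier: str) -> str:
    """Return the term to file beneath `superior_terms`.

    If the hierarchy already makes the main term explicit, only the
    qualifier is needed; otherwise build a compound term.
    """
    if main in superior_terms:
        return qualifier
    return f"{main}--{qualifier}"


print(term_to_add(["MEDIUM"], "photography", "digital"))
print(term_to_add(["MEDIUM", "photography"], "photography", "digital"))
```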

geography, dates, names, etc.

Hint: Both thesauri are available from this web page. To make the most efficient use of your time and energy, resize the Netscape window down so that your Cumulus windows are also visible. Then you should be able to toggle back and forth between applications.


References

Shatford Layne, S. (1994). Some Issues in the Indexing of Images. Journal of the American Society for Information Science, 45, 583-588.

Wiggins, B. (Summer 1993). Indexing -- The Key to Retrieval. Document Image Automation, 13(2), 13-15.

 Last Update: 2.4.2002