Data Elements for Image Collections

Jon Rudin (mailto:Rudin@MEDINFO.LABMED.UMN.EDU)
Mon, 1 Aug 1994 10:14:46 -0500

Message-Id: <mailto:199408011529.KAA17933@library.wustl.edu>
Date:         Mon, 1 Aug 1994 10:14:46 -0500
From: Jon Rudin <mailto:Rudin@MEDINFO.LABMED.UMN.EDU>
Subject:      Data Elements for Image Collections
To: Multiple recipients of list IMAGELIB <mailto:IMAGELIB@ARIZVM1.BITNET>

In message <mailto:9407292137.AA00907@medinfo.labmed.umn.edu> IMAGELIB writes:

> Mr. Gherman brings up another issue having to do with
> cross-disciplinary indexing of image collections. I cannot speak to the
> requirements of all types of image collections, but I can say that in
> cataloging images of art and architecture, LCSH is not very adequate. We
> are fortunate to have the Art and Architecture Thesaurus and I think that
> most art and architecture image collections doing subject analysis use
> it. Translating image information into words is difficult and I would
> think that specialized thesauri would be needed in most disciplines. As
> long as one list of generalized subject headings for a collection that
> contains information on many different disciplines is used, then everyone
> understands what the words mean. But in image collections where any
> number of thesauri might be used depending upon the discipline, problems
> could quickly arise. For example, if one searched an image database
> indexed with the AAT using the word "orange", one would be led to a
> color, not a piece of fruit. But if one used the word "orange" in an
> agricultural image database, I imagine a very different result would
> occur. So, how in a cross-disciplinary image database does one build in
> an indexing structure that will make it possible to quickly focus on the
> desired subject? I'm sure this is possible, but it would need to be
> thought out.

> Linda McRae (813) 974-9234 VOICE
> College of Fine Arts/FAH 110 (813) 974-2091 FAX
> University of South Florida mailto:mcrae@arts.usf.edu

Linda-

In the oral pathology image database project I worked on, we created menus of descriptors in various categories such as color, morphology, and location. This system would accommodate different descriptors that are spelled the same (e.g., orange for color and orange for fruit).

For example, the color orange would be located in the color menu and the fruit orange would be in the plant menu (or in the fruit sub-menu, if you like).
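A minimal sketch of the menu idea, assuming each descriptor is stored together with its category so that identically spelled terms never collide. The category names, terms, and image ids below are illustrative assumptions, not the actual menus from the oral pathology project.

```python
from collections import defaultdict

# Hypothetical menus: category -> descriptors allowed in that menu.
MENUS = {
    "color": {"orange", "red", "violet", "translucent yellow"},
    "plant": {"orange", "lemon"},
    "location": {"tongue", "palate"},
}


class ImageIndex:
    """Index images by (category, term) pairs rather than by bare words."""

    def __init__(self, menus):
        self.menus = menus
        self._index = defaultdict(set)  # (category, term) -> set of image ids

    def catalog(self, image_id, category, term):
        # Catalogers may only pick terms that appear in the chosen menu.
        if term not in self.menus.get(category, ()):
            raise ValueError(f"{term!r} is not in the {category!r} menu")
        self._index[(category, term)].add(image_id)

    def search(self, category, term):
        # Retrieval uses the same menus, so "orange" is always qualified.
        return set(self._index.get((category, term), set()))


index = ImageIndex(MENUS)
index.catalog("img-001", "color", "orange")  # an orange-colored lesion
index.catalog("img-002", "plant", "orange")  # a picture of the fruit
```

Searching with `index.search("color", "orange")` and `index.search("plant", "orange")` then returns different images, because the menu a term was chosen from is part of the key.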

The benefit of using a controlled vocabulary for image cataloging and retrieval is that those cataloging images will be using the same terms as those retrieving them. Both catalogers and retrievers should have access to the same dictionary, which defines the descriptors. With our database, we realized that physicians preferred the term 'violaceous' while almost everyone else used 'violet' or 'purple'. Others used the term 'erythematous' to indicate 'red'. We had to narrow the choices down for the menu system and eliminated 'violaceous' and 'erythematous'.
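One way to keep a single shared dictionary while still recognizing the eliminated terms is sketched below. This goes a step beyond what we did: it assumes the dropped terms are redirected to a menu term instead of simply being removed, and the choice of 'violet' over 'purple' as the menu term, the definitions, and the function name are all hypothetical.

```python
# Shared dictionary: menu term -> definition shown to catalogers and retrievers.
# The definitions are placeholders for illustration.
DICTIONARY = {
    "violet": "bluish-purple color",
    "red": "color of blood",
}

# Non-preferred terms pointing at the menu term to use instead (assumed mapping).
USE_INSTEAD = {
    "violaceous": "violet",
    "purple": "violet",
    "erythematous": "red",
}


def normalize(term):
    """Return the menu term for a word, or raise KeyError if it is unknown."""
    key = term.strip().lower()
    if key in DICTIONARY:
        return key
    if key in USE_INSTEAD:
        return USE_INSTEAD[key]
    raise KeyError(f"{term!r} is not in the controlled vocabulary")
```

With this arrangement a physician who types 'violaceous' and a hygienist who picks 'violet' from the menu end up cataloging and retrieving under the same descriptor.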

Example images should be provided for each term either automatically or on command. For example, if a user would like to clarify the meaning of a particular term, such as 'translucent yellow', an image that portrays the essence of translucent yellow would be displayed next to the term. Perhaps the best system for retrieving specific colors would be to present a palette of example images, each portraying a predominant color, from which users would select. As an option, users might then request the corresponding TEXT if they wished. This method eliminates a step in the perception-translation process.

That is, if users select their colors by choosing example images, they won't first need to translate their mental image into a text word, select it from a text menu (or type it in if a free text entry system is used), and retrieve what they hope the catalogers also described textually in the same way. They will instead be able to short-circuit the translation step.

An alternative way of allowing selection of color is to provide a color wheel and allow the user to select a pie slice which contains the color(s) of interest. The size of the slice will determine the number of images retrieved. The broader the slice, the more sensitive the retrieval will be; the narrower the slice, the more specific.
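The pie-slice selection can be sketched as a range query on hue, assuming each image has been cataloged with one predominant hue expressed as an angle in degrees on the wheel (0 to 360). The function names, image ids, and hue values are hypothetical.

```python
def in_slice(hue, center, width):
    """True if hue lies inside the slice of the given width centered on center.

    All values are angles in degrees; the wheel wraps around at 360.
    """
    # Smallest angular distance between hue and the slice center.
    diff = abs((hue - center + 180.0) % 360.0 - 180.0)
    return diff <= width / 2.0


def retrieve(images, center, width):
    """Return the ids of images whose predominant hue falls in the slice."""
    return sorted(i for i, hue in images.items() if in_slice(hue, center, width))


# Illustrative catalog: image id -> predominant hue in degrees.
images = {"a": 30, "b": 45, "c": 350, "d": 200}

narrow = retrieve(images, center=30, width=20)   # specific: few images
broad = retrieve(images, center=30, width=100)   # sensitive: more images
```

Widening the slice from 20 to 100 degrees around the same center picks up the neighboring hues, including one on the other side of the 0/360 boundary, which is the sensitivity versus specificity trade-off described above.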

Of course, the first step in setting up an image database is to determine who the users will be so that the cataloging/retrieval vocabulary (text and/or images) may be appropriately constructed. It may be that trying to create such a vocabulary for users with widely varying interests is not possible. In the oral pathology project, the target audience was dentists, physicians, physician assistants, nurses, and dental hygienists. Even that apparently narrow audience may be too broad for the terms that we provided in the menus.

Thanks-

Jon

-----------------------------------------------------------------------------

Jonathan Rudin, DDS, MS

Post-doctoral fellow

Institute for Health Services Research, University of Minnesota School of Public Health, Box 729, 420 Delaware St. SE, Minneapolis, MN 55455

(612) 624-6151

Home: 1883 Yorkshire Ave., St. Paul, MN 55116; (612) 699-5472

E-mail: mailto:Rudin@MEDINFO.LABMED.umn.edu