(U)

Donna Romer (romer@KODAK.COM)
Fri, 15 Jul 1994 11:32:28 -0400

Message-Id: <199407151652.LAA00641@library.wustl.edu>
Date:         Fri, 15 Jul 1994 11:32:28 -0400
From: Donna Romer <romer@KODAK.COM>
Subject:      (U)
To: Multiple recipients of list IMAGELIB <IMAGELIB@ARIZVM1.BITNET>

Date: 07/15/94 11:27:27
To: IMAGELIB@ARIZVM1.ccit.arizona.edu
cc: LZA200  --KR25     ROMER DONNA M

From: Donna Romer, CD Imaging, 3/12/HE
Subject: (U)

Over the last few years Kodak has been working on an online image search service, the Kodak Picture Exchange, which was released in November 1993. The first application on that service is for commercial photography. (There are other applications envisioned and on the drawing boards as well.) The design of the commercial database was heavily researched before implementation to identify the kinds of "visual" access points needed for retrieving images in a "populist" (so to speak) manner. In my research I interviewed dozens of visually literate people, including art directors, photo researchers, and graphic artists, to create a Visual Resources User Model upon which to build a schema. That schema had to support low cataloging overhead AND good access points for a VERY heterogeneous group of end users. (Remember this is for the commercial world of images: advertising and editorial areas.)

What I found from my research matches exactly the earlier note on this list: the prevalent formal library tools for cataloging may not be useful at a broader base level. While the biblio-centric model has indeed been sensitive to the need for audio-visual expansion, non-text materials exhibit a number of characteristics that are not covered adequately.

Over the last year or so I have been giving presentations on this Visual Resources User Model and am in the process of publishing my work. A few of the ideas follow, though.

The Patterns of Questions that people ask of an image archive: Three basic patterns emerged from my research (there are more, but they are hard to generalize). Typically people search for the following, singly or in combination:

* Specific Objects and Actions depicted in an image. That is, "I am looking for images of the Three Stooges throwing pies." The EXACT relationships of Objects and Actions to one another were less often cited, but I do think that this needs much more study. That is, it may be rarer to find a question that states: "I am looking for an image of the Three Stooges where Moe is throwing a pie at Curly with Larry reacting in the background." It seems that people do not initially think in graphic detail about what they are looking for, but just generally want to find the objects and actions and allow relationships to fall in the context of what the subject suggests.

* Compositional qualities that were the choice of the artist. That is, "I am looking for images that are in a horizontal format (that is what is needed for the magazine page, for instance), where there is a lot of open space at the top of the image (so that the graphic artist can drop text onto it or over it), where the dominant colors are red and gray (that is the theme for that article's illustrations), and which are close-ups."

Usually compositional characteristics do not stand alone, and one or more Object or Action words are also added. It has been my proposition to the image database community that in the multi-million-image databases of the future, compositional qualities will certainly be the difference that makes the difference in effective image searching. They are very important characteristics that will help the end user get to the visual point they wish to arrive at without having to scan hundreds of images. I am also keen on the idea that once we better understand compositional semantics (not syntax), there will be another leap in our ability to handle very large image databases.

* Subjective qualities that are both the choice of the artist and the meaning of the objects and actions depicted. That is, "I am looking for images that feel powerful and threatening." In the case of subjective qualities, it is very common for people to ask with only those words (typically adjectives...) and not to coordinate them with object or action words.
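
As a rough illustration only, the three patterns above could be sketched as three groups of access points on an image record, searched singly or in combination. This is not the Kodak Picture Exchange schema: every class, field, and value below (ImageRecord, Query, open_space, the sample image ids, and so on) is a hypothetical name invented for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ImageRecord:
    """Hypothetical catalog entry with the three kinds of access points."""
    image_id: str
    # Pattern 1: objects and actions depicted
    objects_actions: Set[str] = field(default_factory=set)
    # Pattern 2: compositional qualities chosen by the artist
    orientation: str = ""        # e.g. "horizontal" or "vertical"
    open_space: str = ""         # where text could be dropped, e.g. "top"
    dominant_colors: Set[str] = field(default_factory=set)
    shot: str = ""               # e.g. "close-up" or "wide"
    # Pattern 3: subjective qualities
    subjective: Set[str] = field(default_factory=set)


@dataclass
class Query:
    """Any field left empty or None is simply not part of the question."""
    objects_actions: Set[str] = field(default_factory=set)
    orientation: Optional[str] = None
    open_space: Optional[str] = None
    dominant_colors: Set[str] = field(default_factory=set)
    shot: Optional[str] = None
    subjective: Set[str] = field(default_factory=set)


def matches(record: ImageRecord, query: Query) -> bool:
    """True if the record satisfies every access point the query names."""
    if not query.objects_actions <= record.objects_actions:
        return False
    for attr in ("orientation", "open_space", "shot"):
        wanted = getattr(query, attr)
        if wanted is not None and wanted != getattr(record, attr):
            return False
    if not query.dominant_colors <= record.dominant_colors:
        return False
    return query.subjective <= record.subjective


def search(archive: List[ImageRecord], query: Query) -> List[str]:
    """Return the ids of matching images, in archive order."""
    return [r.image_id for r in archive if matches(r, query)]


# Made-up sample data, for illustration only.
archive = [
    ImageRecord("img-001", {"three stooges", "throwing pies"},
                "horizontal", "top", {"red", "gray"}, "close-up", {"comic"}),
    ImageRecord("img-002", {"eiffel tower"},
                "vertical", "none", {"blue"}, "wide",
                {"powerful", "threatening"}),
    ImageRecord("img-003", {"children", "playing baseball"},
                "horizontal", "top", {"red", "gray", "green"}, "close-up",
                {"joyful"}),
]

# A compositional question coordinated with one object word.
layout_query = Query(objects_actions={"children"}, orientation="horizontal",
                     open_space="top", dominant_colors={"red", "gray"},
                     shot="close-up")
print(search(archive, layout_query))
```

A subjective question such as Query(subjective={"powerful", "threatening"}) uses the same search function with no object or action words at all. The sketch matches on exact words only, and it deliberately leaves out the relationships between objects and actions, which need more study.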

There is a lot more to say here, but I will conclude by noting that these are my observations about commercial photography, which encompasses the common everyday objects of the world around us. These images are varied: children playing baseball, shar pei dogs, the Eiffel Tower, Mt Fuji, Isaac Newton, to name a minuscule fraction of the subjects that are included in the data set of the Kodak Picture Exchange. How this Visual Resources User Model might incorporate the issues for images that are important cultural properties, I would hope to discover in the future.

There are so many semantic issues tied to the need for real, interoperable data structures that at times image description and database definition seem to be in conflict. But I am a firm believer that non-textual materials need to be approached with the literacies that are appropriate to those media before we can hope to arrive at appropriate databases.

Donna Romer
Senior Image Database Engineer
Eastman Kodak Company
