Metadata Engine - Newsletter now available

From: Simon Tanner (S.G.Tanner@HERTS.AC.UK)
Date: Mon Feb 26 2001 - 04:06:02 CST

    Message-Id: <200102261013.DAA26352@dns.ccit.arizona.edu>
    Date:         Mon, 26 Feb 2001 10:06:02 +0000
    From: Simon Tanner <S.G.Tanner@HERTS.AC.UK>
    Subject:      Metadata Engine - Newsletter now available
    To: IMAGELIB@LISTSERV.ARIZONA.EDU
    

    <pre>
    *** Apologies for cross-postings ***

    The Metadata Engine Project (METAe) - Newsletter now available.

    The first issue of the METAe Newsletter is now available from: http://meta-e.uibk.ac.at/newsletter/news.htm
    (for an introduction to METAe see the end of this email)

    In this first issue we introduce our project and report on progress to date. Our next issue, due out in April 2001, will have more detail. The METAe homepage (http://meta-e.uibk.ac.at/) has further information, and the METAe team welcome contact at any time. The METAe Project is funded under the European Union IST Programme.

    In this issue, Günter Mühlberger, from the Project Co-ordination team at the University of Innsbruck, explains the genesis of the idea that led to the Metadata Engine Project. The influence of the METAe project is also already being felt internationally, and Alexander Eggar explains why METAe have been invited to attend the next MOA2 DTD meeting in New York.

    We also introduce the 14 partners that make up the Metadata Engine project. In each future issue, two partners will showcase their expertise and involvement in METAe, giving readers an opportunity to find out more about our partners' backgrounds.

    We will endeavour to keep you up to date with the progress of the METAe project and to give details of forthcoming events that METAe organises or at which it will be presenting. The newsletter may also include reports on meetings attended by METAe partners - as this issue does, with an article by Gerd Prasthofer on the SCHEMAS workshop held in Bonn in November 2000.

    We hope you will find this newsletter useful and informative. Any feedback can be directed to Simon Tanner, Editor of the METAe Newsletter, at s.g.tanner@herts.ac.uk

    Best regards,
            Simon Tanner
            Senior Digitisation Consultant (HEDS)
            Higher Education Digitisation Service
            Web: http://heds.herts.ac.uk

    Some further information about METAe:

    The METADATA ENGINE Project
    "Metadata" play a significant role in "digital preservation". Firstly, in conjunction with emerging standards (such as XML, EAD, Dublin Core or RDF), they are among the most promising ways to keep digital material "alive" over the years and decades. Secondly, metadata are needed for all kinds of resource discovery, i.e. for using and accessing digital collections in a user-friendly way. The METADATA ENGINE project picks up these considerations and will develop software modules that automate metadata capture by introducing layout and document analysis as a key technology for digitisation software. METAe will dramatically enhance the quality of creating and maintaining digital collections of printed material such as books and journals.

    Objectives
    The METAe project will address the need for automated generation of metadata during the conversion of printed documents, and thus make large-scale digitisation of printed material, such as books and journals, more reliable in terms of digital preservation, more cost-effective in terms of automation, and more user-oriented in terms of future applications. To achieve these aims, the METADATA ENGINE project will
    (1) introduce layout and document analysis to be employed as a key technology in future digitisation software,
    (2) develop capturing and conversion tools for the automated recording and generation of administrative and descriptive metadata,
    (3) develop an omnifont OCR-engine specialising in processing old European typefaces of the 19th century,
    (4) adhere strictly to emerging standards in the fields of digital preservation and resource description, such as XML, EAD, TEI, or ISO 12083,
    (5) develop an XML search engine capable of retrieving the tagged full text and the images.

    Description of work
    The METAe project will develop a software package that extensively automates and improves the generation of metadata by applying new technologies for character, layout and document recognition, and that converts the captured information into XML documents. These XML files will serve as a basis for a variety of applications, such as new XML search engines, navigation tools, electronic books, audio books, or the automated production of HTML, XHTML, PDF or PS files. The METAe package consists of
    (1) an input module for scanning printed material and importing existing bibliographic metadata,
    (2) an omnifont character recognition module (OCR-engine) specialising in typefaces of the 19th century,
    (3) a document analysis module capable of classifying pages according to their physical and logical structure (items such as title pages, table of contents pages, etc., will be recognised automatically),
    (4) a page layout analysis module capable of analysing and segmenting page elements such as page numbers, headings, captions, footnotes, pictures, highlighted phrases, or graphical separators,
    (5) a knowledge base providing a controlled vocabulary and rules for the recognition process (for example, the table of contents is, in most cases, headed "contents"),
    (6) a conversion module assembling an XML document containing all recognised metadata, and
    (7) an export module for the XML-enriched document and the scanned image.

    The XML documents will be generated according to emerging standards for digital preservation and the electronic interchange of information such as RDF, DC, EAD, TEI, or ISO 12083. In order to introduce a wide public to the new features of accessing and browsing images and XML-marked full texts, a METAe search engine and web application will be developed as well.
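
    As a rough illustration of what the conversion module (6) might do, the Python sketch below assembles an XML document from page elements that have already been recognised. It is not METAe code: the element names ("document", "page", "element"), the PageElement type and the build_document and to_xml functions are all hypothetical, and a real implementation would presumably target one of the standards named above, such as TEI or EAD.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class PageElement:
    """One recognised region on a scanned page (hypothetical structure)."""
    kind: str  # e.g. "heading", "footnote", "page-number"
    text: str  # text returned by the character recognition step


def build_document(title: str, pages: List[List[PageElement]]) -> ET.Element:
    """Assemble an illustrative XML tree; the schema here is an assumption."""
    root = ET.Element("document")
    ET.SubElement(root, "title").text = title
    # Pages are numbered from 1 in the order they were scanned.
    for number, elements in enumerate(pages, start=1):
        page = ET.SubElement(root, "page", n=str(number))
        for item in elements:
            ET.SubElement(page, "element", type=item.kind).text = item.text
    return root


def to_xml(root: ET.Element) -> str:
    """Serialise the tree to a Unicode string."""
    return ET.tostring(root, encoding="unicode")
```

    Calling build_document("Sample", [[PageElement("heading", "Contents")]]) would give a tree with one page holding one heading element; to_xml turns that tree into text that an export step could write alongside the scanned image.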
    ============================================================
    Simon Tanner
    Senior Digitisation Consultant (HEDS)
    Higher Education Digitisation Service
    University of Hertfordshire
    Phone: +44 (0) 1707 286078
    Fax: +44 (0) 1707 286079
    Web: http://heds.herts.ac.uk
    METAe Project: http://meta-e.uibk.ac.at/

    </pre>


