Metadata Encoding and Transmission Standard

From MPDLMediaWiki
Revision as of 16:10, 16 July 2009 by Inga (talk | contribs)


"The METS schema is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language of the World Wide Web Consortium. The standard is maintained in the Network Development and MARC Standards Office of the Library of Congress, and is being developed as an initiative of the Digital Library Federation."[1]

"METS is intended to provide a standardized XML format for transmission of complex digital library objects between systems".[2] One METS file corresponds to one digital object (i.e. one digitized work) and provides separate sections for descriptive metadata, administrative metadata, structural metadata, files and behaviors. The structural parts are directly defined by the METS standard, while the other sections incorporate "extension schemas", e.g. MARC/Dublin Core for descriptive metadata or MIX for technical metadata. METS is very powerful for grouping together various digital items into one research object, e.g. to combine scans and TEI transcription of one work.

METS is highly flexible and allows multiple representations of the same digital object. In particular, METS does not restrict which metadata schemas are used (it only names a set of supported "extension schemas"), and the structural maps can be organized in multiple ways. Therefore, the standard by itself does not guarantee interoperability. METS profiles may reduce this problem to a certain extent.


METS structure

An example METS XML document is available from the Fedora homepage[3], and a METS structure diagram is provided as well.[4]

Header (metsHdr)

Information about the METS document itself, e.g. various time stamps and the institutions and/or individuals (agent) involved in creating the package:

<METS:metsHdr ID="BOOK1" CREATEDATE="2007-05-20T06:32:00" LASTMODDATE="2007-05-22T06:32:00" RECORDSTATUS="A">
  <METS:agent ROLE="CREATOR" TYPE="ORGANIZATION">
    <METS:name>Max Planck Institute for History of European Law</METS:name>
  </METS:agent> 
</METS:metsHdr>
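
As a rough illustration of how such a header might be read, the following sketch parses a metsHdr with Python's standard xml.etree.ElementTree. The wrapping METS:mets root element, the function name read_header and the returned dictionary layout are assumptions made for this example; only the namespace URI http://www.loc.gov/METS/ comes from the standard.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the METS schema; the prefix "METS" mirrors the examples on this page.
NS = {"METS": "http://www.loc.gov/METS/"}

# Sample data: the header from above, wrapped in an assumed METS:mets root element.
SAMPLE_HEADER = """\
<METS:mets xmlns:METS="http://www.loc.gov/METS/">
  <METS:metsHdr ID="BOOK1" CREATEDATE="2007-05-20T06:32:00"
                LASTMODDATE="2007-05-22T06:32:00" RECORDSTATUS="A">
    <METS:agent ROLE="CREATOR" TYPE="ORGANIZATION">
      <METS:name>Max Planck Institute for History of European Law</METS:name>
    </METS:agent>
  </METS:metsHdr>
</METS:mets>
"""


def read_header(mets_xml):
    """Return time stamps and agents from the metsHdr (hypothetical helper)."""
    root = ET.fromstring(mets_xml)
    hdr = root.find("METS:metsHdr", NS)
    if hdr is None:
        return None  # metsHdr is optional in a METS document
    agents = [
        {
            "role": agent.get("ROLE"),
            "type": agent.get("TYPE"),
            "name": agent.findtext("METS:name", namespaces=NS),
        }
        for agent in hdr.findall("METS:agent", NS)
    ]
    return {
        "created": hdr.get("CREATEDATE"),
        "modified": hdr.get("LASTMODDATE"),
        "agents": agents,
    }


header = read_header(SAMPLE_HEADER)
```

Here header["agents"] holds one entry per METS:agent element, so a package created by several institutions would simply yield a longer list.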

Descriptive Metadata (dmdSec)

One or several records describing the work - in any metadata format. Descriptive metadata might be embedded in the METS object (mdWrap) or stored externally and pointed to (mdRef).

<METS:dmdSec ID="DMD1">
  <METS:mdRef LOCTYPE="URL" MIMETYPE="application/xml" MDTYPE="OTHER"
  LABEL="MAB record"/>
</METS:dmdSec>
<METS:dmdSec ID="DMD2">
  <METS:mdWrap MIMETYPE="application/mab" MDTYPE="OTHER" LABEL="MAB Record">
    <METS:binData>0471nM2.01010024      h001 66230�002 19941207000000.0�003 20070608000000.0�030 zz5d||rz||||7�050 ||||||||||||||�051 n||||||�077 �c0�100 Oertel, Christian Gottfried�331 VollstÉandiges corpus gravaminum evangelicorum�359 An das Licht gestellet von Christian Gottfried Oertel�410aRegensburg�412aNeubauer�501 Erschienen: 1 (1771) - [8] (1775). -Bd. [8] im Verl. Montag, Regensburg, erschienen�710 Corpus Evangelicorum / Gravamen�902   |Corpus Evangelicorum�902   |Gravamen�� 
    </METS:binData>
  </METS:mdWrap>
</METS:dmdSec>
<METS:dmdSec ID="DMD3">
  <METS:mdWrap MIMETYPE="text/xml" MDTYPE="DC" LABEL="Dublin Core Metadata">
    <METS:xmlData>
      <dc:title>Vollständiges corpus gravaminum evangelicorum</dc:title>
      <dc:creator>Oertel, Christian Gottfried</dc:creator>
      <dc:date>1 (1771) - [8] (1775)</dc:date>
      <dc:publisher>Montag, Regensburg</dc:publisher>
      <dc:type>text</dc:type>
    </METS:xmlData>
  </METS:mdWrap>
</METS:dmdSec>
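
To show how embedded descriptive metadata could be consumed, here is a minimal sketch that collects the Dublin Core fields of every dmdSec whose mdWrap declares MDTYPE="DC". The function name dublin_core_records and the root element of the sample are assumptions for this example; it also assumes that the METS elements carry the METS namespace.

```python
import xml.etree.ElementTree as ET

NS = {
    "METS": "http://www.loc.gov/METS/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Sample data: a shortened version of the DMD3 record, with an assumed root element.
SAMPLE_DMD = """\
<METS:mets xmlns:METS="http://www.loc.gov/METS/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <METS:dmdSec ID="DMD3">
    <METS:mdWrap MIMETYPE="text/xml" MDTYPE="DC" LABEL="Dublin Core Metadata">
      <METS:xmlData>
        <dc:title>Vollständiges corpus gravaminum evangelicorum</dc:title>
        <dc:creator>Oertel, Christian Gottfried</dc:creator>
        <dc:type>text</dc:type>
      </METS:xmlData>
    </METS:mdWrap>
  </METS:dmdSec>
</METS:mets>
"""


def dublin_core_records(mets_xml):
    """Map each dmdSec ID to its Dublin Core fields (hypothetical helper)."""
    root = ET.fromstring(mets_xml)
    records = {}
    for sec in root.findall("METS:dmdSec", NS):
        wrap = sec.find("METS:mdWrap[@MDTYPE='DC']", NS)
        if wrap is None:
            continue  # referenced (mdRef) or non-DC metadata is skipped
        data = wrap.find("METS:xmlData", NS)
        if data is None:
            continue  # binData payloads are not handled in this sketch
        fields = {}
        for element in data:
            # ElementTree tags look like "{namespace}local"; keep the local part.
            local_name = element.tag.rpartition("}")[2]
            fields.setdefault(local_name, []).append(element.text)
        records[sec.get("ID")] = fields
    return records


records = dublin_core_records(SAMPLE_DMD)
```

Each field maps to a list because Dublin Core elements are repeatable, e.g. a work with two creators.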

Administrative Metadata (amdSec)

A collection of administrative metadata available for a METS document and/or its components. This can be:

  1. technical metadata (techMD): information regarding the file, e.g. compression, bit depth, etc.
  2. IPR metadata (rightsMD): copyright and/or license statement
  3. source metadata (sourceMD): "descriptive and administrative metadata regarding the analog source from which a digital library object derives".[5]
  4. digital provenance metadata (digiprovMD): "information regarding source/destination relationships between files".[5]

Again, the information can be embedded (mdWrap) or just pointed to (mdRef).

<METS:amdSec>
  <METS:techMD ID="TMD1">
    <METS:mdWrap MDTYPE="OTHER" MIMETYPE="text/xml" OTHERMDTYPE="TECHMD">
      <METS:xmlData>
        <techmd:compression NAME="LZW"/>
        <techmd:image>
          <techmd:bitDepth>24</techmd:bitDepth>
          <techmd:storage PLANARCONFIGURATION="UNKNOWN" SEGMENT="STRIP"/>
          [...]
        </techmd:image>
      </METS:xmlData>
    </METS:mdWrap>
  </METS:techMD>
  <METS:rightsMD ID="RMD1">
    <METS:mdWrap MDTYPE="OTHER" MIMETYPE="text/xml" OTHERMDTYPE="RIGHTSMD">
      <METS:xmlData>
        <rightsmd:versionStatement>Copyright by MPIeR</rightsmd:versionStatement>
      </METS:xmlData>
    </METS:mdWrap>
  </METS:rightsMD>
  <METS:rightsMD ID="ADMRTS1"> 
    <METS:mdWrap MDTYPE="OTHER" OTHERMDTYPE="METSRights">
      <METS:xmlData> 
        <rts:RightsDeclarationMD RIGHTSCATEGORY="PUBLIC DOMAIN"> 
        […]
        </rts:RightsDeclarationMD> 
      </METS:xmlData> 
    </METS:mdWrap> 
  </METS:rightsMD> 
</METS:amdSec> 

File List (fileSec)

The file list is the inventory of all files that make up the digital object. This section is not repeatable, so each file is listed once and then referenced from the structural map. The inventory arranges the files into groups (fileGrp), which may be nested to represent the hierarchy of the document. Every file element (file) may optionally reference descriptive metadata (DMDID) as well as administrative metadata (ADMID).

The content data streams may be referenced (xlink:href) or embedded in the METS document.

<METS:fileSec>
  <METS:fileGrp ID="DATASTREAMS">
    <METS:fileGrp ID="DS1" USE="MASTER IMAGE">
      <METS:file ID="DS1.0" CREATED="2007-05-20T06:32:00" MIMETYPE="image/tiff" SIZE="8238866"
        ADMID="TMD1 RMD1" DMDID="DMD1" OWNERID="E">
        <METS:FLocat LOCTYPE="URL" xlink:href="http://www.escidoc.mpg.de/virr/12433.tiff"/>
      </METS:file>
      [...]
    </METS:fileGrp>
    <METS:fileGrp ID="DS2" USE="text/tei">
      <METS:file ID="DS2.0" CREATED="2007-10-20T06:32:00" MIMETYPE="text/xml" SIZE="7343"
        ADMID="RMD1" DMDID="DMD1" OWNERID="X">
         <METS:FLocat LOCTYPE="URL" xlink:href="http://www.escidoc.mpg.de/virrbeame.xml"/>
      </METS:file>
      [...]
    </METS:fileGrp>
  </METS:fileGrp>
</METS:fileSec>
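
The references from a file to its administrative metadata can be followed programmatically. The sketch below lists every file in nested file groups and resolves each ID in ADMID against the amdSec. The sample is a reduced version of the examples above with empty techMD and rightsMD elements; the function name list_files and the result layout are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
NS = {"METS": "http://www.loc.gov/METS/"}

# Sample data: reduced amdSec and fileSec with an assumed root element.
SAMPLE_FILES = """\
<METS:mets xmlns:METS="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <METS:amdSec>
    <METS:techMD ID="TMD1"/>
    <METS:rightsMD ID="RMD1"/>
  </METS:amdSec>
  <METS:fileSec>
    <METS:fileGrp ID="DATASTREAMS">
      <METS:fileGrp ID="DS1" USE="MASTER IMAGE">
        <METS:file ID="DS1.0" MIMETYPE="image/tiff" ADMID="TMD1 RMD1">
          <METS:FLocat LOCTYPE="URL" xlink:href="http://www.escidoc.mpg.de/virr/12433.tiff"/>
        </METS:file>
      </METS:fileGrp>
      <METS:fileGrp ID="DS2" USE="text/tei">
        <METS:file ID="DS2.0" MIMETYPE="text/xml" ADMID="RMD1">
          <METS:FLocat LOCTYPE="URL" xlink:href="http://www.escidoc.mpg.de/virrbeame.xml"/>
        </METS:file>
      </METS:fileGrp>
    </METS:fileGrp>
  </METS:fileSec>
</METS:mets>
"""


def list_files(mets_xml):
    """List files with their location and resolved ADMID targets (hypothetical helper)."""
    root = ET.fromstring(mets_xml)
    # Index the direct children of every amdSec (techMD, rightsMD, ...) by ID.
    admin_index = {}
    for amd in root.findall("METS:amdSec", NS):
        for section in amd:
            if section.get("ID"):
                admin_index[section.get("ID")] = section.tag.rpartition("}")[2]
    files = []
    # iter() also reaches nested fileGrp elements; each file has exactly one parent group.
    for group in root.iter(METS + "fileGrp"):
        for file_el in group.findall("METS:file", NS):
            flocat = file_el.find("METS:FLocat", NS)
            files.append({
                "id": file_el.get("ID"),
                "use": group.get("USE"),
                "href": flocat.get(XLINK_HREF) if flocat is not None else None,
                # ADMID is a whitespace-separated list of IDs; unknown IDs map to None.
                "admin": {ref: admin_index.get(ref)
                          for ref in file_el.get("ADMID", "").split()},
            })
    return files


files = list_files(SAMPLE_FILES)
```

An ADMID entry that resolves to None would point to a missing metadata section, which is a cheap consistency check before a package is transmitted.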

Structural Map (structMap)

A representation of the complete object modeled as tree structure. "The structural map is the heart of a METS document, defining the hierarchical arrangement of a primary source document which has been digitized. This hierarchy is encoded as a tree of div elements. Any given div can point to another METS document via the mptr element, or to a single file, to a group of files, or to segments of individual files or groups of files through the fptr and subsidiary elements."[2]

Maps may focus on the physical composition of the digitized work (e.g. book->pages) or the intellectual structure of the work (e.g. book->table of contents->chapter->references, etc.). The TYPE attribute of the structMap element specifies which kind of structural map is provided.

The div element supports parallel numbering via the ORDER, ORDERLABEL, and LABEL attributes. "[...] imagine a text with 10 roman numbered pages followed by 10 arabic numbered pages. Page iii would have an ORDER of '3', an ORDERLABEL of 'iii' and a LABEL of 'Page iii', while page 3 would have an ORDER of '13', an ORDERLABEL of '3' and a LABEL of 'Page 3'".[2]

<METS:structMap TYPE="physical">
  <METS:div TYPE="multiVolume" LABEL="Vollstaendiges corpus gravaminum evangelicorum">
    <METS:div TYPE="book" LABEL="Vollstaendiges corpus gravaminum evangelicorum, Band 1" ORDERLABEL="Band 1" ORDER="1">
      <METS:div TYPE="page" LABEL="Blank page" ORDER="1"></METS:div>
      <METS:div TYPE="page" LABEL="Page i: Half title page" ORDERLABEL="i" ORDER="2">
        <METS:fptr FILEID="DS1.0"/>
        <METS:fptr FILEID="DS2.0"/>
      </METS:div>
      <METS:div TYPE="page" LABEL="Page ii: Blank page" ORDERLABEL="ii" ORDER="3"></METS:div>
      [...]
    </METS:div>
  </METS:div>
</METS:structMap>
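
The parallel numbering from the quoted example (10 roman numbered pages followed by 10 arabic numbered pages) can be reproduced with a few lines of code. This is only a sketch: the helper names to_roman and build_pages, the TYPE values and the "Page ..." label pattern are choices made for this illustration, not something METS prescribes.

```python
import xml.etree.ElementTree as ET

METS_URI = "http://www.loc.gov/METS/"
ET.register_namespace("METS", METS_URI)  # serialize with the METS: prefix


def to_roman(number):
    """Lower-case roman numeral; sufficient for small page counts (1..39)."""
    result = ""
    for value, symbol in [(10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]:
        while number >= value:
            result += symbol
            number -= value
    return result


def build_pages(roman_pages, arabic_pages):
    """Build a book div whose pages carry ORDER, ORDERLABEL and LABEL."""
    book = ET.Element(f"{{{METS_URI}}}div", {"TYPE": "book"})
    labels = [to_roman(i) for i in range(1, roman_pages + 1)]
    labels += [str(i) for i in range(1, arabic_pages + 1)]
    # ORDER counts continuously, ORDERLABEL keeps the numbering printed on the page.
    for order, label in enumerate(labels, start=1):
        ET.SubElement(book, f"{{{METS_URI}}}div", {
            "TYPE": "page",
            "ORDER": str(order),
            "ORDERLABEL": label,
            "LABEL": f"Page {label}",
        })
    return book


book = build_pages(10, 10)
pages = list(book)
xml_text = ET.tostring(book, encoding="unicode")
```

ORDER gives software an unambiguous sequence, while ORDERLABEL and LABEL preserve what a reader sees on the page.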

One digitized work may have several structural maps, e.g. one to describe the physical and one to describe the logical structure. The METS profile used by the DFG Viewer[6] describes both structures independently of each other and connects them via entries in the structLink section:

<METS:structMap TYPE="LOGICAL">
  <METS:div ID="log0000" TYPE="Multivolume" LABEL="Vollstaendiges corpus gravaminum evangelicorum">
    <METS:div ID="log0001" TYPE="Book" LABEL="Vollstaendiges corpus gravaminum evangelicorum, Band 1">
      <METS:div ID="log0002" TYPE="Section" LABEL="Blank pages"/>
      <METS:div ID="log0003" TYPE="Chapter" LABEL="Kapitel 1">
        <METS:div ID="log0004" TYPE="Chapter"/>
      </METS:div>
    </METS:div>
  </METS:div>
</METS:structMap>
<METS:structMap TYPE="PHYSICAL">
  <METS:div ID="phys0000" TYPE="physSequence">
    <METS:div ID="phys0001" TYPE="page" ORDER="1" ORDERLABEL="I">
      <METS:fptr FILEID="DS1.0"/>
    </METS:div>
    <METS:div ID="phys0002" TYPE="page" ORDER="2" ORDERLABEL="II">
      <METS:fptr FILEID="DS2.0"/>
    </METS:div>
    <METS:div ID="phys0003" TYPE="page" ORDER="3" ORDERLABEL="Seite 1">
      <METS:fptr FILEID="DS2.0"/>
    </METS:div>
    <METS:div ID="phys0004" TYPE="page" ORDER="4" ORDERLABEL="Seite 2">
      <METS:fptr FILEID="DS2.0"/>
    </METS:div>
  </METS:div>
</METS:structMap>
<METS:structLink>
  <METS:smLink xlink:from="log0001" xlink:to="phys0000"/>
  <METS:smLink xlink:from="log0002" xlink:to="phys0001"/>
  <METS:smLink xlink:from="log0003" xlink:to="phys0003"/>
  <METS:smLink xlink:from="log0004" xlink:to="phys0004"/>
</METS:structLink>
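
Because smLink entries refer to div elements purely by ID, a mistyped ID silently breaks the connection between the logical and the physical map. The following sketch resolves the links of a reduced sample and reports those that point to unknown IDs; the function name resolve_links and the shape of its return value are assumptions for this example.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"
XLINK = "{http://www.w3.org/1999/xlink}"
NS = {"METS": "http://www.loc.gov/METS/"}

# Sample data: reduced logical and physical maps with an assumed root element.
SAMPLE_LINKS = """\
<METS:mets xmlns:METS="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <METS:structMap TYPE="LOGICAL">
    <METS:div ID="log0001" TYPE="Book">
      <METS:div ID="log0003" TYPE="Chapter" LABEL="Kapitel 1"/>
    </METS:div>
  </METS:structMap>
  <METS:structMap TYPE="PHYSICAL">
    <METS:div ID="phys0000" TYPE="physSequence">
      <METS:div ID="phys0003" TYPE="page" ORDER="3" ORDERLABEL="Seite 1"/>
    </METS:div>
  </METS:structMap>
  <METS:structLink>
    <METS:smLink xlink:from="log0001" xlink:to="phys0000"/>
    <METS:smLink xlink:from="log0003" xlink:to="phys0003"/>
  </METS:structLink>
</METS:mets>
"""


def resolve_links(mets_xml):
    """Split smLink entries into resolvable and dangling (from, to) pairs."""
    root = ET.fromstring(mets_xml)
    known_ids = {div.get("ID") for div in root.iter(METS + "div") if div.get("ID")}
    resolved, dangling = [], []
    for link in root.findall("METS:structLink/METS:smLink", NS):
        pair = (link.get(XLINK + "from"), link.get(XLINK + "to"))
        # A link is usable only if both ends exist as div IDs in some structMap.
        if set(pair) <= known_ids:
            resolved.append(pair)
        else:
            dangling.append(pair)
    return resolved, dangling


resolved, dangling = resolve_links(SAMPLE_LINKS)
```

Running the same check on a document whose links say "log_0001" while the div is called "log0001" would place that pair in the dangling list.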

Structural Link (structLink)

"The Structural Links section of METS allows METS creators to record the existence of hyperlinks between nodes in the hierarchy outlined in the Structural Map. This is of particular value in using METS to archive Websites."[5]

The METS profile of the DFG Viewer[6] uses the structural link section to connect the divisions of different structural maps to each other (see above).

Behaviors

"A behavior section can be used to associate executable behaviors with content in the METS object."[5]


Tools for METS generation

An overview of METS tools is available on the METS homepage.

  • MEX - Tools provided by the German Federal Archives (Bundesarchiv), implemented as additions to Eclipse
  • 7Train - "an XSLT 2.0 tool for generating METS files from XML input. It builds the basic METS structure so that the user can worry about what is specific to the user's project. Includes examples for generating METS from OAI and CONTENTdm records."
    The tool only converts one XML format into another, so it is probably of little help for the MPIeR.
  • METS Java Toolkit - "for the procedural construction, validation, and marshalling and unmarshalling for METS"
  • OpenMIC: METS-based Bibliographic Utility - "an open source, web-based cataloging tool that can be used as a standalone application or integrated with other repository architectures by a wide range of organizations. It provides a complete metadata creation system for analog and digital materials, with services to export these metadata in standard formats."

METS profiles

"METS Profiles are intended to describe a class of METS documents in sufficient detail to provide both document authors and programmers the guidance they require to create and process METS documents conforming with a particular profile."[7]

METS profiles define which extension schemas are used, lay down rules of description and specify technical characteristics. They allow implementers to narrow the flexibility of METS down to the constraints they want to support.
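
To make the idea of such constraints concrete, the sketch below checks a METS document against two invented rules: only certain MDTYPE values are allowed, and certain structMap types must be present. This is not the METS profile format itself (registered profiles are descriptive XML documents); the function check_profile, its parameters and the sample rules are purely hypothetical.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"
NS = {"METS": "http://www.loc.gov/METS/"}

# Sample data: one DC record, one record of type OTHER and a single physical map.
SAMPLE_DOC = """\
<METS:mets xmlns:METS="http://www.loc.gov/METS/">
  <METS:dmdSec ID="DMD2">
    <METS:mdWrap MDTYPE="OTHER" LABEL="MAB Record"/>
  </METS:dmdSec>
  <METS:dmdSec ID="DMD3">
    <METS:mdWrap MDTYPE="DC" LABEL="Dublin Core Metadata"/>
  </METS:dmdSec>
  <METS:structMap TYPE="PHYSICAL">
    <METS:div TYPE="physSequence"/>
  </METS:structMap>
</METS:mets>
"""


def check_profile(mets_xml, allowed_mdtypes, required_structmap_types):
    """Return a list of human-readable violations of the (invented) rules."""
    root = ET.fromstring(mets_xml)
    problems = []
    # Rule 1: embedded or referenced metadata must use an allowed MDTYPE.
    for element in root.iter():
        if element.tag in (METS + "mdWrap", METS + "mdRef"):
            mdtype = element.get("MDTYPE")
            if mdtype not in allowed_mdtypes:
                problems.append(f"MDTYPE {mdtype!r} is not allowed")
    # Rule 2: every required structMap TYPE must occur at least once.
    present = {sm.get("TYPE") for sm in root.findall("METS:structMap", NS)}
    for required in sorted(required_structmap_types):
        if required not in present:
            problems.append(f"structMap TYPE {required!r} is missing")
    return problems


problems = check_profile(SAMPLE_DOC, {"DC", "MODS"}, {"LOGICAL", "PHYSICAL"})
```

A document that satisfies both rules yields an empty list, which is how an implementer could decide whether an incoming package falls within the subset of METS they support.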


References

  1. http://www.loc.gov/standards/mets - the official METS homepage
  2. http://www.loc.gov/standards/mets/mets.xsd
  3. http://www.fedora.info/documents/obj-image-userinput-archdraw.xml - METS example
  4. http://sunsite.berkeley.edu/mets/diagram/ - METS structure diagram
  5. http://www.loc.gov/standards/mets/METSOverview.v2.html
  6. http://dfg-viewer.de/profil-der-metadaten - a METS/MODS profile was developed as part of the "DFG-Viewer" project; this page links to documentation and examples
  7. http://www.loc.gov/standards/mets/mets-profiles.html - METS Profiles, also includes a list of registered profiles

METS examples

Further documents