Digitization Lifecycle METS Profile

MPDL, Digitization Lifecycle

METS_Profile: @xsi:schemaLocation="http://www.loc.gov/METS_Profile/

http://www.loc.gov/standards/mets/profile_docs/mets.profile.v1-2.xsd

http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd

http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-3.xsd

URL:http://dlc.mpdl.mpg.de/profiles/dlc_mets_profile.xml

title:

MPDL DLC General METS Profile

abstract:

The Max Planck Digital Library (MPDL) uses this profile to govern METS instance documents intended for submission to the Digitization Lifecycle (DLC) repository as well as METS documents intended for reference use and dissemination to end users through the DLC web interface. The digital content governed by the METS documents conforming to this profile may be of any type or combination of types including, but not limited to: images, TEI-based structured text, OCR-based unstructured text, PDF documents, QuickTime video, and RealAudio audio.

date:

2011-04-04T00:00:00

contact:


 * name:
 * Wilhelm Frank


 * address:
 * Max Planck Digital Library, Amalienstr. 33, 80799 München


 * phone:
 * (089) 38602-201


 * email:
 * frank@mpdl.mpg.de

related_profile: @RELATIONSHIP="supersedes" @URI="" Previous version of this profile

extension_schema:


 * name:
 * MODS


 * context:
 * mets/dmdSec/mdWrap/xmlData


 * note:
 * Used for bibliographic metadata (i.e. derived from MAB or MARC)

extension_schema:


 * name:
 * TEI_HEADER


 * context:
 * mets/dmdSec/mdWrap/xmlData or mets/dmdSec/mdRef


 * note:
 * Used for bibliographic or any other kind of metadata. This schema will be defined by the DLC data format working group.

extension_schema:


 * name:
 * OTHER


 * context:
 * mets/dmdSec/mdWrap/xmlData or mets/dmdSec/mdRef


 * note:
 * Used for additionally required metadata. This schema can be implemented on demand.

extension_schema:


 * name:
 * METSRights


 * context:
 * mets/amdSec/rightsMD/mdWrap/xmlData


 * note:
 * METSRights

extension_schema:


 * name:
 * PREMIS


 * context:
 * This schema may be used in either or both of the following contexts: mets/amdSec/sourceMD/mdWrap/xmlData and/or mets/amdSec/digiprovMD/mdWrap/xmlData


 * note:
 * This profile neither requires nor constrains the use of the PREMIS schemas in the METS documents that implement it.

description_rules:


 * All applications of the MODS schema in conforming METS documents should follow the MODS User Guidelines published by the Library of Congress Network Development and MARC Standards Office (http://www.loc.gov/standards/mods/v3/mods-userguide.html). In addition, all conforming objects destined for the MPDL Digitization Lifecycle project should follow the provisions for descriptive metadata: http://colab.mpdl.mpg.de/mediawiki/Mapping_MAB_to_MODS.


 * All applications of the TEI_HEADER schema in conforming METS documents should follow guidelines published by the DLC data format working group.


 * All applications of the OTHER schema in conforming METS documents should follow guidelines published by ...

controlled_vocabularies:


 * vocabulary:


 * name:
 * MPDL DLC METS TYPE attribute values


 * maintenance_agency:
 * Max Planck Digital Library, Munich




 * context:
 * mets/@TYPE


 * description:
 * This profile specifies no required vocabulary for the TYPE attribute value on the root &lt;mets&gt; element, nor does it require that this attribute appear. However, the MPDL DLC project will maintain a list of values used by its presentation programs to control the default display format for the objects it ingests into the DLC repository.


 * vocabulary:


 * name:
 * MPDL DLC METS &lt;file&gt;/&lt;fileGrp&gt; USE Attribute Values


 * maintenance_agency:
 * Max Planck Digital Library, Munich


 * values:


 * value:
 * image/master


 * value:
 * image/reference


 * value:
 * image/thumbnail


 * value:
 * text/tei


 * value:
 * text/tei element


 * value:
 * text/ocr


 * value:
 * text/reference


 * value:
 * application


 * value:
 * video/master


 * value:
 * video/reference


 * value:
 * audio/master


 * value:
 * audio/reference


 * context:
 * mets/fileSec/fileGrp/@USE
 * mets/fileSec/fileGrp/file/@USE


 * description:
 * These are the supported values for &lt;file&gt; and &lt;fileGrp&gt; USE attributes in METS documents implementing this profile. They are intended mainly to help presentation programs know how they should treat particular content files. In particular, they should help a presentation program distinguish between content files that are intended for reference use and those that served as masters but are not intended for reference purposes. They can also provide a presentation program with information that helps it or the end user decide which content file is most appropriate for a particular purpose when this is not apparent from the content file's MIME type. For example, the USE attribute can help identify TEI-encoded files and distinguish them from other structured text files. It also allows presentation programs to recognize files with a MIME type of &quot;text/plain&quot; that represent &quot;dirty&quot; OCR. The individual USE attribute values are defined below.


 * &quot;image/master&quot;, &quot;image/reference&quot; and &quot;image/thumbnail&quot; are appropriate values to describe the intended use of image content files from a presentation program's standpoint. The value &quot;image/master&quot; designates image master files (ultimate or intermediate) not intended for reference use. The appropriate USE value for any image that is intended for reference use, even if it also serves as a master image for other derivatives, would be &quot;image/reference&quot; or &quot;image/thumbnail&quot;. The value &quot;image/thumbnail&quot; designates images intended for thumbnail applications, and the more general value &quot;image/reference&quot; designates all other types of still image content files that are intended for general reference use.


 * &quot;text/tei&quot; and &quot;text/tei element&quot; are the appropriate values to describe associated structured text files encoded according to TEI rules.  &quot;text/tei element&quot; applies when a &lt;file&gt; element represents just an element within an integral TEI content file. In this case, the xlink:href value in the &lt;FLocat&gt; element would use XPointer syntax to qualify the file URL and specify the ID attribute value of the relevant element in the TEI file.  &quot;text/tei&quot; applies to all &lt;file&gt; elements that represent integral TEI content files.


 * &quot;text/ocr&quot; designates versions of the text produced by OCR technologies that have not been cleaned and hence may not be appropriate for some purposes. &quot;text/reference&quot; designates unstructured text, including clean OCR, which is appropriate for reference and presentation purposes.


 * &quot;application&quot; designates all types of non-video and non-audio content files encoded in a proprietary format requiring special browser plugins or players for presentation to the user. This is the appropriate USE attribute value for &quot;pdf&quot; and &quot;ps&quot; content files.


 * Audio content files are represented by two USE values: &quot;audio/master&quot; for audio master or archive files not intended for general reference use; and &quot;audio/reference&quot; for audio files suitable for general reference use.


 * Video content files are also represented by two USE values: &quot;video/master&quot; for video master or archive files not intended for general reference use; and &quot;video/reference&quot; for video files suitable for general reference use.
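The vocabulary above can be captured as a small lookup for presentation or ingest tooling. The following is a minimal sketch only; the names `DLC_USE_VALUES` and `is_master` are hypothetical and not part of the profile.

```python
# USE values copied from the controlled vocabulary listed above.
DLC_USE_VALUES = frozenset({
    "image/master", "image/reference", "image/thumbnail",
    "text/tei", "text/tei element", "text/ocr", "text/reference",
    "application",
    "video/master", "video/reference",
    "audio/master", "audio/reference",
})


def is_master(use: str) -> bool:
    """Return True for master/archive USE values.

    Hypothetical helper: it rejects anything outside the vocabulary and
    treats the '/master' suffix as the marker for files not intended
    for reference use.
    """
    if use not in DLC_USE_VALUES:
        raise ValueError(f"not a DLC USE value: {use!r}")
    return use.endswith("/master")
```

A presentation program could call `is_master` to skip master files when choosing what to display.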


 * vocabulary:


 * name:
 * MPDL DLC METS &lt;structMap&gt; TYPE attribute values


 * maintenance_agency:
 * Max Planck Digital Library, Munich


 * values:


 * value:
 * physical


 * value:
 * logical


 * value:
 * mixed


 * context:
 * mets/structMap/@TYPE


 * description:
 * These are the supported values for the &lt;structMap&gt; TYPE attribute in METS documents conforming to this profile.


 * &quot;physical&quot; designates a purely physical structure. For example, a book divided into page views.


 * &quot;logical&quot; designates a purely logical structure. For example, a book divided into chapters.


 * &quot;mixed&quot; designates a mixed structure. For example, a book divided into chapters, divided into page views.

structural_requirements:


 * metsRootElement:


 * requirement:
 * The root &lt;mets&gt; element must include a LABEL attribute value.


 * requirement:
 * The root &lt;mets&gt; element must include an OBJID attribute value containing a valid ID and uniquely identifying the object represented by the METS document in its owning repository.


 * requirement: @RELATEDMAT="vc1"
 * The root &lt;mets&gt; element may, but need not, include a TYPE attribute. This profile does not specify a vocabulary for the TYPE attribute.
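The root element requirements above lend themselves to a simple automated check. This is a sketch under stated assumptions: the function name `check_root` and the sample OBJID value are hypothetical, and the check does not verify that the OBJID is unique in its owning repository.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Hypothetical sample document; LABEL and OBJID values are invented.
SAMPLE_ROOT = (
    '<mets xmlns="http://www.loc.gov/METS/" '
    'LABEL="Example volume" OBJID="dlc:0001"/>'
)


def check_root(mets_xml: str) -> list:
    """Return a list of problems found on the root <mets> element."""
    root = ET.fromstring(mets_xml)
    problems = []
    if root.tag != f"{{{METS_NS}}}mets":
        problems.append("root element is not mets")
    for attr in ("LABEL", "OBJID"):
        # An attribute that is present but empty carries no value.
        if not (root.get(attr) or "").strip():
            problems.append(f"missing {attr}")
    # TYPE is optional under this profile, so it is not checked.
    return problems
```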


 * metsHdr:


 * requirement:
 * Conforming METS documents must contain a &lt;metsHdr&gt; element.


 * requirement:
 * The &lt;metsHdr&gt; element must include the CREATEDATE attribute value. It must also include the LASTMODDATE attribute value if this does not coincide with the CREATEDATE.


 * requirement:
 * The &lt;metsHdr&gt; element must include a child &lt;agent&gt; element identifying the person or institution responsible for creating the METS object.
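A check for the header requirements above might look as follows. This is an illustrative sketch; `check_mets_hdr` and the sample values are hypothetical. The LASTMODDATE rule is not checked because a document alone does not reveal whether it was modified after creation.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical sample header; the agent name is only an illustration.
SAMPLE_HDR = (
    '<mets xmlns="http://www.loc.gov/METS/">'
    '<metsHdr CREATEDATE="2011-04-04T00:00:00">'
    '<agent ROLE="CREATOR"><name>Max Planck Digital Library</name></agent>'
    '</metsHdr></mets>'
)


def check_mets_hdr(mets_xml: str) -> list:
    """Return a list of problems with the <metsHdr> element."""
    root = ET.fromstring(mets_xml)
    hdr = root.find("mets:metsHdr", NS)
    if hdr is None:
        return ["missing metsHdr"]
    problems = []
    if not hdr.get("CREATEDATE"):
        problems.append("metsHdr lacks CREATEDATE")
    if hdr.find("mets:agent", NS) is None:
        problems.append("metsHdr lacks agent")
    return problems
```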


 * dmdSec:


 * requirement:
 * Conforming METS documents must contain one or more &lt;dmdSec&gt; elements. Each &lt;dmdSec&gt; may in turn contain an &lt;mdRef&gt; or an &lt;mdWrap&gt;.


 * requirement: @RELATEDMAT="ext1"
 * This version of any descriptive metadata appearing in &lt;mdWrap&gt; elements must conform to the MODS schema.


 * requirement: @RELATEDMAT="ext2"
 * This version of any descriptive metadata appearing in &lt;mdWrap&gt; elements must conform to the TEI_HEADER schema.


 * requirement: @RELATEDMAT="ext3"
 * This version of any descriptive metadata appearing in &lt;mdWrap&gt; elements must conform to the OTHER schema.


 * amdSec:


 * requirement:
 * Conforming METS documents may, but need not, contain an &lt;amdSec&gt; element. This may but need not contain one or more &lt;techMD&gt; elements, &lt;sourceMD&gt; elements, &lt;digiprovMD&gt; elements and/or &lt;rightsMD&gt; elements.


 * requirement:
 * A conforming METS document will contain no more than one &lt;amdSec&gt; element. All &lt;techMD&gt;, &lt;sourceMD&gt;, &lt;rightsMD&gt; and &lt;digiprovMD&gt; elements must appear in this single &lt;amdSec&gt; element.


 * requirement: @RELATEDMAT="ext4"
 * If one or more &lt;rightsMD&gt; elements are present they must contain &lt;xmlData&gt; conforming to the RightsDeclarationMD (METSRights) schema.


 * requirement: @RELATEDMAT="ext5"
 * Any &lt;sourceMD&gt; or &lt;digiprovMD&gt; elements must contain &lt;xmlData&gt; conforming to PREMIS or another METS Editorial Board endorsed schema whenever such a schema exists and covers the requisite concepts.
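The structural part of the &lt;amdSec&gt; rules above (at most one &lt;amdSec&gt;, and wrapped XML inside every &lt;rightsMD&gt;) can be sketched as below. The function name `check_amd_sec` is hypothetical, and the sketch does not validate the wrapped XML against the METSRights or PREMIS schemas.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical sample; the rights payload is left empty for brevity.
SAMPLE_AMD = (
    '<mets xmlns="http://www.loc.gov/METS/"><amdSec>'
    '<rightsMD ID="R1"><mdWrap MDTYPE="OTHER"><xmlData/></mdWrap></rightsMD>'
    '</amdSec></mets>'
)


def check_amd_sec(mets_xml: str) -> list:
    """Return structural problems with the administrative metadata."""
    root = ET.fromstring(mets_xml)
    sections = root.findall("mets:amdSec", NS)
    problems = []
    if len(sections) > 1:
        problems.append(f"more than one amdSec (found {len(sections)})")
    for section in sections:
        for rights in section.findall("mets:rightsMD", NS):
            # The profile expects wrapped XML, not an mdRef, here.
            if rights.find("mets:mdWrap/mets:xmlData", NS) is None:
                problems.append(f"rightsMD {rights.get('ID')} has no xmlData")
    return problems
```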


 * fileSec:


 * requirement:
 * The &lt;fileSec&gt; of a conforming METS document must contain a parent &lt;fileGrp&gt; for each file format/use represented by the content files. For example, the &lt;fileSec&gt; of a typical METS document implementing this profile might contain one &lt;fileGrp&gt; representing TIFF master images, one &lt;fileGrp&gt; representing high resolution JPEG reference images, one &lt;fileGrp&gt; representing medium resolution JPEG reference images, one &lt;fileGrp&gt; representing GIF thumbnail images, and one &lt;fileGrp&gt; representing TEI transcriptions.  This profile does not support nested &lt;fileGrp&gt; elements.


 * requirement: @RELATEDMAT="vc2"
 * Each &lt;file&gt; represented in the &lt;fileSec&gt; must have an associated USE attribute. The USE attribute may be expressed directly at the &lt;file&gt; element level. Alternately, however, the USE attribute may be expressed in conjunction with the &lt;fileGrp&gt; that is the immediate parent of a &lt;file&gt; element; in this case it is taken to pertain to all &lt;file&gt; elements in the &lt;fileGrp&gt;. The  &lt;file&gt;/&lt;fileGrp&gt; USE attribute values must be drawn from the MPDL DLC METS &lt;file&gt;/&lt;fileGrp&gt; USE Attribute Values.


 * requirement:
 * Each &lt;file&gt; represented in the &lt;fileSec&gt; must have an associated MIMETYPE attribute. This attribute must contain the official MIME type value for the content file represented.
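Because USE may be stated on either the &lt;file&gt; or its parent &lt;fileGrp&gt;, a consumer has to resolve an effective value per file. The sketch below shows one way to do that. The name `effective_uses` is hypothetical, and letting a file-level USE take precedence over the group-level value is an assumption of this sketch, since the profile does not say what happens when both are present.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical sample fileSec with group-level and file-level USE.
SAMPLE_FILESEC = """<mets xmlns="http://www.loc.gov/METS/">
  <fileSec>
    <fileGrp USE="image/master">
      <file ID="F1" MIMETYPE="image/tiff"/>
    </fileGrp>
    <fileGrp>
      <file ID="F2" MIMETYPE="image/jpeg" USE="image/reference"/>
      <file ID="F3" MIMETYPE="image/gif"/>
    </fileGrp>
  </fileSec>
</mets>"""


def effective_uses(mets_xml: str) -> dict:
    """Map each file ID to its effective USE value, or None if absent."""
    root = ET.fromstring(mets_xml)
    result = {}
    for group in root.findall("mets:fileSec/mets:fileGrp", NS):
        for file_el in group.findall("mets:file", NS):
            # Assumption: file-level USE wins over the fileGrp value.
            result[file_el.get("ID")] = file_el.get("USE") or group.get("USE")
    return result
```

A file mapped to `None` (such as `F3` in the sample) has no associated USE and would violate the requirement above.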


 * requirement:
 * The &lt;file&gt; elements in a conforming METS document may, but need not, contain ADMID, SEQ, SIZE, CREATED, CHECKSUM, CHECKSUMTYPE, OWNERID or GROUPID attribute values.


 * requirement:
 * Any &lt;file&gt; element may reference any number of pertinent &lt;techMD&gt;, &lt;sourceMD&gt; and &lt;digiprovMD&gt; metadata elements within the &lt;amdSec&gt; via its ADMID attribute value. It should only reference ID values at the &lt;techMD&gt;, &lt;sourceMD&gt; and/or &lt;digiprovMD&gt; levels of the &lt;amdSec&gt;.


 * requirement: @RELATEDMAT="structMap5"
 * &lt;file&gt; elements should not reference the IDs of &lt;rightsMD&gt; elements in their ADMID attributes. Under this profile it is the responsibility of the &lt;div&gt; elements in the &lt;structMap&gt; to reference the &lt;rightsMD&gt; elements that pertain to the content the &lt;div&gt; elements represent.


 * requirement:
 * If the &lt;file&gt; element SEQ attribute is used, it should appear in every &lt;file&gt; element and express the ordinal number corresponding to the &lt;file&gt; element's sequence in its immediate &lt;fileGrp&gt;.


 * requirement:
 * If the &lt;file&gt; element GROUPID attribute is used, it should appear in every &lt;file&gt; element. &lt;file&gt; elements that represent different manifestations of the same content should have the same GROUPID value.
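The GROUPID convention above lets a consumer collect all manifestations of one piece of content across &lt;fileGrp&gt; elements. A minimal sketch, with the hypothetical name `manifestations` and invented sample IDs:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

METS_NS = "http://www.loc.gov/METS/"

# Hypothetical sample: one page available as master and reference image.
SAMPLE_GROUPS = """<mets xmlns="http://www.loc.gov/METS/">
  <fileSec>
    <fileGrp USE="image/master">
      <file ID="F1" GROUPID="G1" SEQ="1" MIMETYPE="image/tiff"/>
    </fileGrp>
    <fileGrp USE="image/reference">
      <file ID="F2" GROUPID="G1" SEQ="1" MIMETYPE="image/jpeg"/>
    </fileGrp>
  </fileSec>
</mets>"""


def manifestations(mets_xml: str) -> dict:
    """Group file IDs by GROUPID, in document order."""
    root = ET.fromstring(mets_xml)
    groups = defaultdict(list)
    for file_el in root.iter(f"{{{METS_NS}}}file"):
        group_id = file_el.get("GROUPID")
        if group_id is not None:
            groups[group_id].append(file_el.get("ID"))
    return dict(groups)
```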


 * requirement: @RELATEDMAT="structMap4"
 * This profile does not support the use of the DMDID attribute in &lt;file&gt; elements. Under this profile it is the responsibility of the &lt;div&gt; elements in the &lt;structMap&gt; to reference the &lt;dmdSec&gt; elements that pertain to the content the &lt;div&gt; elements represent. Content file level descriptive metadata is not supported.


 * requirement:
 * Each &lt;file&gt; element must contain an &lt;FLocat&gt; element that specifies the external location of the content file in its xlink:href attribute. The &lt;FLocat&gt; element must contain an xlink:href attribute, as well as a LOCTYPE attribute indicating the type of href being provided. It may, but need not, contain any of the other attributes defined in xlink:simpleLink as well as a CONTENTIDS attribute. No guidelines are provided, however, for the use of these attributes.


 * requirement: @RELATEDMAT="vc2"
 * In cases where the content represented by a &lt;mets&gt; document includes just a selected element or elements from an XML encoded structured text file conforming to the TEI schema, then the xlink:href attribute in the &lt;FLocat&gt; element may use XPointer syntax to isolate the relevant section (or element) of the integral TEI document. For example, the &lt;FLocat&gt; element for a TEI file might look like this in the case where &quot;Part1&quot; was the ID value associated with the relevant element of the TEI file  represented by the &lt;file&gt; element: &lt;mets:FLocat xlink:href="http://dlc.mpdl.mpg.de/khi/rara/text/dlc1234.xml#Part1" LOCTYPE="URL" /&gt;. This handling is an alternative to the more standard practice of using the &lt;area&gt; element in conjunction with an &lt;fptr&gt; element in the &lt;structMap&gt; to isolate a portion of an integral file. Note that when a &lt;file&gt; element references just an element within a TEI file as described here, the file USE attribute must be &quot;text/tei element&quot;.
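The relationship described above between the href fragment and the USE value can be sketched as follows. The function names are hypothetical, and the sketch assumes the bare-name fragment form used in the example URL. It does not parse full XPointer scheme syntax.

```python
from urllib.parse import urldefrag


def split_tei_target(href: str) -> tuple:
    """Split an FLocat href into (file URL, element ID or None)."""
    url, fragment = urldefrag(href)
    return url, fragment or None


def expected_use(href: str) -> str:
    """Return the USE value this profile expects for a TEI href.

    Assumes the href is already known to point at TEI content.
    """
    _, element_id = split_tei_target(href)
    return "text/tei element" if element_id else "text/tei"
```

For the href shown above, `split_tei_target` separates the file URL from the `Part1` ID, and `expected_use` yields `text/tei element`.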


 * requirement:
 * This profile supports the use of one and only one &lt;FLocat&gt; element in conjunction with each &lt;file&gt; element.


 * requirement:
 * This profile does not support the use of the &lt;FContent&gt;, &lt;stream&gt;, or &lt;transformFile&gt; elements.


 * requirement:
 * In the case of RealAudio content, only the actual RealAudio content files (the .rm files) should be represented in the &lt;fileSec&gt; and referenced via &lt;fptr&gt; and &lt;area&gt; elements in the &lt;structMap&gt;. This profile assumes that any necessary launch file (e.g., a .ram file) will be generated dynamically by the presentation applications.


 * structMap:


 * requirement:
 * A conforming METS document must contain one or more &lt;structMap&gt; elements. A &lt;structMap&gt; must not be empty.


 * requirement: @RELATEDMAT="vc3"
 * A conforming &lt;structMap&gt; may, but need not, contain a TYPE attribute.


 * requirement:
 * Each &lt;div&gt; must include a LABEL attribute value and a TYPE attribute value. The LABEL attribute should identify the division in a manner that is suitable for presentation to the end user in an associated &quot;table of contents&quot; and that will facilitate user navigation. While there is no controlled vocabulary list dictated for the TYPE attribute, the TYPE attributes for &lt;div&gt; elements representing the physical levels of the structure of the original source material should, whenever possible, contain a common generic designation for the physical level represented, for example &quot;page&quot;, &quot;detail&quot;, &quot;recto&quot;, &quot;verso&quot;, etc.


 * requirement:
 * A &lt;div&gt; element at any level may point to one or more pertinent &lt;dmdSec&gt; elements via its DMDID attribute value. However, the DMDID attribute should only reference ID values declared at the &lt;dmdSec&gt; element level, and not IDs at lower levels. For example, a &lt;div&gt; DMDID attribute should not reference the ID value of an element within the &lt;xmlData&gt; section of a &lt;dmdSec&gt;.


 * requirement:
 * A &lt;div&gt; element at any level may use its ADMID attribute to point to a &lt;rightsMD&gt; element that contains the rights metadata pertinent to the content the &lt;div&gt; element represents. In this case, the indicated rights metadata applies to the &lt;div&gt; that references it as well as all of its descendant &lt;div&gt; elements which do not themselves contain an ADMID reference to a different &lt;rightsMD&gt; element.
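The inheritance rule above amounts to a top-down walk of the &lt;div&gt; tree in which each division keeps its ancestor's rights reference unless it declares its own. A minimal sketch follows. The name `rights_by_div` is hypothetical, and keying the result by LABEL is a simplification that assumes labels are unique in the sample.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
NS = {"mets": METS_NS}

# Hypothetical sample: Chapter 2 overrides the rights set on the book.
SAMPLE_RIGHTS = """<mets xmlns="http://www.loc.gov/METS/">
  <amdSec><rightsMD ID="R1"/><rightsMD ID="R2"/></amdSec>
  <structMap TYPE="logical">
    <div LABEL="Book" TYPE="book" ADMID="R1">
      <div LABEL="Chapter 1" TYPE="chapter"/>
      <div LABEL="Chapter 2" TYPE="chapter" ADMID="R2">
        <div LABEL="Page 5" TYPE="page"/>
      </div>
    </div>
  </structMap>
</mets>"""


def rights_by_div(mets_xml: str) -> dict:
    """Map each div LABEL to the rightsMD IDs that apply to it."""
    root = ET.fromstring(mets_xml)
    rights_ids = {r.get("ID") for r in root.iter(f"{{{METS_NS}}}rightsMD")}
    result = {}

    def walk(div, inherited):
        # ADMID is a space-separated list of IDs; keep rightsMD ones only.
        own = [i for i in (div.get("ADMID") or "").split() if i in rights_ids]
        current = own or inherited
        result[div.get("LABEL")] = current
        for child in div.findall("mets:div", NS):
            walk(child, current)

    for struct_map in root.findall("mets:structMap", NS):
        for top in struct_map.findall("mets:div", NS):
            walk(top, [])
    return result
```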


 * requirement:
 * A &lt;div&gt; element may, but need not, include ORDER, ORDERLABEL, CONTENTIDS, and/or xlink:label attribute values. This profile dictates no guidelines or rules for the use of these &lt;div&gt; attributes.


 * requirement:
 * A &lt;div&gt; element at any level may itself contain one or more &lt;div&gt; elements, one or more &lt;fptr&gt; elements, or a single &lt;mptr&gt; element. A &lt;div&gt; element may contain both &lt;div&gt; elements and &lt;fptr&gt; elements; the &lt;mptr&gt; element, however, may not occur in combination with any other elements including another &lt;mptr&gt; element under the parent &lt;div&gt;.
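The content model above for &lt;div&gt; elements (an &lt;mptr&gt; must stand alone) can be checked mechanically. This is a sketch only; `check_div_children` is a hypothetical name and the sample labels are invented.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
NS = {"mets": METS_NS}

# Hypothetical sample: the second div wrongly mixes mptr and fptr.
SAMPLE_DIVS = """<mets xmlns="http://www.loc.gov/METS/">
  <structMap>
    <div LABEL="Series" TYPE="series">
      <div LABEL="Volume 1" TYPE="volume"><mptr LOCTYPE="URL"/></div>
      <div LABEL="Volume 2" TYPE="volume">
        <mptr LOCTYPE="URL"/><fptr FILEID="F1"/>
      </div>
    </div>
  </structMap>
</mets>"""


def check_div_children(mets_xml: str) -> list:
    """Return violations of the div/fptr/mptr content rules."""
    root = ET.fromstring(mets_xml)
    problems = []
    for div in root.iter(f"{{{METS_NS}}}div"):
        mptrs = div.findall("mets:mptr", NS)
        others = div.findall("mets:div", NS) + div.findall("mets:fptr", NS)
        label = div.get("LABEL")
        if len(mptrs) > 1:
            problems.append(f"{label}: more than one mptr")
        if mptrs and others:
            problems.append(f"{label}: mptr combined with div or fptr")
    return problems
```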


 * requirement:
 * &lt;fptr&gt; elements that reference images representing different manifestations (resolutions) of the same content must appear consecutively under the &lt;div&gt; to which they pertain. Any &lt;fptr&gt; elements referencing &lt;file&gt; elements whose USE is &quot;image/thumbnail&quot; must be arranged together in order of increasing size; and any &lt;fptr&gt; elements referencing &lt;file&gt; elements whose USE is &quot;image/reference&quot; must also be arranged together in order by size.


 * requirement:
 * The &lt;mptr&gt; element, if it is used, must contain an xlink:href attribute, as well as a LOCTYPE attribute indicating the type of href being provided. It may, but need not, contain any of the other attributes defined in the xlink:simpleLink attribute group as well as a CONTENTIDS attribute. No guidelines are provided, however, for the use of these attributes.


 * requirement: @RELATEDMAT="structMap11"
 * An &lt;fptr&gt; element must either 1) directly point to a &lt;file&gt; element via its FILEID attribute; or 2) contain an &lt;area&gt; element that points to a &lt;file&gt; element; or 3) contain a &lt;seq&gt; element comprising multiple &lt;area&gt; elements that point to the relevant &lt;file&gt; elements. METS documents implementing this profile must not use the &lt;par&gt; element. Typically the &lt;seq&gt; element would only be used under a &lt;div&gt; element that represented an intellectual (or logical) division, such as a diary entry. In this case, more than one content file, played in sequence, may be required to manifest the logical division.


 * requirement:
 * An &lt;fptr&gt; element could directly contain an &lt;area&gt; element if only a portion of an integral file manifested the parent &lt;div&gt;. This profile supports such use of the &lt;area&gt; element in conjunction with structured text, audio and video content only and as follows. 1) Structured text content. If an &lt;fptr&gt; element represents just a portion of an integral structured text file (such as a TEI file), an &lt;area&gt; element under the &lt;fptr&gt; would point to the &lt;file&gt; element representing the integral structured text document (via its FILEID attribute) and would at least indicate the starting point of the relevant section of the content file via the &lt;area&gt; BEGIN attribute. The BEGIN attribute, in this case, would have a BETYPE of &quot;IDREF&quot;. The &lt;area&gt; element might also express the end point of the relevant section of the referenced file via its END attribute, but it need not do so. 2) Audio and video content. If an &lt;fptr&gt; element represents just a segment of an integral audio or video content file, then the &lt;area&gt; element under the &lt;fptr&gt; should point to the &lt;file&gt; element representing the integral audio or video file (via its FILEID attribute) and must indicate the starting point and duration of the relevant section of the referenced audio or video file via the &lt;area&gt; BEGIN and EXTENT attributes. The BETYPE and EXTTYPE attributes in these cases must be &quot;TIME&quot; and the BEGIN and EXTENT attributes must contain a simple time value in the format HH:MM:SS. If the relevant extent from the specified BEGIN point is the remainder of the audio or video file, then the EXTENT and EXTTYPE attributes may be omitted.
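The audio and video case above can be checked on the attribute values of a single &lt;area&gt; element. The sketch below is illustrative: `check_time_area` is a hypothetical name, and the pattern's assumption of two-digit hours with minutes and seconds from 00 to 59 is this sketch's reading of "HH:MM:SS", not something the profile spells out.

```python
import re

# Assumed reading of HH:MM:SS: two-digit hours, minutes/seconds 00-59.
TIME_PATTERN = r"\d{2}:[0-5]\d:[0-5]\d"


def check_time_area(attrs: dict) -> list:
    """Check the attributes of an <area> that isolates an A/V segment."""
    problems = []
    if attrs.get("BETYPE") != "TIME":
        problems.append("BETYPE must be TIME")
    if not re.fullmatch(TIME_PATTERN, attrs.get("BEGIN", "")):
        problems.append("BEGIN must be HH:MM:SS")
    # EXTENT and EXTTYPE may both be omitted when the segment runs to
    # the end of the file; if either is given, both are checked.
    if "EXTENT" in attrs or "EXTTYPE" in attrs:
        if attrs.get("EXTTYPE") != "TIME":
            problems.append("EXTTYPE must be TIME")
        if not re.fullmatch(TIME_PATTERN, attrs.get("EXTENT", "")):
            problems.append("EXTENT must be HH:MM:SS")
    return problems
```

With ElementTree, the attributes of a parsed &lt;area&gt; element can be passed in as `dict(area.attrib)`.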


 * requirement:
 * An &lt;area&gt; element in documents implementing this profile should not use the SHAPE, COORDS or ADMID attributes. It may, but need not, contain a CONTENTIDS attribute; however this profile provides no guidelines or rules for the use of this attribute.


 * requirement:
 * An &lt;fptr&gt; element representing a content type other than structured text, audio or video in conforming documents can only reference integral content files. METS documents conforming to this profile must not use the &lt;area&gt; element with its associated BEGIN and END attributes to isolate internal segments of such content files. An &lt;fptr&gt; element could, however, under some circumstances, contain &lt;area&gt; elements, provided these &lt;area&gt; elements reference integral content files. For example, an &lt;fptr&gt; element would contain a &lt;seq&gt; element with multiple child &lt;area&gt; elements if multiple files needed to be &quot;played&quot; in sequence to manifest a division. This might be the case if the &lt;structMap&gt; expressed a logical structure and a &lt;div&gt; in that structure required several files to manifest it. For example, the &lt;div&gt; elements in the &lt;structMap&gt; for a diary might represent diary entries; and some of these entries might span multiple physical pages, and hence require multiple page image content files to manifest them. In this case, the &lt;div&gt; representing the spanned diary entry would contain at least one &lt;fptr&gt; element; this &lt;fptr&gt; element would contain a &lt;seq&gt; element which in turn contained a separate &lt;area&gt; element pointing to each &lt;file&gt; element representing a page the diary entry spans.


 * requirement:
 * Each &lt;fptr&gt; element that does not contain subsidiary &lt;area&gt; or &lt;seq&gt; elements must point directly to a &lt;file&gt; element in the &lt;fileSec&gt; via its FILEID attribute. Similarly, each &lt;area&gt; element appearing under an &lt;fptr&gt; element or a &lt;seq&gt; element must point directly to a &lt;file&gt; element via its FILEID attribute.


 * structLink:


 * requirement:
 * A conforming METS document may contain a &lt;structLink&gt; element. This profile, however, establishes no guidelines or expectations for its use.


 * behaviorSec:


 * requirement:
 * A conforming METS document may contain a &lt;behaviorSec&gt; element. This profile, however, establishes no guidelines or expectations for its use.


 * multiSection:


 * requirement:
 * Only &lt;file&gt; elements will reference &lt;techMD&gt;, &lt;sourceMD&gt; and/or &lt;digiprovMD&gt; elements. In other words, documents implementing this profile will express technical, source, and digital provenance administrative metadata in conjunction with content files only rather than in conjunction with &lt;div&gt; elements in the &lt;structMap&gt;.  &lt;rightsMD&gt; elements, however, may be referenced only from &lt;div&gt; elements in the &lt;structMap&gt;.


 * requirement:
 * Only &lt;div&gt; elements will reference &lt;dmdSec&gt; elements. In other words, documents implementing this profile will express descriptive metadata in conjunction with divisions of the &lt;structMap&gt; and not in conjunction with individual content files (&lt;file&gt; elements).
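The two cross-section rules above can be combined into one pass over a document. This is a sketch under stated assumptions: `check_cross_references` is a hypothetical name, the sample IDs are invented, and references to IDs that do not exist in the &lt;amdSec&gt; are silently ignored.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
NS = {"mets": METS_NS}

# Hypothetical sample containing one violation of each kind.
SAMPLE_XREF = """<mets xmlns="http://www.loc.gov/METS/">
  <amdSec><techMD ID="T1"/><rightsMD ID="R1"/></amdSec>
  <fileSec><fileGrp USE="image/master">
    <file ID="F1" ADMID="T1 R1"/>
    <file ID="F2" DMDID="D1"/>
  </fileGrp></fileSec>
  <structMap>
    <div LABEL="Book" TYPE="volume" ADMID="R1 T1" DMDID="D1"/>
  </structMap>
</mets>"""

FILE_ONLY = {"techMD", "sourceMD", "digiprovMD"}


def check_cross_references(mets_xml: str) -> list:
    """Return violations of the file/div metadata reference rules."""
    root = ET.fromstring(mets_xml)
    kind = {}  # metadata ID -> local element name (techMD, rightsMD, ...)
    for section in root.findall("mets:amdSec", NS):
        for md in section:
            kind[md.get("ID")] = md.tag.split("}")[-1]
    problems = []
    for file_el in root.iter(f"{{{METS_NS}}}file"):
        fid = file_el.get("ID")
        if file_el.get("DMDID"):
            problems.append(f"file {fid}: DMDID is not supported")
        for ref in (file_el.get("ADMID") or "").split():
            if kind.get(ref) == "rightsMD":
                problems.append(f"file {fid}: ADMID references rightsMD {ref}")
    for div in root.iter(f"{{{METS_NS}}}div"):
        for ref in (div.get("ADMID") or "").split():
            if kind.get(ref) in FILE_ONLY:
                problems.append(
                    f"div {div.get('LABEL')}: ADMID references {kind[ref]} {ref}"
                )
    return problems
```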

technical_requirements:


 * content_files:


 * requirement:
 * Image, application, video and audio content files referenced from conforming METS documents may be of any type supported by the MPDL DLC Guidelines for Digital Objects (http://www.dlc.mpdl.mpg.de/guidelines/).


 * requirement:
 * All &quot;tei&quot; files must be encoded according to version 3 of the "TEI Text Encoding in Libraries: Guidelines for Best Encoding Practices" maintained by the Digital Library Federation (http://www.tei-c.org/SIG/Libraries/teiinlibraries/).

tool: