ESciDoc Content Model in Fedora

From MPDLMediaWiki
Revision as of 14:21, 28 July 2009 by Frank (talk | contribs)


!!! in progress !!!


Values stated in an eSciDoc Content Model Object, which defines the content and behavior of a content object (e.g. Item, Container), are separated into three categories:

  1. Creation; values pertaining to the initial state
  2. Transition; values defining possible transitions or effects of specific transitions
  3. State; values describing content independent of the previous or next state

Creation

Information considered in the creation process.

  • initial state
  • versioning enabled
  • name of main metadata record
  • schema of main metadata record
  • dc mapping
  • (content checksum enabled)
  • (applies to object pattern)

Transition

Information consulted for a specific operation in order to decide whether the operation is allowed and which sub-operations it must trigger or requires as preconditions.

  • status transitions (e.g. from pending to submitted)
  • cascade information (e.g. for Containers, whether a release should also attempt to release all members)
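
The two kinds of transition information listed above can be sketched as a small lookup structure. This is a minimal illustration only: the class name TransitionRules, its fields, and the status name "released" are assumptions made for the example and are not taken from the eSciDoc content model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransitionRules:
    """Hypothetical holder for the transition part of a content model."""

    # Allowed status transitions as (from_status, to_status) pairs.
    allowed: frozenset
    # Transitions that should also be attempted on all members of a Container.
    cascading: frozenset = frozenset()

    def is_allowed(self, current, target):
        """Return True if the content model permits current -> target."""
        return (current, target) in self.allowed

    def cascades(self, current, target):
        """Return True if an allowed transition should be tried on members too."""
        return self.is_allowed(current, target) and (current, target) in self.cascading


# Example rules; "released" is an assumed status name.
rules = TransitionRules(
    allowed=frozenset({("pending", "submitted"), ("submitted", "released")}),
    cascading=frozenset({("submitted", "released")}),
)
```

With these example rules, rules.is_allowed("pending", "submitted") holds, while a direct transition from "pending" to "released" is rejected.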

State

Information used to validate the current state of the resource.

  • Name, schema, and occurrence of additional metadata records.
  • mime-types of content
  • Name, content-category etc. of Component
  • content model of allowed members, occurrences

Note: The listings above are not necessarily complete.


Validation

Information related to the state of a resource (the content object) is used to validate the object.

Components and Members

In Fedora, the description of which Components an Item may have and which kinds of members a Container may have can be validated as relations. Both are in fact stored as relations in Fedora (see "Structural Relations" below), but information about the kind of the related resource is needed to reach the intended level of description.

Relations and Datastreams

With Fedora's Content Model Architecture (CMA) and Enhanced Content Models (ECM), the relations and datastreams of a Fedora object can be validated. Both can easily be mapped from the description of an eSciDoc resource to the description of a Fedora object, except for the cardinality of relations.

Relations of an eSciDoc resource are not directly mapped to relations of the corresponding Fedora object (see "Content Relations" below).

Metadata Datastreams

Metadata records of an eSciDoc resource are defined by a name and an XML Schema. Technically, such a record is stored as a datastream of MIME type text/xml in a Fedora object: the name of the metadata record is the name of the datastream, the XML Schema applies to the content of the datastream, and the datastream is marked as an eSciDoc metadata record.

The description of an eSciDoc metadata record can be mapped to the description of a Fedora datastream and vice versa. The eSciDoc resource can therefore be validated, and so can the Fedora object, based on Fedora's modeling language extended by ECM.

This approach cannot declare metadata records as optional, nor can it restrict the metadata records of a resource to the defined set.
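
The two-way mapping between a metadata record description and a datastream description can be illustrated with a short sketch. The class and field names (MetadataRecord, DatastreamDescription, is_escidoc_md_record) and the schema URL are hypothetical; only the rules themselves (record name becomes datastream name, MIME type text/xml, schema applies to the content, datastream marked as metadata record) come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    """eSciDoc side: a metadata record is defined by a name and an XML Schema."""
    name: str
    schema: str  # location of the XML Schema


@dataclass(frozen=True)
class DatastreamDescription:
    """Fedora side: hypothetical description of a datastream."""
    ds_id: str
    mime_type: str
    schema: str
    is_escidoc_md_record: bool


def to_datastream(record):
    """Map a metadata record description to a datastream description."""
    return DatastreamDescription(
        ds_id=record.name,          # record name is used as datastream name
        mime_type="text/xml",       # metadata records are always stored as XML
        schema=record.schema,       # the schema applies to the datastream content
        is_escidoc_md_record=True,  # mark the datastream as a metadata record
    )


def to_metadata_record(ds):
    """Map a datastream description back to a metadata record description."""
    if not ds.is_escidoc_md_record or ds.mime_type != "text/xml":
        raise ValueError("datastream is not an eSciDoc metadata record")
    return MetadataRecord(name=ds.ds_id, schema=ds.schema)


# Example round trip; the schema location is a placeholder.
record = MetadataRecord(name="escidoc", schema="http://example.org/md-record.xsd")
assert to_metadata_record(to_datastream(record)) == record
```

Note that the sketch has no field for "optional" and no way to forbid extra records, which mirrors the limitation stated above.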

Content Datastreams

Content (also referred to as binary content) of an eSciDoc resource is defined by a name (an individual name in the case of a content-stream in an Item, and the name "content" in a Component) and a mime-type. These values can be mapped accurately to the values of a datastream in Fedora. The storage-type of the content can be freely chosen and is not restricted by the content model. The eSciDoc resource can therefore be validated, and so can the Fedora object, based on Fedora's modeling language extended by ECM. The content itself is not considered for validation by the content model.

This approach cannot declare content-streams in an Item as optional, nor can it restrict the set of content-streams in an Item to the defined set.

Content Relations

Idea: Define a global ontology that allows all defined relations for all possible objects. ECM requires that "all allowed relations must be defined in the ontology", so every content model object in Fedora would have to hold every possible relation.

But technically, content relations are objects in their own right. So every Fedora content model derived from an eSciDoc content model only needs to allow relations to Content Relation Objects. From this point of view, it is an advantage that only one predicate refers to Content Relations. Otherwise, the ECM rule of always stating the complete set of possible relations would defeat the idea of freely relating resources by Content Relations regardless of the ownership and content model of the resources.

Structural Relations

Yes. No cardinality other than 1 or 0.

The validation of structural relations in Fedora may cover the validation of the descriptions of Components and members in eSciDoc.

Members conform to a specified set of content models but ... (? Does ECM allow everything from OWL Lite?). The allowed member content models are expressed by an allValuesFrom restriction over a union of content model classes.

To be able to map the description of a set of Components, including values of content-category, mime-type etc., a content model for Component objects is necessary.

For every component-type stated in the eSciDoc content model, a separate Fedora content model object is created. The name (aka content-category) of a component-type and the ID of the eSciDoc Content Model Object are used to generate the ID of the Component Content Model Object. Allowed metadata records are modeled as for common eSciDoc resources (see "Metadata Datastreams" above). The allowed mime-types are listed for the datastream "content". The content-category is ensured by a hasValue restriction (whether ECM supports this must still be checked).
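
The intended effect of an allValuesFrom restriction over a union of content model classes can be paraphrased as a plain membership check. The sketch below is not ECM or OWL code; the function name and the content model IDs are hypothetical and only illustrate the semantics: every member must have one of the allowed content models, and a Container without members satisfies the restriction trivially.

```python
def members_conform(member_content_models, allowed_content_models):
    """Return True if every member's content model is in the allowed union.

    Mirrors allValuesFrom semantics: the check constrains the members that
    exist, so an empty member list conforms.
    """
    allowed = set(allowed_content_models)
    return all(cm in allowed for cm in member_content_models)


# Hypothetical content model IDs, used for illustration only.
allowed = {"escidoc:cm-article", "escidoc:cm-image"}
conforming = members_conform(["escidoc:cm-article", "escidoc:cm-image"], allowed)
violating = members_conform(["escidoc:cm-article", "escidoc:cm-video"], allowed)
```

Because allValuesFrom says nothing about how many members exist, occurrence constraints on members would need separate cardinality restrictions.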
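
The text above does not specify how the two values are combined into the ID of the Component Content Model Object. The following sketch shows one possible scheme, purely as an assumption: the "-component-" infix, the function name, and the replacement of unusual characters by hyphens are all invented for illustration.

```python
import re


def component_content_model_id(content_model_id, content_category):
    """Derive a Component Content Model Object ID (hypothetical scheme).

    Combines the ID of the eSciDoc Content Model Object with the
    content-category of the component-type.
    """
    # Collapse every run of characters other than letters, digits,
    # '.', '_' and '-' into a single hyphen, then trim stray hyphens.
    slug = re.sub(r"[^A-Za-z0-9._-]+", "-", content_category.strip()).strip("-")
    if not slug:
        raise ValueError("content-category must contain at least one usable character")
    return f"{content_model_id}-component-{slug}"


# Example with a placeholder content model ID.
example_id = component_content_model_id("escidoc:cm1", "FULLSIZE")
```

A scheme like this is deterministic, so the same content model and content-category always yield the same ID, but two categories that differ only in replaced characters would collide and would need extra handling.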


File:ContentModel-draft.xml

File:ContentModel-draft-dsCompositeModel.xml

File:ContentModel-draft-ONTOLOGY.xml

File:ContentModel-draft-CM component FULLSIZE-dsCompositeModel.xml

File:ContentModel-draft-CM component FULLSIZE-ONTOLOGY.xml