ESciDoc Logical Data Model

From MPDLMediaWiki
Revision as of 08:40, 28 September 2009 by Natasab (talk | contribs) (→‎Introduction)

Status: In PROGRESS

This is a protected page.

Restricted Access to eSciDoc group

Introduction

Understanding the structure and nature of the data is essential for managing various types of content within an eSciDoc repository. The eSciDoc logical data model was therefore developed, on the one hand, to enable the implementation of core data services based on high-level abstractions and, on the other hand, to allow for further specialization of data and the implementation of specialized services. This page may be used as a starting point for understanding the eSciDoc data structures. eSciDoc manages two general categories of data:


  • Resources (content resources) - the content of the repository, such as articles, books, images, image albums, scanned manuscripts, pages etc.
  • MasterData - additional classes of data used to manage Resources, such as organizational units and contexts.


The two categories can be distinguished as follows: Resources are the actual content, which can be further extended, shared and preserved. Master data are used to administer that content (i.e. the Resources) and serve as important entities that other objects reference. Master data can also be referenced by objects outside of the core eSciDoc repository.





LDMExplained.png




Content resources are defined by two generic object patterns: Item and Container.

  • An Item resource consists of metadata records (e.g. eSciDoc publication metadata, SISIS MAB record, MODS record, Dublin Core record) and optionally of components that represent the actual content (e.g. PDF file, JPEG file, XML file).
  • A Container resource is an aggregation of other resources: it can hold Items as well as other Containers. Like an Item, a Container can be described by multiple metadata records.
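
The two object patterns above can be sketched in a few lines of Python. This is only an illustration of the structure described here, not the eSciDoc API: the class names, field names and the count_items helper are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class MetadataRecord:
    schema: str    # e.g. "MODS", "Dublin Core"
    content: str   # the record itself, kept as an opaque string here


@dataclass
class Component:
    file_name: str   # the actual content, e.g. a PDF or JPEG file
    mime_type: str


@dataclass
class Item:
    metadata: List[MetadataRecord]
    components: List[Component] = field(default_factory=list)  # optional


@dataclass
class Container:
    metadata: List[MetadataRecord]
    # A Container aggregates Items as well as other Containers.
    members: List[Union[Item, "Container"]] = field(default_factory=list)


def count_items(resource: Union[Item, Container]) -> int:
    """Count the Items reachable from a resource, descending into Containers."""
    if isinstance(resource, Item):
        return 1
    return sum(count_items(member) for member in resource.members)


article = Item(
    metadata=[MetadataRecord("Dublin Core", "title: Example article")],
    components=[Component("article.pdf", "application/pdf")],
)
image = Item(metadata=[MetadataRecord("Dublin Core", "title: Example image")])
album = Container(
    metadata=[MetadataRecord("MODS", "title: Example album")],
    members=[image],
)
collection = Container(metadata=[], members=[article, album])

print(count_items(collection))  # one article plus one image inside the album
```

Because a Container may hold other Containers, any operation over its content has to be recursive, as count_items is here.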


Each resource (Item or Container) is maintained in a single administrative Context. Contexts are the responsibility of organizations (e.g. one or more project groups, institutions etc.). The organizations responsible for a context define its settings (through the so-called Admin Descriptor) according to their needs, expressing rules for content creation, update, metadata quality assurance, dissemination, preservation, authorization policies, submission policies, etc.
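
As a rough sketch of how a Context and its Admin Descriptor might constrain the resources it contains, consider the following Python example. Everything in it is hypothetical: the class names, the "allowed_metadata_schemas" setting and the submission_problems function are invented for illustration and do not reflect the actual structure of an eSciDoc Admin Descriptor.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Context:
    name: str
    responsible_organizations: List[str]
    # Stand-in for the Admin Descriptor: settings chosen by the organizations.
    # The keys used below are hypothetical.
    admin_descriptor: Dict[str, object] = field(default_factory=dict)


@dataclass
class Resource:
    title: str
    metadata_schemas: List[str]
    context: Context  # each resource lives in exactly one Context


def submission_problems(resource: Resource) -> List[str]:
    """Check a resource against one illustrative rule of its Context."""
    allowed = resource.context.admin_descriptor.get("allowed_metadata_schemas")
    if allowed is None:
        return []  # the Context does not restrict metadata schemas
    return [
        f"schema {schema!r} is not allowed in context {resource.context.name!r}"
        for schema in resource.metadata_schemas
        if schema not in allowed
    ]


publications = Context(
    name="Publications",
    responsible_organizations=["Example Institute"],
    admin_descriptor={"allowed_metadata_schemas": ["MODS", "Dublin Core"]},
)
ok = Resource("Example article", ["MODS"], publications)
bad = Resource("Example scan", ["MAB"], publications)

print(submission_problems(ok))
print(submission_problems(bad))
```

Keeping the rules in the Context rather than in each resource means the responsible organizations can change a policy in one place.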

A Content model is a formal representation of a discipline-specific data model, such as an integrated image and text view of primary sources or a precisely documented collection of images. For example, a digitized book can be expressed as a container of book page items and related transcription items. The book container (see image below) has bibliographic metadata based on the MODS and MAB metadata schemas. Each page item consists of the digitized image of the page and a metadata record. The metadata record may contain metadata inherited from the book container metadata. In addition, it has its own page-specific metadata, such as page number (e.g. 1, 2, 3, 4 or I, II, III, IV) and chapter information.
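
The digitized book example can be sketched as follows. This is a minimal illustration under stated assumptions: metadata is simplified to flat dictionaries rather than MODS or MAB records, transcription items are left out, and the names BookContainer, PageItem, add_page and the choice of inherited fields are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PageItem:
    image_file: str            # the digitized image of the page
    metadata: Dict[str, str]   # inherited plus page-specific metadata


@dataclass
class BookContainer:
    metadata: Dict[str, str]   # bibliographic metadata of the book
    pages: List[PageItem] = field(default_factory=list)

    def add_page(
        self,
        image_file: str,
        page_number: str,
        chapter: str,
        inherited: Tuple[str, ...] = ("title", "author"),
    ) -> PageItem:
        """Create a page item whose record copies selected book metadata."""
        record = {key: self.metadata[key] for key in inherited if key in self.metadata}
        # Page-specific metadata is added on top of the inherited fields.
        record.update({"page_number": page_number, "chapter": chapter})
        page = PageItem(image_file, record)
        self.pages.append(page)
        return page


book = BookContainer(metadata={"title": "Example Book", "author": "A. Author", "year": "1850"})
book.add_page("page_0001.jpg", "I", "Preface")
book.add_page("page_0002.jpg", "1", "Chapter 1")

for page in book.pages:
    print(page.image_file, page.metadata)
```

Copying the inherited fields into each page record keeps a page item self-describing when it is viewed outside its book; resolving them from the container at read time would be an equally valid design.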




CModel Example.jpg

Data model explained

  • Item
  • Container
  • Context
  • Organizational unit