Difference between revisions of "ESciDoc Logical Data Model"
Revision as of 08:56, 28 September 2009
Status: In PROGRESS
Introduction
Understanding the structure and the nature of the data is essential for managing various types of content within an eSciDoc repository. The eSciDoc logical data model was therefore developed, on the one hand, to enable the implementation of core data services based on high-level abstractions and, on the other hand, to allow for further specialization of data and the implementation of specialized services. This page may be used as a starting point for understanding the eSciDoc data structures. eSciDoc manages two general categories of data:
- Resources (content resources) - the content of the repository such as: articles, books, images, image albums, scanned manuscripts, pages etc.
- MasterData - additional classes of data that are used for the management of Resources such as: organizational units, contexts.
A simple delineation between these two data categories may be stated in the following manner: resources are the real content that can be further extended, shared and preserved. Master data are used for content (i.e. resource) administration and as referenced entities of importance. Master data can also be referenced by objects outside of the core eSciDoc repository.
Content resources are defined by two generic object patterns: Item and Container.
- An Item resource consists of metadata records (e.g. eSciDoc publication metadata, SISIS MAB record, MODS record, Dublin Core record) and optionally of components that represent the actual content (e.g. PDF file, JPEG file, XML file).
- A Container resource is an aggregation of other resources, i.e. of Items or other Containers. Like an Item, a Container can be described by multiple metadata records.
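The two object patterns can be sketched as a small recursive data structure. This is only an illustration of the description above, not the eSciDoc API: the class names, field names and the `count_items` helper are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Component:
    """A file carrying the actual content of an Item (hypothetical shape)."""
    file_name: str
    mime_type: str


@dataclass
class Item:
    """Metadata records plus optional components."""
    # Hypothetical layout: record name (e.g. "dc", "mods") -> record content.
    metadata: Dict[str, dict] = field(default_factory=dict)
    components: List[Component] = field(default_factory=list)


@dataclass
class Container:
    """An aggregation of Items and/or other Containers, with its own metadata."""
    metadata: Dict[str, dict] = field(default_factory=dict)
    members: List[Union["Item", "Container"]] = field(default_factory=list)


def count_items(resource: Union[Item, Container]) -> int:
    """Count the Items reachable from a resource, descending into Containers."""
    if isinstance(resource, Item):
        return 1
    return sum(count_items(member) for member in resource.members)


# A book container holding a chapter container (with one page) and a cover item.
page = Item(metadata={"dc": {"title": "Page 1"}},
            components=[Component("page1.jpg", "image/jpeg")])
chapter = Container(metadata={"dc": {"title": "Chapter 1"}}, members=[page])
cover = Item(metadata={"dc": {"title": "Cover"}})  # an Item without components
book = Container(metadata={"mods": {"title": "A digitized book"}},
                 members=[chapter, cover])
```

Because a Container may hold other Containers, any operation over its content (here, counting Items) is naturally recursive.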
Each resource (Item or Container) is maintained in a single administrative Context. Contexts are the responsibility of organizations (e.g. one or more project groups, institutions etc.). The organizations responsible for a context define its settings (by means of the so-called Admin Descriptor) in accordance with their needs, to express rules for content creation, update, quality assurance of the metadata, dissemination, preservation, authorization policies, submission policies, etc.
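The relationship between a resource, its Context and the Admin Descriptor might be modelled roughly as follows. This is a minimal sketch under stated assumptions: the setting name `submission_roles`, the role strings and the `allowed_to_submit` helper are invented for illustration and are not taken from eSciDoc.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Context:
    """An administrative Context owned by one or more organizations."""
    name: str
    organizations: List[str]
    # Hypothetical Admin Descriptor: free-form settings chosen by the owners.
    admin_descriptor: Dict[str, object] = field(default_factory=dict)


@dataclass
class Resource:
    """An Item or Container; it belongs to exactly one Context."""
    title: str
    context: Context


def allowed_to_submit(resource: Resource, role: str) -> bool:
    """Check a role against the (assumed) submission policy of the Context."""
    roles = resource.context.admin_descriptor.get("submission_roles", [])
    return role in roles


manuscripts = Context(
    name="Scanned manuscripts",
    organizations=["Example Institute"],
    admin_descriptor={"submission_roles": ["depositor", "moderator"]},
)
page = Resource(title="Page 1", context=manuscripts)
```

Keeping the policy on the Context rather than on each resource means that a rule changed by the responsible organization applies to every resource maintained in that Context.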
Items and Containers are very generic resources: on their own they say nothing about the content they represent or about their own structure, e.g. what kind of metadata may be associated with them, what kind of members a container aggregates, or what kind of resources they represent semantically. Therefore, the eSciDoc logical data model introduces the concept of a Content model. A content model is a formal representation of a discipline-specific data model, such as an integrated image and text view of a scanned manuscript page or a precisely documented collection of images. For example, a digitized book can be expressed as a container of book page items and related transcription items. The book container (see image below) has bibliographic metadata based on the MODS and MAB metadata schemas. Each page item consists of the digitized image of the page and a metadata record. The metadata record may contain metadata inherited from the book container metadata. In addition, it may have its own page-specific metadata, such as the page number (e.g. 1, 2, 3, 4 or I, II, III, IV) or chapter information. Each Item or Container has to declare that it is an instance of exactly one content model. There is no limit on the number of instances of a single content model.
Example: a content model named "CModel: Publication" defines a resource which is an Item, has a bibliographic metadata record in accordance with the eSciDoc Publication metadata profile, and may have several associated PDF files that represent the publisher version, the pre-print or some supplementary material. It is used to represent content resources which are published articles, conference papers, books etc.
Besides providing semantic information about the content and the structure of a resource, a content model may additionally define services that are applicable to the resources that are instances of the content model. Examples of such services are specialized image viewers, TEI-formatted text viewers, and services that offer various transformations.
Because it is formalized, the definition of the content model is additionally used* for validation of the instantiated resources.
* as of core-service release 1.3 of eSciDoc
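To illustrate how a formalized content model can drive validation, the sketch below encodes the "CModel: Publication" example as a small rule set and checks a resource against it. The rule fields, the metadata record name `publication` and the `validate` function are assumptions made for this example; the actual validation performed by the eSciDoc core services may work differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ContentModel:
    """A simplified, hypothetical content model definition."""
    name: str
    resource_type: str                    # "item" or "container"
    required_metadata: Tuple[str, ...]    # metadata records that must exist
    allowed_mime_types: Tuple[str, ...]   # permitted component types


@dataclass
class Resource:
    """A resource that declares exactly one content model."""
    resource_type: str
    content_model: ContentModel
    metadata: Dict[str, dict] = field(default_factory=dict)
    component_mime_types: List[str] = field(default_factory=list)


def validate(resource: Resource) -> List[str]:
    """Return a list of problems; an empty list means the resource is valid."""
    model = resource.content_model
    problems = []
    if resource.resource_type != model.resource_type:
        problems.append(f"expected {model.resource_type}, got {resource.resource_type}")
    for name in model.required_metadata:
        if name not in resource.metadata:
            problems.append(f"missing metadata record: {name}")
    for mime in resource.component_mime_types:
        if mime not in model.allowed_mime_types:
            problems.append(f"component type not allowed: {mime}")
    return problems


publication = ContentModel(
    name="CModel: Publication",
    resource_type="item",
    required_metadata=("publication",),
    allowed_mime_types=("application/pdf",),
)

article = Resource("item", publication,
                   metadata={"publication": {"title": "An article"}},
                   component_mime_types=["application/pdf", "application/pdf"])
broken = Resource("container", publication,
                  component_mime_types=["image/jpeg"])
```

Here `validate(article)` returns an empty list, while `validate(broken)` reports the wrong resource type, the missing metadata record and the disallowed component type.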
Data model explained
- Item
- Container
- Context
- Organizational unit