Each item stored in the eSciDoc system will be accompanied by at least one metadata record keeping all descriptive information about the resource represented by the respective item, e.g. the creator and title of a publication.
The functional specifications for the bibliographic/descriptive metadata required to describe the item types handled by eSciDoc (e.g. "collection", "image", "publication") are part of the eSciDoc Solution pages. These specifications are implemented by a set of W3C XML schemas, which are available in the Subversion repository. Metadata records need to conform to these schemas in order to be handled by the system, e.g. to support creating, modifying or searching for the item.
The team is currently working on a set of application profiles for eSciDoc solutions.
Scope of MDS specification and XML schema (XSD)
In general, the MDS specification is intended to "influence" the system in several contexts (e.g. when generating the GUI for editing and viewing items) and should serve as a basis for validating and indexing metadata records. In our discussion on 19 December we agreed on the following usage scenarios for the XML schema (XSD):
In scope metadata schema (XSD)
- Outlook: simple schema, with few element and value constraints.
- Each content type is described by one schema
- Objects will be validated against this schema by the framework at the moment they are stored
- The XML schema will be used as the basis for the transformation definition/document required for indexing by Lucene (task owner: FIZ)
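As a rough illustration of the kind of flattening such an indexing transformation might perform, the following Python sketch turns a metadata record into field/value pairs that a full-text indexer could consume. The record, its element names and the helper `to_index_fields` are hypothetical and not taken from the eSciDoc schemas; the actual transformation definition is not specified here.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical record: element names are illustrative only and are
# NOT taken from the eSciDoc schemas.
RECORD = """
<publication xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Sample title</dc:title>
  <creator><name>Doe, Jane</name></creator>
  <creator><name>Roe, Richard</name></creator>
</publication>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tags."""
    return tag.rsplit("}", 1)[-1]


def to_index_fields(xml_text):
    """Flatten a record into {dotted element path: [text values]}."""
    fields = defaultdict(list)

    def walk(element, path):
        current = path + [local_name(element.tag)]
        text = (element.text or "").strip()
        if text:
            # Repeated elements (e.g. several creators) share one field.
            fields[".".join(current)].append(text)
        for child in element:
            walk(child, current)

    walk(ET.fromstring(xml_text), [])
    return dict(fields)


if __name__ == "__main__":
    for name, values in to_index_fields(RECORD).items():
        print(name, values)
```

Deriving the field names from the element paths keeps the index layout tied to the schema, which is one plausible reading of using the XSD as the basis for the transformation.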
Out of scope metadata schema (XSD)
The eSciDoc metadata schemas focus on the information needed to describe the intellectual content of the work. Therefore, they do not include:
- administrative metadata for the items and files (e.g. PIDs, owner, accessibility, content type of files). Those elements are implemented as properties of the items or the item components, see PubMan File Properties. Properties can also be searched and will be available for comprehensive exports and transformations to other formats (e.g. Dublin Core)
- copyright and license statements, which are part of the administrative metadata
- structural and content relations between items. Relations are stored in a separate data stream within the object (rel-ext) and will be available for comprehensive exports and transformations to other formats (e.g. Dublin Core)
- validation rules: additional constraints (e.g. "Original size X or Original size Y needs to be filled", "Language has to be specified") may be defined via the Validating service. This service is also being considered for implementing constraints between genres and relevant metadata elements
- display issues: which elements are relevant for which genre (e.g. an ISBN can only be defined for books)
- The GUI is currently based on Java value objects, so the interface knows the elements and certain display conditions
- As all elements are known to the interface, adding/changing the metadata set requires changes in the interface as well
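The validation rules listed above as out of scope for the XSD could be expressed as simple checks on a record. The following Python sketch is only an illustration under assumptions: the record is modelled as a plain dictionary, and the field names (`language`, `original_size_x`, `original_size_y`, `isbn`, `genre`) and the function `validate` are hypothetical, not the interface of the Validating service.

```python
# Hypothetical rule table: metadata field -> genres that may carry it.
# Field and genre names are assumptions made for this sketch.
GENRE_ONLY_FIELDS = {"isbn": {"book"}}


def validate(record):
    """Return a list of rule violations for one record (a plain dict)."""
    problems = []

    def filled(name):
        value = record.get(name)
        return value is not None and str(value).strip() != ""

    # "Language has to be specified"
    if not filled("language"):
        problems.append("language is required")

    # "Original size X or Original size Y needs to be filled"
    if not (filled("original_size_x") or filled("original_size_y")):
        problems.append("original_size_x or original_size_y is required")

    # Genre constraint, e.g. an ISBN can only be defined for books
    genre = record.get("genre")
    for field, genres in GENRE_ONLY_FIELDS.items():
        if filled(field) and genre not in genres:
            problems.append(f"{field} is not allowed for genre {genre!r}")

    return problems


if __name__ == "__main__":
    print(validate({"genre": "article", "language": "en", "isbn": "123"}))
```

Keeping such rules outside the schema means they can change without touching the XSD, which matches the intent of keeping the schema simple with few constraints.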
Relationship to ePrints Application Profile
The ePrints application profile is available at http://www.ukoln.ac.uk/repositories/digirep/index/Model
Usage of ePrints AP for eSciDoc metadata schema:
- We currently do not build on a hierarchical structure to express the four entity levels, but describe Publication objects as "flat" objects. Nevertheless, we should relate our objects to one of the levels (expression or manifestation) defined there
- "ScholarlyWork" is a very abstract level. We might build this view by interpreting the "isRevisionOf" relationships that eSciDoc PubMan will build between objects
- We will consider reusing the relations specified in the ePrints AP
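One way to picture how a "ScholarlyWork" view could be derived from "isRevisionOf" relationships is to follow each item's revision chain back to its earliest version and group items by that root. The Python sketch below assumes the relations have already been extracted into a dictionary mapping an item id to the id it revises; the ids, the dictionary and the function `group_by_revision` are hypothetical.

```python
from collections import defaultdict


def group_by_revision(items, revision_of):
    """Group item ids into revision families.

    items: iterable of item ids
    revision_of: dict mapping an item id to the id it is a revision of
    Returns {root id: sorted list of ids in that family}.
    """

    def root(item):
        # Follow isRevisionOf links until an item without a predecessor
        # is reached. The 'seen' set only prevents an endless loop if
        # the data accidentally contains a cycle.
        seen = set()
        while item in revision_of and item not in seen:
            seen.add(item)
            item = revision_of[item]
        return item

    families = defaultdict(list)
    for item in items:
        families[root(item)].append(item)
    return {r: sorted(members) for r, members in families.items()}


if __name__ == "__main__":
    items = ["v1", "v2", "v3", "other"]
    revision_of = {"v2": "v1", "v3": "v2"}  # v3 revises v2, v2 revises v1
    print(group_by_revision(items, revision_of))
```

Each resulting family would then correspond to one abstract work, with the flat Publication objects as its versions.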