User:Inga/container tocs

From MPDLMediaWiki

Inga is currently trying to understand the container discussion in the eSciDoc technical team and uses this page to structure results and thoughts.

Everything should start with a definition

The following definition of a Container is taken from the eSciDoc framework specification[1]:

Containers offer the concept of aggregation, i.e. they can contain other (simple and complex) objects. 
Each container includes a structural map and can have a TOC and an AdminDescriptor. A container can be 
a collection or a bundle.

The corresponding container XML schema[2] differs slightly from this description: it allows minOccurs=0 for structMap, i.e. the structural map is optional there.

The CoLab provides some more details[3]:

The information of the aggregated Items and Containers is saved within the StructuralMap of the Container. 
The information of representation of the aggregated Items and Containers is saved within the TableOfContents 
of the Container. The StructuralMap and the TableOfContents of the Container are not necessarily same [...]

and[4]

In eSciDoc hierarchical structures are build by means of container resources. A container resource refers 
to its members which are again containers or items. The set of references is represented as structural map 
(struct-map) inside the representation of a container resource. Additionally a container may contain a table 
of content (TOC) which contains an ordered selection of members.

The paper "Concepts for Versioning"[5] elaborates the idea that bundles could be used to represent digitized books or manuscripts:

eSciDoc allows the grouping of objects by means of container objects like collections or bundles. Bundles 
may represent e.g. books or manuscripts. The contained objects would then be chapters or pages.

The following definitions of StructuralMap and TableOfContents are derived from the logical data model and were provided by Natasa:

StructuralMap is defined for a Container and holds references to all directly aggregated Items/Containers
within that Container. 
TableOfContents is defined for a Container and defines (all or subset) of aggregated objects that should be 
presented in the table of content for the container. 

Why TableOfContent is different from the StructuralMap: TableOfContent uses the information from the StructuralMap but is not necessarily the same information e.g. information presented with the StructuralMap is not by definition an ordered aggregation - TableOfContent is always ordered; while a StructuralMap must contain information on "directly" aggregated objects, table of content not necessarily references all (directly and not directly) aggregated objects. While there is only one StructuralMap allowed for a Container - a content container can have more than one TableOfContents defined (depending on the purpose of use of the table of content). --Natasa 10:40, 14 March 2008 (CET)
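The distinction described above can be sketched as a small data model. This is only an illustration in Python, not eSciDoc code: the names Container, TableOfContents, struct_map, tocs, aggregated_ids and add_toc are hypothetical, and the rule that a TOC may only list objects aggregated directly or indirectly by its container is an assumption read from the definitions above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class TableOfContents:
    """An ordered selection of object ids, defined for one purpose of use."""
    purpose: str
    entries: List[str] = field(default_factory=list)  # order is significant


@dataclass
class Container:
    """Hypothetical container: exactly one structural map, any number of TOCs."""
    object_id: str
    # Unordered references to the directly aggregated items/containers.
    struct_map: Set[str] = field(default_factory=set)
    tocs: List[TableOfContents] = field(default_factory=list)


def aggregated_ids(container: Container, containers_by_id: Dict[str, Container]) -> Set[str]:
    """Collect ids of all directly and indirectly aggregated objects."""
    seen: Set[str] = set()
    stack = [container]
    while stack:
        current = stack.pop()
        for member_id in current.struct_map:
            if member_id in seen:
                continue
            seen.add(member_id)
            child = containers_by_id.get(member_id)  # None for plain items
            if child is not None:
                stack.append(child)
    return seen


def add_toc(container: Container, toc: TableOfContents,
            containers_by_id: Dict[str, Container]) -> None:
    """Attach a TOC after checking that every entry is aggregated by the container."""
    allowed = aggregated_ids(container, containers_by_id)
    unknown = [entry for entry in toc.entries if entry not in allowed]
    if unknown:
        raise ValueError(f"not aggregated by {container.object_id}: {unknown}")
    container.tocs.append(toc)
```

In this sketch a book container could carry one TOC listing a few pages in reading order and a second TOC listing only its chapters, while its structural map stays a single unordered set of direct members.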

Containers and members

Aggregation of digital items introduces potential dependencies between containers and members. For example, it could be reasonable that the release of a member object creates a new version of the container object, or that the release of a container object automatically releases the latest version of its members as well. This is tricky, especially because items can belong to several containers and this information is not part of the item object.
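Because membership is recorded only on the container side, finding the containers affected by a member's release means looking through the structural maps. A minimal sketch in Python, assuming a hypothetical in-memory mapping from container id to its direct member ids; the policy shown (releasing a member creates a new version of each direct container) is only one of the options mentioned above, not the framework's behaviour:

```python
def containers_of(item_id, struct_maps):
    """Return ids of all containers whose structural map references item_id.

    struct_maps is a hypothetical dict: container id -> set of direct member ids.
    """
    return sorted(cid for cid, members in struct_maps.items() if item_id in members)


def release_item(item_id, struct_maps, versions):
    """Bump the version of every container that directly aggregates item_id.

    versions is a hypothetical dict: container id -> current version number.
    Returns the ids of the containers that received a new version.
    """
    affected = containers_of(item_id, struct_maps)
    for cid in affected:
        versions[cid] = versions.get(cid, 1) + 1
    return affected
```

The lookup has to scan every container, which illustrates why the missing back-reference on the item makes such dependencies expensive to resolve.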

More information is available in the eSciDoc Content Model[6] and the concept paper on versioning[5].

My assumptions and recommendations

Please check User_talk:Inga/container_tocs for a discussion of these points. The following conclusions have been copied to Talk:ESciDoc Container Toc.

  • Considering the ViRR requirements, I believe that digitized books are VERY strong entities and that individual scanned pages are not independent resources. Thus every change in the description of a page or in the table of contents should create a new version of the book. I would therefore suggest implementing digitized books as items with a structural map datastream. This would also help us provide METS exports at a later stage, because we would operate at the same granularity. The ViRR project could still be a test bed for containers, because books need to be grouped into multivolumes, and books and multivolumes need to be grouped into the ViRR collection. On that level "no deep level TOC" is required; it is fine to provide a grouped list of direct members first.
  • The TOC is an optional but integral component/member of a digitized book.
    -> Changes in the TOC object should version the container object in any case.
  • Members are independent from their container(s), thus each item can be member of n containers.
    -> Users who would like to provide an additional TOC for an existing container that they have no privilege to edit ("non-editor") could still create their own container including the same set (or a subset) of the items.
  • I would strongly vote for synchronizing definitions, re-considering terms used and harmonizing notations
    • container, i.e. in regard to hierarchical structure
    • table of contents/tableOfContents/TOC/toc - if the eSciDoc TOC is an ordered and grouped overview of [selected] members, it may be semantically in sync with the METS concept "structMap" -> renaming to avoid confusion?
    • StructuralMap/struct-map - if a structural map is the "flat" list of item references, it may be semantically in accordance with the METS concept "fileSec" -> renaming?
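The first two recommendations (a digitized book as one strong entity whose page descriptions and TOC are versioned together) can be illustrated with a short Python sketch. All names here (Book, with_page_description, with_toc) are hypothetical, and the sketch makes no claim about how eSciDoc items or datastreams are actually implemented:

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Book:
    """Hypothetical digitized book treated as a single versioned item."""
    title: str
    pages: Tuple[str, ...]       # page descriptions in scan order
    toc: Tuple[str, ...] = ()    # optional table of contents
    version: int = 1


def with_page_description(book: Book, index: int, description: str) -> Book:
    """Return a new book version in which one page description is changed."""
    if not 0 <= index < len(book.pages):
        raise IndexError(f"no page at index {index}")
    pages = book.pages[:index] + (description,) + book.pages[index + 1:]
    return replace(book, pages=pages, version=book.version + 1)


def with_toc(book: Book, toc) -> Book:
    """Return a new book version with a changed table of contents."""
    return replace(book, toc=tuple(toc), version=book.version + 1)
```

Because Book is immutable, every change yields a new object with a higher version number and the earlier version remains available unchanged.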

References

  1. API Documentation Container Handler, REST interface, framework release 0.9
  2. XML schema Container REST interface, framework release 0.9
  3. ESciDoc_Services:core_services describes the Container Handler service and introduces the terms "Container", "TableOfContents" and "StructuralMap"
  4. The ESciDoc Container Toc article introduces a TOC representation based on RSS 1.0 which is probably no longer under consideration. The Questions & Discussion section may be of further interest.
  5. Matthias Razum, Frank Schwichtenberg: Concepts for Versioning. Last Changed: May 22, 2007
  6. eSciDoc Content Model - according to Natasa, this page was created a long time ago and was never finalized