User:Inga/container tocs

Inga is currently trying to understand the container discussion in the eSciDoc technical team and uses this page to structure results and thoughts.

Everything should start with a definition
The following definition of a Container is taken from the eSciDoc framework specification: "Containers offer the concept of aggregation, i.e. they can contain other (simple and complex) objects. Each container includes a structural map and can have a TOC and an AdminDescriptor. A container can be a collection or a bundle."

The corresponding container XML schemas differ slightly from this definition by allowing minOccurs=0 for structMap, i.e. the structural map is optional there.

The CoLab provides some more details: "The information of the aggregated Items and Containers is saved within the StructuralMap of the Container. The information of representation of the aggregated Items and Containers is saved within the TableOfContents of the Container. The StructuralMap and the TableOfContents of the Container are not necessarily same [...]"

and: "In eSciDoc hierarchical structures are build by means of container resources. A container resource refers to its members which are again containers or items. The set of references is represented as structural map (struct-map) inside the representation of a container resource. Additionally a container may contain a table of content (TOC) which contains an ordered selection of members."

The paper "Concepts for Versioning" elaborates the idea that bundles could be used to represent digitized books or manuscripts: "eSciDoc allows the grouping of objects by means of container objects like collections or bundles. Bundles may represent e.g. books or manuscripts. The contained objects would then be chapters or pages."

The following definitions of StructuralMap and TableOfContents are derived from the logical data model and have been provided by Natasa: StructuralMap is defined for a Container and holds references to all directly aggregated Items/Containers within that Container. TableOfContents is defined for a Container and defines (all or subset) of aggregated objects that should be presented in the table of content for the container.

Why TableOfContent is different from the StructuralMap: TableOfContent uses the information from the StructuralMap but is not necessarily the same information e.g. information presented with the StructuralMap is not by definition an ordered aggregation - TableOfContent is always ordered; while a StructuralMap must contain information on "directly" aggregated objects, table of content not necessarily references all (directly and not directly) aggregated objects. While there is only one StructuralMap allowed for a Container - a content container can have more than one TableOfContents defined (depending on the purpose of use of the table of content). --Natasa 10:40, 14 March 2008 (CET)
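The distinction drawn above can be sketched as a small data model. This is only an illustration under assumptions: the class names `Container` and `TocEntry`, the function `toc_refs`, and the TOC names "reading" and "short" are hypothetical and are not taken from the eSciDoc schemas or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class TocEntry:
    """One node of a table of contents (hypothetical structure)."""
    label: str
    ref: Optional[str] = None  # id of the referenced object; None for a pure grouping node
    children: List["TocEntry"] = field(default_factory=list)


@dataclass
class Container:
    """A container with exactly one struct-map and any number of TOCs."""
    id: str
    struct_map: Set[str] = field(default_factory=set)  # unordered ids of direct members only
    tocs: Dict[str, List[TocEntry]] = field(default_factory=dict)  # one TOC per purpose


def toc_refs(entries: List[TocEntry]) -> List[str]:
    """Return the ids referenced by a TOC in reading order (depth-first)."""
    refs: List[str] = []
    for entry in entries:
        if entry.ref is not None:
            refs.append(entry.ref)
        refs.extend(toc_refs(entry.children))
    return refs


# The book directly aggregates a cover and two chapters.
book = Container(id="book", struct_map={"cover", "chapter-1", "chapter-2"})

# First TOC: ordered, selective, and reaching into chapter-1 for one page.
book.tocs["reading"] = [
    TocEntry("Cover", "cover"),
    TocEntry("Chapter 1", "chapter-1", [TocEntry("Page 2", "page-2")]),
]
# Second TOC for another purpose, covering only a subset of the members.
book.tocs["short"] = [TocEntry("Chapter 2", "chapter-2")]

print(toc_refs(book.tocs["reading"]))
print("page-2" in book.struct_map)
```

In this sketch the struct-map is a set (no order, direct members only), while each TOC is an ordered tree that may skip members ("chapter-2" is missing from the "reading" TOC) and may reference an object that is not a direct member ("page-2").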

Containers and members
Aggregation of digital items introduces potential dependencies between containers and members. For example, it could be reasonable that the release of a member object creates a new version of the container object, or that the release of a container object automatically releases the latest version of its members as well. This is tricky, especially because an item can belong to several containers and this information is not part of the item object.
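A minimal sketch of the first dependency (releasing a member versions its containers) shows why the missing back-reference matters: the containers of an item can only be found by searching the struct-maps. The in-memory dictionaries and the names `containers_of` and `release_member` are assumptions for illustration, not eSciDoc behaviour.

```python
from typing import Dict, List, Set

# Hypothetical in-memory struct-maps: container id -> ids of its direct members.
struct_maps: Dict[str, Set[str]] = {
    "book-a": {"page-1", "page-2"},
    "anthology": {"page-2", "page-3"},
}
# Current version number of each container.
versions: Dict[str, int] = {"book-a": 1, "anthology": 1}


def containers_of(member_id: str) -> List[str]:
    """Find all containers of a member by scanning every struct-map.

    The scan is needed because the item itself does not record
    which containers it belongs to.
    """
    return sorted(cid for cid, members in struct_maps.items() if member_id in members)


def release_member(member_id: str) -> List[str]:
    """One possible policy: releasing a member bumps every container holding it."""
    affected = containers_of(member_id)
    for cid in affected:
        versions[cid] += 1
    return affected


affected = release_member("page-2")
print(affected, versions)
```

Here releasing "page-2" touches both "book-a" and "anthology", which illustrates how a single release can cascade into several containers once items are shared.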

More information is available in the eSciDoc Content Model and the concept paper on versioning.

My assumptions and recommendations
Please check User_talk:Inga/container_tocs for a discussion of these points. The following conclusions have been copied to Talk:ESciDoc Container Toc


 * Considering the ViRR requirements, I believe that digitized books are VERY strong entities and that individual scanned-in pages are not independent resources. Thus every change in the description of a page or in the table of contents should create a new version of the book. Therefore, I would suggest implementing digitized books as items with a structural map datastream. This would also help us provide METS exports at a later stage, because we would operate on the same granularity. The ViRR project could still be a test bed for containers, because books need to be grouped into multivolumes, and books and multivolumes need to be grouped into the ViRR collection. On that level no "deep level TOC" is required; it is fine to provide a grouped list of direct members first.


 * The TOC is an optional but integral component/member of a digitized book -> Changes in the TOC object should version the container object in any case.


 * Members are independent of their container(s), thus each item can be a member of n containers. -> In cases where users would like to provide an additional TOC for an existing container that they have no privilege to edit ("non-editor"), they could still create their own container including the same set (or a subset) of the items.


 * I would strongly vote for synchronizing definitions, reconsidering the terms used and harmonizing notations:
   * container, i.e. with regard to hierarchical structure
   * table of contents/tableOfContents/TOC/toc - if the eSciDoc TOC is an ordered and grouped overview of [selected] members, it may be semantically in sync with the METS concept "structMap" -> rename to avoid confusion?
   * StructuralMap/struct-map - if a structural map is the "flat" list of item references, it may be semantically in accordance with the METS concept "fileSec" -> rename?
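To make the suggested correspondence with METS concrete, the following sketch writes a flat member list into a METS fileSec and an ordered, grouped TOC into a METS structMap. It only illustrates the question raised above: whether the eSciDoc concepts really map this way is still open. The function `build_mets` and its input shapes are hypothetical, and the output is not checked against the METS schema (for example, the file elements carry no FLocat).

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(member_ids, toc):
    """Build a minimal METS-like tree.

    member_ids: flat list of member ids (the eSciDoc struct-map view).
    toc: ordered list of (label, [member ids]) groups (the eSciDoc TOC view).
    """
    root = ET.Element(q("mets"))

    # Flat inventory of members: fileSec / fileGrp / file.
    file_grp = ET.SubElement(ET.SubElement(root, q("fileSec")), q("fileGrp"))
    for member_id in member_ids:
        ET.SubElement(file_grp, q("file"), ID=member_id)

    # Ordered, grouped view: structMap / div / div / fptr.
    struct_map = ET.SubElement(root, q("structMap"), TYPE="logical")
    top = ET.SubElement(struct_map, q("div"), TYPE="book")
    for order, (label, refs) in enumerate(toc, start=1):
        div = ET.SubElement(top, q("div"), LABEL=label, ORDER=str(order))
        for ref in refs:
            ET.SubElement(div, q("fptr"), FILEID=ref)
    return root


root = build_mets(
    ["page-1", "page-2", "page-3"],
    [("Chapter 1", ["page-1", "page-2"])],
)
print(ET.tostring(root, encoding="unicode"))
```

In this example "page-3" appears in the fileSec but not in the structMap, mirroring a TOC that presents only a selection of the aggregated members.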