Difference between revisions of "User:Inga/container tocs"

 
(11 intermediate revisions by the same user not shown)
Line 8: Line 8:
  a collection or a bundle."
  a collection or a bundle."


The corresponding container XML schemas<ref name="container_xsd">[http://www.escidoc-project.de/schemas/rest/container/0.3/container.xsd - XML schema Container] rest interface, framework release 0.9</ref> slightly differs from this perception by allowing minOccurs=0 to structMap.  
The corresponding container XML schemas<ref name="container_xsd">[http://www.escidoc-project.de/schemas/rest/container/0.3/container.xsd XML schema Container] rest interface, framework release 0.9</ref> slightly differs from this perception by allowing minOccurs=0 to structMap.  


The CoLab provides some more details<ref name="core_services">[[ESciDoc_Services:core_services]] describes the Container Handler service and introduces the terms "Container", "TableOfContents" and "StructuralMap"</ref>:
The CoLab provides some more details<ref name="core_services">[[ESciDoc_Services:core_services]] describes the Container Handler service and introduces the terms "Container", "TableOfContents" and "StructuralMap"</ref>:
Line 20: Line 20:
  (struct-map) inside the representation of a container resource. Additionally a container may contain a table  
  (struct-map) inside the representation of a container resource. Additionally a container may contain a table  
  of content (TOC) which contains an ordered selection of members.
  of content (TOC) which contains an ordered selection of members.
The paper "Concepts for Versioning"<ref name="versioning">Matthias Razum, Frank Schwichtenberg: Concepts for Versioning. Last Changed: May 22, 2007</ref> elaborates the idea that bundles could be used to represent digitized books or manuscripts:
eSciDoc allows the grouping of objects by means of container objects like collections or bundles. Bundles
may represent e.g. books or manuscripts. The contained objects would then be chapters or pages.


Following definition for StructuralMap and TableOfContents are derived from the logical data model and have been provided by [[User:Natasab|Natasa]]  
Following definition for StructuralMap and TableOfContents are derived from the logical data model and have been provided by [[User:Natasab|Natasa]]  
Line 33: Line 37:
Aggregation of digital items introduces potential dependencies between containers and members, e.g. it could be reasonable that the release of a member object creates a new version of the container object or that the release of a container object automatically releases the last version of its members as well. Anyway, this stuff is tricky and complicated - especially because items could belong to several containers and this information is not part of the item object.  
Aggregation of digital items introduces potential dependencies between containers and members, e.g. it could be reasonable that the release of a member object creates a new version of the container object or that the release of a container object automatically releases the last version of its members as well. Anyway, this stuff is tricky and complicated - especially because items could belong to several containers and this information is not part of the item object.  


More information is available in the eSciDoc Content Model <ref name="ecm"><[[ESciDoc_Content_Models#Aggregations_and_internal_life-cycle_of_resources |eSciDoc Content Model]] - according to Natasa, this page was created longer time ago and never finalized</ref> and the concept paper on versioning  
More information is available in the eSciDoc Content Model <ref name="ecm">[[ESciDoc_Content_Models#Aggregations_and_internal_life-cycle_of_resources |eSciDoc Content Model]] - according to Natasa, this page was created longer time ago and never finalized</ref> and the concept paper on versioning


== Is there light at the end of the tunnel? ==  
== My assumptions and recommendations ==
* [[ESciDoc Container Toc‎]] states "Grouping of direct members is not necessary; a hierarchical structure is build by container resources wich are linked as members." -> isn't the toc something like a grouping?
* [[ESciDoc Container Toc‎]] states "The only reason to provide more than one TOC for a container resource would be to have different selections of the container resource members. It is assumed that there is no use case for different selections of members of one single container.
* One TOC per container?
* What is the difference between


== The Result: Inga's assumptions and recommendations ==
Please check [[User_talk:Inga/container_tocs]] for a discussion of these points. The following conclusions has been copied to [[Talk:ESciDoc Container Toc‎]]
* Members are independent from their container(s), thus each item can be member of n containers. <br>-> In cases where users would like to provide an additional TOC for an existing container for which they have no privilege ("non-editor"), they still could create their own container including the same set (or subset) of the items.


:As some example, check: http://echo.mpiwg-berlin.mpg.de/content - this is the "overall" Toc of all collections (that represent so to say first level collection of ECHO repository. This would mean that whenever any of the collection of ECHO is changed, and each has its own editor, somehow this would have immediately to affect this TOC - maybe this is very nice automatism, but to use it in "network" like environment may become very complicated :). --[[User:Natasab|Natasa]] 11:04, 14 March 2008 (CET)
* Considering the ViRR requirements I believe that digitized books are VERY strong entities and that individual scanned-in pages are no independent resources. Thus every change in the description of one page or in the table of contents should create a new version of the book. Therefore, I would suggest to implement digitized books as items with an structural map datastream. This would also help us providing METS exports at a later stage because we would operate on the same granularity. Anyway, the ViRR project still could be a test bed for containers, because books need to be grouped to multivolumes as well as books and multivolmes need to be grouped to the ViRR collection. Anyway, on that level "no deep level TOC" is required, it's fine to provide a grouped list of direct members first.


* The TOC is an optional, but integral component/member of a digitized book<br>->Changes in the TOC object should version the container object in any case


 
* Members are independent from their container(s), thus each item can be member of n containers. <br>-> In cases where users would like to provide an additional TOC for an existing container for which they have no privilege ("non-editor"), they still could create their own container including the same set (or subset) of the items.
* The TOC is an optional, but integral component/member of the container<br>->Changes in the TOC object should version the container object


* I would strongly vote for synchronizing definitions, re-considering terms used and harmonizing notations
* I would strongly vote for synchronizing definitions, re-considering terms used and harmonizing notations
:absolutely necessary :) --[[User:Natasab|Natasa]] 11:04, 14 March 2008 (CET)
** container, i.e. in regard to hierarchical structure
** container, i.e. in regard to hierarchical structure
:container is not necessarily hierarchical structure it was originally thought of as aggregation (which may aggregate other items/containers (e.g. aggregations). It again, depends on how one understands the hierarchical structure. In SWB scenarios we've had the case of: CollectionA (member container C1, member container C2, member item I1, member itemI2). In addition (member container C2 of collection A is also a member of the container C1). The only thing which was to be restricted was that we can not have smth like (A has member C2, C2 has member C3, C3 has member C4 and C4 has member C2 (or C4 has member C3). --[[User:Natasab|Natasa]] 11:04, 14 March 2008 (CET)
** table of contents/tableOfContents/TOC/toc - if the escidoc toc is an ordered and grouped overview of [selected] members, it may be semantically in sync with the METS concept "structMap" -> renaming to avoid confusion?
** table of contents/tableOfContents/TOC/toc - if the escidoc toc is an ordered and grouped overview of [selected] members, it may be semantically in sync with the METS concept "structMap" -> renaming to avoid confusion?
** StructuralMap/struct-map - if a structural map is the "flat" list of item reference, it may be semantically in sync with the METS concpet "fileSec" -> renaming?
** StructuralMap/struct-map - if a structural map is the "flat" list of item reference, it may be semantically in accordance to  the METS concept "fileSec" -> renaming?
:(--[[User:Natasab|Natasa]] 11:04, 14 March 2008 (CET)) Dear, do you here refer to the concept or the implementation? It may get even more confusing when we talk about METS concept imho, as there we have the following definition of fileSec element:
<pre><nowiki>
<xsd:element name="fileSec" minOccurs="0">
    <xsd:annotation>
<xsd:documentation>fileSec: Content File Section.
The content file section records information regarding all of the data files which comprise the digital library object.
</xsd:documentation>
    </xsd:annotation>
...
</nowiki></pre>
 
:"flat" list of item reference in eSciDoc is not semantically in sync with the fileSec concept imho as items are not files.--[[User:Natasab|Natasa]] 11:04, 14 March 2008 (CET)


== References ==
== References ==
<references/>
<references/>

Latest revision as of 22:48, 15 March 2008

Inga currently tries to understand the container discussion in the eSciDoc technical team and uses this page to structure results and thoughts.

Everything should start with a definition

The following definition of a Container is taken from the eSciDoc framework specification[1]:

"Containers offer the concept of aggregation, i.e. they can contain other (simple and complex) objects. 
Each container includes a structural map and can have a TOC and an AdminDescriptor. A container can be 
a collection or a bundle."

The corresponding container XML schema[2] differs slightly from this description: it declares structMap with minOccurs=0, which makes the structural map optional.

The CoLab provides some more details[3]:

The information of the aggregated Items and Containers is saved within the StructuralMap of the Container. 
The information of representation of the aggregated Items and Containers is saved within the TableOfContents 
of the Container. The StructuralMap and the TableOfContents of the Container are not necessarily same [...]

and[4]

In eSciDoc hierarchical structures are build by means of container resources. A container resource refers 
to its members which are again containers or items. The set of references is represented as structural map 
(struct-map) inside the representation of a container resource. Additionally a container may contain a table 
of content (TOC) which contains an ordered selection of members.

The paper "Concepts for Versioning"[5] elaborates the idea that bundles could be used to represent digitized books or manuscripts:

eSciDoc allows the grouping of objects by means of container objects like collections or bundles. Bundles 
may represent e.g. books or manuscripts. The contained objects would then be chapters or pages.

The following definitions of StructuralMap and TableOfContents are derived from the logical data model and were provided by Natasa:

StructuralMap is defined for a Container and holds references to all directly aggregated Items/Containers
within that Container. 
TableOfContents is defined for a Container and defines (all or subset) of aggregated objects that should be 
presented in the table of content for the container. 

Why the TableOfContents differs from the StructuralMap: the TableOfContents uses the information from the StructuralMap but does not necessarily hold the same information. For example, the StructuralMap is not by definition an ordered aggregation, whereas a TableOfContents is always ordered. A StructuralMap must contain information on the "directly" aggregated objects, whereas a TableOfContents does not necessarily reference all (directly and indirectly) aggregated objects. Only one StructuralMap is allowed per Container, but a container can have more than one TableOfContents defined (depending on the purpose of the table of contents). --Natasa 10:40, 14 March 2008 (CET)
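The distinction described above can be sketched as a small data model. This is only an illustration, not the eSciDoc implementation: the class names, field names and the `toc_is_valid` check are assumptions made for this sketch, and it reads the definition as allowing a TOC to reference any directly or indirectly aggregated object.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TableOfContents:
    """Hypothetical TOC: an ordered selection of member IDs."""
    purpose: str
    entries: list[str] = field(default_factory=list)  # order matters


@dataclass
class Container:
    """Hypothetical container with exactly one structural map and any number of TOCs."""
    object_id: str
    struct_map: set[str] = field(default_factory=set)  # unordered direct members
    tocs: list[TableOfContents] = field(default_factory=list)


def reachable_members(container_id: str, containers: dict[str, Container]) -> set[str]:
    """Collect the IDs of all directly and indirectly aggregated objects."""
    seen: set[str] = set()
    stack = [container_id]
    while stack:
        container = containers.get(stack.pop())
        if container is None:
            continue  # an item (or unknown ID) has no members of its own
        for member in container.struct_map:
            if member not in seen:
                seen.add(member)
                stack.append(member)
    return seen


def toc_is_valid(toc: TableOfContents, container_id: str,
                 containers: dict[str, Container]) -> bool:
    """A TOC may list any subset of the aggregated objects, in any order."""
    allowed = reachable_members(container_id, containers)
    return all(entry in allowed for entry in toc.entries)


# Example: a book with two chapters; chapter1 is itself a container of pages.
book = Container("book", {"chapter1", "chapter2"})
chapter1 = Container("chapter1", {"page1", "page2"})
containers = {"book": book, "chapter1": chapter1}

reading_order = TableOfContents("reading order", ["chapter1", "page2"])
broken = TableOfContents("broken", ["page9"])
book.tocs.extend([reading_order, broken])
```

Here the structural map is a set (no order, direct members only), while each TOC is a list whose entries may skip members or reach into nested containers.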

Containers and members

Aggregation of digital items introduces potential dependencies between containers and members, e.g. it could be reasonable that the release of a member object creates a new version of the container object or that the release of a container object automatically releases the last version of its members as well. Anyway, this stuff is tricky and complicated - especially because items could belong to several containers and this information is not part of the item object.
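To illustrate why the missing back-reference matters: if membership is recorded only in the containers' structural maps, finding the containers of an item means scanning all structural maps. The following is a minimal sketch under that assumption; the function name and the dictionary layout are hypothetical and not part of the eSciDoc API.

```python
def containers_of(object_id, struct_maps):
    """Return the IDs of all containers that list object_id as a direct member.

    struct_maps maps a container ID to the set of its direct member IDs
    (a hypothetical in-memory stand-in for the containers' structural maps).
    """
    return sorted(
        container_id
        for container_id, members in struct_maps.items()
        if object_id in members
    )


# Example: item1 belongs to two containers, and c2 is itself a member of two.
struct_maps = {
    "collectionA": {"c1", "c2", "item1"},
    "c1": {"c2", "item2"},
    "c2": {"item1"},
}

parents_of_item1 = containers_of("item1", struct_maps)
parents_of_c2 = containers_of("c2", struct_maps)
```

Any rule such as "releasing a member creates a new version of its container" would need a lookup of this kind for every container that references the member.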

More information is available in the eSciDoc Content Model[6] and the concept paper on versioning[5].

My assumptions and recommendations

Please check User_talk:Inga/container_tocs for a discussion of these points. The following conclusions have been copied to Talk:ESciDoc Container Toc.

  • Considering the ViRR requirements, I believe that digitized books are VERY strong entities and that individual scanned-in pages are not independent resources. Thus every change in the description of one page or in the table of contents should create a new version of the book. Therefore, I would suggest implementing digitized books as items with a structural map datastream. This would also help us provide METS exports at a later stage because we would operate on the same granularity. Still, the ViRR project could be a test bed for containers, because books need to be grouped into multivolumes, and books and multivolumes need to be grouped into the ViRR collection. On that level, no "deep level TOC" is required; it's fine to provide a grouped list of direct members first.
  • The TOC is an optional, but integral component/member of a digitized book
    ->Changes in the TOC object should version the container object in any case
  • Members are independent of their container(s), thus each item can be a member of n containers.
    -> In cases where users would like to provide an additional TOC for an existing container for which they have no privilege ("non-editor"), they still could create their own container including the same set (or subset) of the items.
  • I would strongly vote for synchronizing definitions, re-considering terms used and harmonizing notations
    • container, i.e. in regard to hierarchical structure
    • table of contents/tableOfContents/TOC/toc - if the escidoc toc is an ordered and grouped overview of [selected] members, it may be semantically in sync with the METS concept "structMap" -> renaming to avoid confusion?
    • StructuralMap/struct-map - if a structural map is the "flat" list of item references, it may be semantically in accordance with the METS concept "fileSec" -> renaming?

References

  1. API Documentation Container Handler, rest interface, framework release 0.9
  2. XML schema Container rest interface, framework release 0.9
  3. ESciDoc_Services:core_services describes the Container Handler service and introduces the terms "Container", "TableOfContents" and "StructuralMap"
  4. The ESciDoc Container Toc article introduces a TOC representation based on RSS 1.0 which is probably no longer under consideration. The Questions & Discussion section may be of further interest.
  5. Matthias Razum, Frank Schwichtenberg: Concepts for Versioning. Last Changed: May 22, 2007
  6. eSciDoc Content Model - according to Natasa, this page was created a long time ago and never finalized