ESciDoc Developer Workshop 2008-03-25

ESciDoc  Restricted Access to eSciDoc group

Date: 25.03.2008

Location: Karlsruhe, München (Video conference)

Participants MPDL: Natasa Bulatovic, Wilhelm Frank

Participants FIZ: Frank Schwichtenberg, Torsten Tetteroo, Matthias Razum

Start time: 14.00 25.03.2008

ToC XML schema
See also
 * ESciDoc_Developer_Telco_2008-02-26 - Option b) of the described alternatives was agreed.
 * ESciDoc_Developer_Workshop_2008-03-11 - TOC proposal
 * Metadata_Encoding_and_Transmission_Standard - METS struct map

Agreed Outcome

 * start with current schema proposal as the simplest one
 * we all agree that if elements can point to any resource/subresource, validating whether that resource/subresource is actually connected to the container would be expensive and add practically no value
 * we are all clear that the current schema includes some DFG Viewer specifics
 * the idea of a viewer service for TOC objects and item lists will be examined later, once all requirements are clearer
 * MPIWG also strongly confirmed that more than a single TOC per container is needed

Inconsistencies in UserAccount and OrganizationalUnit
Excerpt from an email Torsten sent on February 29th, 2008:

Currently, there are two "organizational-unit" elements, one in the user-account, and one in the context. Both are not the same, because the user-account's element has an additional attribute "primary".

From my point of view, it is a semantic difference that the user account's set of organizational units contains a special primary organizational unit, while the context's set of organizational units does not. Maybe this point of view is wrong, but at the very least it is a syntactical difference between the two "organizational-unit" elements. Therefore, the two currently cannot have the same fully qualified name (i.e. namespace + "local" name).

From a technical point of view, the following solutions are possible:


 * Removing the difference between the two "organizational-unit" elements, either by
   * removing the "primary" attribute from the organizational unit of the user account, or by
   * adding this attribute to the context's organizational unit element.
 * Using different fully qualified names, either by
   * different local names, or by
   * different namespaces.

I think that if both "organizational-unit" elements have the same semantics, they should have neither different local names nor different namespaces, as the full name defines the "meaning" of a properties element. In this case, I would prefer the solution of making the elements equal. But I'm not sure whether this really is possible.

Otherwise, if the two elements do not have the same semantics, different names would be the best solution. If it is not possible to change the local name, different namespaces should be used to distinguish between them. The disadvantage of this solution is that the "organizational-units" and "organizational-unit" elements of the user account would have a different namespace than the other properties elements, which is the reason why I changed the local name instead of the namespace.
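The point about fully qualified names can be illustrated with a minimal sketch. The two namespace URIs below are invented for illustration and are not the actual eSciDoc schema namespaces; the snippet only shows that two elements with the same local name but different namespaces are distinct names for an XML parser, so they may legitimately differ in their attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, invented for this illustration only.
USER_NS = "http://example.org/ns/user-account"
CONTEXT_NS = "http://example.org/ns/context"

user_xml = f'<p:organizational-unit xmlns:p="{USER_NS}" primary="true"/>'
context_xml = f'<p:organizational-unit xmlns:p="{CONTEXT_NS}"/>'

user_el = ET.fromstring(user_xml)
context_el = ET.fromstring(context_xml)


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of the local name."""
    return tag.split("}", 1)[1]


# Same local name, but different fully qualified names.
same_local_name = local_name(user_el.tag) == local_name(context_el.tag)
same_qualified_name = user_el.tag == context_el.tag
```

Here `same_local_name` is true while `same_qualified_name` is false, which is the situation described for the "different namespaces" alternative.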

Let me point out another problem with the relation between the user-account and the organizational-unit, which is related to the federated authentication of the user and needs to be discussed. If the "organizational-unit" element of a user-account points to the organizational-unit resource, as is currently the case, the user's organizational units must already exist in the system when the user logs in and the Shibboleth IDP provides the user's attributes, and they must be identifiable by the provided data. If a user's organizational unit cannot be found, either new organizational units must be created (maybe as top-level OUs), or these non-existing OUs are ignored, meaning they are ignored for authorization, too.

Alternatively, the IDP's provided information about the user's organizational units could be ignored. In this case, the user account must be created with its related organizational units, before the user logs in.

Another solution could be not to define the user account's "organizational-unit" elements as references to organizational-unit resources, but to define them to hold the information as provided by the IDP, without the need to have a corresponding organizational resource defined in the system. But in this case, the user account's "organizational-unit" element would have a different meaning than the context's element. The first element defines the unit that is responsible for the user but need not be known in the system, while the second element specifies a defined organizational unit resource that is responsible for the context.

Agreed outcome

 * to remove the "primary" attribute from the organizational-unit element of the user account
 * for user management there is usually one organizational unit per user account (it is optional, i.e. there may be none)
 * mapping to organizational units in eSciDoc can be done via a corresponding OU attribute provided by the IDP (but not necessarily by ID - we cannot expect each IDP to have the same identifiers and granularity for org units as the eSciDoc organizational units)
 * persons are not the same as users
 * with Shibboleth, a user who should have more than one org unit has to come via a separate IDP, i.e. s/he has to have a separate user account
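The mapping of IDP-provided OU attributes to organizational units, together with the two fallbacks mentioned in the email (create missing units as top-level OUs, or ignore them), could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `OrgUnitRegistry` class, the `resolve_org_units` function, the identifier format, and the matching by normalized name are not part of eSciDoc or Shibboleth.

```python
class OrgUnitRegistry:
    """Hypothetical in-memory stand-in for the organizational units known to the system."""

    def __init__(self, known):
        # known: mapping of OU name -> internal OU id
        self._ids = {self._key(name): ou_id for name, ou_id in known.items()}
        self._counter = 0

    @staticmethod
    def _key(name):
        # Match by normalized name, since IDP identifiers cannot be relied upon.
        return " ".join(name.lower().split())

    def find(self, name):
        return self._ids.get(self._key(name))

    def create_top_level(self, name):
        self._counter += 1
        ou_id = f"ou:new-{self._counter}"  # invented id format
        self._ids[self._key(name)] = ou_id
        return ou_id


def resolve_org_units(idp_values, registry, create_missing=False):
    """Return (resolved OU ids, ignored IDP values) for one login."""
    resolved, ignored = [], []
    for value in idp_values:
        ou_id = registry.find(value)
        if ou_id is None and create_missing:
            ou_id = registry.create_top_level(value)
        if ou_id is None:
            ignored.append(value)  # not used for authorization either
        else:
            resolved.append(ou_id)
    return resolved, ignored


registry = OrgUnitRegistry({"Max Planck Digital Library": "ou:1"})
idp_values = ["max planck  digital library", "Unknown Institute"]
ignore_result = resolve_org_units(idp_values, registry)
create_result = resolve_org_units(idp_values, registry, create_missing=True)
```

With `create_missing` left off, the unknown unit ends up in the ignored list; with it switched on, a new top-level unit is registered instead.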

Problems identified with Container Handler, Fedora
1. Mass ingestion using the ItemHandler/ContainerHandler SOAP API

Using a standalone Java application to
 * create an item (= scanned page) with 3 components
 * release the item
 * add the released item to a container (= scanned book)

We tried to ingest 100 items. Each addition of a new member creates a new version of the container. After about 50 items the container FOXML was destroyed and the Fedora object was no longer accessible.
 * Error: 500 Internal eSciDoc System Error
 * Message: Shoud not occure in FedoraContainerHandler.addMember.
 * Cause: 500 Internal Fedora Error
 * Message: Unable to add or modify object (commit canceled).
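To make the version growth concrete, the ingestion loop described above can be modelled with a toy in-memory container. This is only a sketch under stated assumptions: `InMemoryContainer` and `ingest` are invented names, the starting version number of 1 is an assumption, and nothing here calls the eSciDoc SOAP API or Fedora.

```python
class InMemoryContainer:
    """Toy stand-in for a container (scanned book); not the eSciDoc or Fedora API."""

    def __init__(self):
        self.members = []
        self.version = 1  # assumed starting version

    def add_member(self, item_id):
        self.members.append(item_id)
        # Every membership change yields a new container version.
        self.version += 1


def ingest(container, page_count, components_per_page=3):
    """Create, release and add one item per scanned page."""
    for page in range(1, page_count + 1):
        item = {
            "id": f"item:{page}",
            "components": [
                f"item:{page}/component:{c}"
                for c in range(1, components_per_page + 1)
            ],
            "status": "pending",
        }
        item["status"] = "released"  # release the item before adding it
        container.add_member(item["id"])
    return container


book = ingest(InMemoryContainer(), page_count=100)
```

In this model a book of 100 pages ends with 100 members and 100 additional container versions, so the version count grows linearly with the number of pages ingested.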

2. The ContainerHandler release method forces assignment of a version PID.

Agreed outcome

 * there is a workaround which we can use for a start
 * the problem of the maximum number of supported versions has to be checked with the Fedora people, as this seems to be a serious Fedora bug
 * if this cannot be resolved, we need to check the SWB requirements for Bundles, Containers and Versioning again and rethink the versioning concept for containers (at the moment a new version of a container is created each time a member is added to or removed from it)

Next meeting
The next video call will be on April 09, 15:30.