ESciDoc Developer Workshop 2008-09-02


Date: 02.09.2008 Start time: 14:30

Location: Karlsruhe, München (Video conference)

Participants MPDL: Natasa Bulatovic, Wilhelm Frank

Participants FIZ: Harald Kappus, Michael Hoppe, Frank Schwichtenberg, Matthias Razum

=Agenda=

Content Relations
why mpdl-ontologies? container-toc relation isTocOf


 * the relation for TOC is missing
 * FIZ will check and fix it (Frank was informed by phone)
 * MPDL needs access to the ontology
 * is there any documentation?

Outcome

 * the current procedure for defining a new content relation type is to send the changed ontology file to FIZ
 * Discussion:
 * is isTocOf a structural relation or a content relation?
 * it was agreed some time ago to start with the simpler case of a structural relation, which means that a TOC modification will also update the Container version
 * the TOC relation will be created as a structural relation
 * the Container handler should provide a method for retrieving references to Container TOC objects
 * additional content relations can be created from a Container to a TOC object (in this case, the new content relation should be included in the changed ontology file and sent to FIZ)
 * an srel schema change is necessary

Content-Model-Specific properties (Content-Type-Specific properties)

 * see the initial set of requirements at Local tags handling  --Natasa 07:40, 22 August 2008 (UTC)
 * could issues such as tagging be resolved in another manner? Can semantic store queries be combined with searches? See the use cases mentioned above for Faces and YB data.
 * MPDL thinks a Relation handler implementation would resolve this issue and also enable tagging possibilities other than local tags

Outcome

 * FIZ agrees that using the Relation handler is more appropriate
 * alternatives were discussed, such as allowing objects to be related to string values
 * FIZ and MPDL both recognized that both are valid use cases
 * Relation handler: for cases where some metadata is needed, e.g. a description of the relation or access rights (e.g. public, private, user group)
 * Tags as relations of objects to string values would be part of the Item, e.g. official moderator/librarian tags (such as subject classification)
 * start with the Relation handler for local tags (they should not be indexed for search and should be visible only to moderators)

Technical metadata in Component

 * should be re-designed in the near future

Outcome (2008-08-28)
 * MPDL wants JHOVE to be part of the create and update methods of the infrastructure; whether it is used automatically or not should be configurable

Institutional Visibility
some open issues: the list of organizational units
 * should the list be part of the item/component, so that it is versioned, or should it be outside, since they are not part of the object?
 * I think the list is not, and will not remain, the only visibility criterion; it was just added. We agreed earlier with Torsten to enable the specification of a policy that defines visibility rules, and to relate this policy to each component. Visibility criteria other than private and public will include user group, IP range, etc.--Natasa 07:36, 22 August 2008 (UTC)


 * is there one list of OUs responsible for ALL components of one Item, or will each component have its own list?
 * see above: it depends on the component; each component in an item can have different privileges--Natasa 07:36, 22 August 2008 (UTC)


 * if the editor is not a member of one of the listed OUs, can he access the content?
 * visibility rules apply to logged-in users, who are mostly readers and have no editing privileges granted at collection level--Natasa 07:36, 22 August 2008 (UTC)


 * if a user is a member of a child of one of the listed OUs, can he access the content?
 * yes --Natasa 07:36, 22 August 2008 (UTC)
 * what status do legal OUs have to be in?
 * all open?
 * maybe some are closed?
 * also new?
 * opened or closed, see also the link above on General rules --Natasa 07:36, 22 August 2008 (UTC)


 * From Matthias's email:
 * Institutional Visibility: we had some discussions whether to add the list of OUs to the item itself or to store it in a database and just reference the item from there. If we add the list to the item, it becomes part of the item (it could even create new versions of the item if changed), but is that desirable at all? My rule of thumb for this is always: do I want to keep that information for the long term (e.g., should it go to long-term archival)? Is it an integral part of the item or rather management information that is only valid for the system the item currently resides in? As you can see, it is less a technical but rather a philosophical question.

Outcome (2008-08-28)
 * the visibility flag should have the values "public, private, policy"
 * for "policy": should there be a relation to a policy object, and should a change of policy create new versions?
 * are policies for items and components different?
 * create a Colab discussion on this topic, as it is urgent and very important
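
A minimal sketch of how the three visibility values could combine with the OU rule answered earlier (a member of a child OU of a listed OU may read). This is illustrative only: `Visibility`, `OrgUnit` and `can_read` are hypothetical names, not part of the eSciDoc API, and the handling of "private" (no access for plain readers) is an assumption, since the policy model was still open at this point.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Visibility(Enum):
    PUBLIC = "public"
    PRIVATE = "private"
    POLICY = "policy"


@dataclass(frozen=True)
class OrgUnit:
    """Hypothetical organizational unit with a link to its parent OU."""
    ou_id: str
    parent: Optional["OrgUnit"] = None

    def lineage(self):
        """Yield this OU, then each ancestor up to the root."""
        node = self
        while node is not None:
            yield node
            node = node.parent


def can_read(visibility, user_ou, allowed_ou_ids):
    """Decide read access for a logged-in reader (no editing privileges)."""
    if visibility is Visibility.PUBLIC:
        return True
    if visibility is Visibility.PRIVATE:
        # Assumption: private content is never visible to plain readers.
        return False
    # POLICY: the user's own OU or any of its ancestors must be listed,
    # so members of a child OU inherit access from the listed parent.
    if user_ou is None:
        return False
    return any(ou.ou_id in allowed_ou_ids for ou in user_ou.lineage())
```

Because the check walks from the user's OU upwards, listing a parent OU grants access to its children, but listing a child does not grant access to members of the parent.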

Outcome

 * MPDL provided feedback on Colab pages
 * relevance to advanced/administrative search requirements (e.g. search for all items with public visibility, non-public visibility)

see Institutional visibility requirements discussion

Start JBoss with PubMan EAR

 * current situation/status

Outcome

 * there are still problems when running on the same JBoss; the last.deploy option for JBoss start-up did not help
 * both teams agreed it is very important to enable e.g. PubMan and core-services to work under the same JBoss
 * PubMan properties: the framework pointing to localhost may be the cause of the problem (a solution might be to point to an external, non-existing instance)
 * MPDL will work intensively on this issue during the next weeks, as the problem might be the PubMan initialization (validation, statistics)
 * FIZ is currently working on integrating Fedora with JBoss (no need to run Tomcat) -> first tests successful
 * Regarding further support for PubMan: MPDL will create user accounts for JIRA PubMan Bug Tracking for Frank and Harald
 * Frank to re-send the long email, as Willy is currently on vacation

Input from MPDL on roadmap/component dependencies

 * an Excel sheet will be provided shortly before the conference

Outcome

 * good starting point for understanding dependencies and MPDL plans
 * to be discussed again in the next video conference, with more input from both sides if necessary
 * to be used to better define priorities

First results from mass ingestion

 * MPDL will report results
 * managed to ingest 50000 items in several tries
 * in the meantime, internal web server errors occurred
 * actions for mass ingest
 * integrity of ingested items
 * spurious 409 errors (see http://www.escidoc-project.de/issueManagement/show_bug.cgi?id=636)

Outcome

 * after HTTP 500 errors it is not known what happened to an item
 * a transaction is needed that can roll back all changes as soon as there is one failure
 * FIZ is developing a new method for creation (wrapping all operations into a transaction)
 * FIZ would deliver the first implementation with the ingest tool at the end of September
 * MPDL pointed out that this is also important for normal handler operations
 * the problem with the MPDL tests might be memory consumption (MPDL had no insight into the server-side memory usage)
 * FIZ will run more tests on regular heavy usage of the handler interfaces (creation of e.g. 1 million items), repeating the MPDL tests, and will report further on this issue
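
The all-or-nothing behaviour requested above can be illustrated with a small sketch. It is not the FIZ implementation: `InMemoryStore`, `IngestError` and `create_item_atomically` are hypothetical names, the store is an in-memory stand-in for the repository, and rollback is done here by undoing completed writes rather than by a real transaction manager.

```python
class IngestError(Exception):
    """Raised when one step of a multi-step creation fails."""


class InMemoryStore:
    """Hypothetical stand-in for the repository; not the eSciDoc API."""

    def __init__(self):
        self.objects = {}

    def put(self, key, value):
        # Simulate a server-side failure for an unusable payload.
        if value is None:
            raise IngestError(f"cannot store empty payload for {key}")
        self.objects[key] = value

    def delete(self, key):
        self.objects.pop(key, None)


def create_item_atomically(store, item_id, components):
    """Create an item and its components, undoing every write on failure."""
    written = []
    try:
        for name, payload in components.items():
            key = f"{item_id}/{name}"
            store.put(key, payload)
            written.append(key)
        # The item record itself is written last and lists its components.
        store.put(item_id, sorted(components))
        written.append(item_id)
    except IngestError:
        # Roll back in reverse order so no partial item remains.
        for key in reversed(written):
            store.delete(key)
        raise
    return item_id
```

With this shape, a caller that sees an error knows the item does not exist at all, which addresses the uncertainty after HTTP 500 responses noted above.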

Migration from eDoc or other local systems

 * metadata migration (must not change the version)
 * the item/container handler should not return the migrated metadata record in retrieve output (only on explicit request for this MD record)

Outcome

 * FIZ proposed that metadata from old systems be created as components of the item (with minimal component descriptive metadata)
 * both MPDL and FIZ think it makes sense for retrieve method outputs (single resource, lists) to provide the default metadata record (labeled as ESCIDOC)
 * MPDL proposed adding to these outputs only references to other existing metadata records (these would be retrieved with retrieveMetadataRecord handler methods)
 * this would be a great improvement, since unnecessary data would no longer always be delivered
 * FIZ will provide feedback on when it can be included in the eSciDoc roadmap
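
The proposed retrieve output (default metadata record inline, other records by reference only) could look roughly like the following sketch. The function name, the dictionary keys and the reference path format are assumptions made for illustration; they do not describe the actual eSciDoc representation.

```python
def build_retrieve_output(item_id, md_records, default_name="ESCIDOC"):
    """Return the default metadata record inline, other records as references.

    md_records maps a record name to its content. All names and the
    reference path layout are hypothetical.
    """
    if default_name not in md_records:
        raise KeyError(f"item {item_id} has no default metadata record {default_name!r}")
    return {
        "id": item_id,
        # Only the default record is delivered with its content.
        "md-record": {"name": default_name, "content": md_records[default_name]},
        # Every other record is listed as a reference to fetch on demand.
        "md-record-refs": [
            f"{item_id}/md-records/{name}"
            for name in sorted(md_records)
            if name != default_name
        ],
    }
```

A client that needs a migrated record would then follow the reference explicitly, for example through a retrieveMetadataRecord-style call, instead of receiving it with every retrieve.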

Download of developer versions

 * downloads are currently offered from the escidoc-project pages
 * not offered for fedora 3beta (only for fedora 3.0.1 beta)
 * in case of urgent bug-fixes we would need the same patch escidoc-core-bin.1.0beta3.build.3XX

Outcome

 * FIZ will offer bug fixes for stable releases in the future (separate branch)
 * new features will be part of developer releases only
 * new stable releases will contain new features and all bug fixes done to previous stable releases
 * MPDL will develop on stable releases (and would expect bug fixes for the stable releases)
 * MPDL development on developer releases (at own risk :)
 * There is a possibility to have a stable release before 1.0
 * MPDL will check the bug fixes from 3.0.3 and 3.0.4 and will provide FIZ with a wish list of those bug fixes and features for the intermediate stable release (i.e. either 3.0.3 or 3.0.4)
 * FIZ will provide migration procedures from one stable release to another
 * migration procedures should be provided in advance for test purposes (before the next stable release is fixed) to enable MPDL to test the migration
 * Release 1.0 is planned for the end of September

Miscellaneous

 * super user to wipe out (delete) all objects independent of their status
 * talk about the best method to communicate requirements for core services (Colab?, e.g. http://colab.mpdl.mpg.de/mediawiki/ESciDoc_Administrative_search )
 * Faces image retrieval problem