ESciDoc Developer Workshop 2007-08-07

Date: August 7-8, 2007
Location: MPDL, Munich

There will be another joint eSciDoc workshop in September.

Topics

Prioritization according to MPDL/SMC preparation meeting. To be discussed with FIZ.

Performance

(Prio 1, common)

  • Performance Issues
    • Thumbnails
    • Access to parts of items/containers
    • Specialized item lists: How to retrieve item lists from the framework which do not contain whole items but only predefined item properties? Note: not only item properties may be of interest here but also parts of metadata records!
    • Children information in organizational unit
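
The "specialized item list" request can be pictured as a projection: instead of loading whole items, only selected properties are collected per item. The following is a minimal Python sketch over an in-memory list of triples. The predicate names, item ids and the `item_list` helper are hypothetical and are not part of the framework API.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object); not real framework data.
TRIPLES = [
    ("item:1", "title", "First item"),
    ("item:1", "public-status", "released"),
    ("item:1", "creation-date", "2007-08-01"),
    ("item:2", "title", "Second item"),
    ("item:2", "public-status", "pending"),
]


def item_list(triples, wanted):
    """Return {item id: {predicate: value}} restricted to the wanted predicates."""
    result = defaultdict(dict)
    for subject, predicate, obj in triples:
        if predicate in wanted:
            result[subject][predicate] = obj
    return dict(result)
```

Calling `item_list(TRIPLES, {"title"})` yields one entry per item that carries only its title; all other properties are left out of the list.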

Outcome:

  • done (25.09.2007) The triple store will be tested for storing additional data (component info, md records) to enable faster retrieval of item lists. Issues:
    • what could these lists look like (examples will be provided)?
    • how much storage space will this require?
    • how many objects can be handled?
    • access rights won't be considered during the test, but need to be checked with the production setup
  • done (25.09.2007) end of August: first results of tests run on a Mulgara instance
  • FIZ-Task additional methods using RSS/RDF will possibly be provided
  • done (25.09.2007) Thumbnail: access rights will be checked via the visibility flag on component level for released items. For non-released items, regular rights checking is done.
  • done (25.09.2007) Children information in OU will be added very soon
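
The thumbnail rule agreed above can be written down as a small decision function. This is only a sketch in Python: the status value "released", the visibility value "public" and the `regular_check` callback are assumptions for illustration, not names taken from the framework.

```python
from typing import Callable


def may_access_thumbnail(item_status: str, component_visibility: str,
                         regular_check: Callable[[], bool]) -> bool:
    """Decide thumbnail access following the rule from the outcome list."""
    if item_status == "released":
        # Released items: only the component's visibility flag is consulted.
        return component_visibility == "public"
    # Non-released items: fall back to the regular rights check.
    return regular_check()
```

Passing the regular check as a callback keeps the potentially expensive rights evaluation out of the path taken for released items.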

Versioning

(Prio 1, common)

  • Versioning
    • Is there redundant information given? e.g. item.@last-modification-date <-> item.properties.current-version.date?
    • Remove ambiguity in xml element names (item/current-version vs. item/version)
    • Versioning-specific methods (or will existing methods support a version number in the ItemId?)
      • retrieveItemVersions(itemId) - returns all existing versions for the specified item
      • retrieveItemVersion(specialItemVersionId) - returns the specified item version
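
The alternative of carrying the version number inside the item id could look roughly like the following Python sketch. The id layout `prefix:number[:version]`, the in-memory `STORE` and the function names are assumptions made for illustration only; the minutes do not fix an id format.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical store: item id -> list of version payloads (index 0 is version 1).
STORE: Dict[str, List[str]] = {
    "escidoc:42": ["<item v1/>", "<item v2/>", "<item v3/>"],
}


def split_item_id(item_id: str) -> Tuple[str, Optional[int]]:
    """Split 'prefix:number[:version]' into (base id, version or None)."""
    parts = item_id.split(":")
    if len(parts) == 3:
        return ":".join(parts[:2]), int(parts[2])
    if len(parts) == 2:
        return item_id, None
    raise ValueError(f"unexpected item id: {item_id!r}")


def retrieve_item_versions(item_id: str) -> List[str]:
    """Return all stored versions of the item, oldest first."""
    base, _ = split_item_id(item_id)
    return list(STORE[base])


def retrieve_item_version(item_id: str) -> str:
    """Return the version named in the id, or the latest one if none is given."""
    base, version = split_item_id(item_id)
    versions = STORE[base]
    if version is None:
        return versions[-1]
    if not 1 <= version <= len(versions):
        raise KeyError(f"no version {version} for {base}")
    return versions[version - 1]
```

With this layout an existing retrieve method can serve both cases: an id without a version suffix resolves to the latest version, an id with a suffix resolves to exactly that version.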

Outcome:

  • done (25.09.2007) FIZ Karlsruhe will provide access to retrieve item version history (implemented but currently not publicly available)
  • done (25.09.2007) Element current-version will be renamed to version
  • done (25.09.2007) usage of PREMIS for item history might be reasonable, but is postponed
  • FIZ-Task documentation will be enhanced for object status
  • MPDL-Task clarify where status "in revision" is required in the framework
  • done (25.09.2007) Property status will be renamed to public-status
  • done (25.09.2007) content-type will be renamed to content-model.
  • done (25.09.2007) content-type/component-types/@label will be renamed to content-category.

Content Relations

(Prio 1, common)

  • Retrieving of item relations
  • Content Relations, access rights (to TripleStore method)
  • When will the according services be specified/available?
  • What queries will be possible?

Outcome:

  • done (25.09.2007) Content relation methods on items available since last framework release
  • done (25.09.2007) There is a semantic store service
  • done for one ontology; more than one ontology has to be supported (25.09.2007) The semantic store service will be updated to support filtering by ontology, i.e. queries are restricted to predicates of known ontologies (white lists)
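
The white-list idea can be sketched as a namespace check on the predicates of a query. The Python below is an illustration under stated assumptions: the ontology namespaces are placeholders, and `is_allowed` and `check_query` are hypothetical helpers, not the semantic store interface.

```python
# Hypothetical namespaces of the known ontologies (the white list).
KNOWN_ONTOLOGIES = (
    "http://example.org/ontologies/relations#",
    "http://example.org/ontologies/structure#",
)


def is_allowed(predicate: str, whitelist=KNOWN_ONTOLOGIES) -> bool:
    """True if the predicate URI lies inside one of the whitelisted namespaces."""
    return any(
        predicate.startswith(ns) and len(predicate) > len(ns)
        for ns in whitelist
    )


def check_query(predicates, whitelist=KNOWN_ONTOLOGIES):
    """Return the predicates unchanged, or raise if any is outside the white list."""
    rejected = [p for p in predicates if not is_allowed(p, whitelist)]
    if rejected:
        raise ValueError(f"predicates not in a known ontology: {rejected}")
    return list(predicates)
```

Supporting more than one ontology then only means extending the tuple of namespaces.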

Interfaces

(Prio 1, specific, Bulatovic, Müller, Forkel, Franke)

  • REST API: Would APP be applicable? OpenSearch? Streamlined/refactored XML representations of objects?
  • SOAP Interfaces
    • XSD lax handling of xlink attributes.

Outcome:

  • FIZ-Task (will be done later) The use of standard xml list formats (e.g. Atom) will be implemented later.
  • done (25.09.2007) There will be two schema definitions for REST and SOAP
  • done (25.09.2007) SOAP schemas will disallow xlink attributes.
  • done (25.09.2007) The new schemas/interfaces will be introduced one handler after the other.
  • done (25.09.2007) There will be read-only sections in the XMLs that will not be checked during an update procedure.
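
One possible reading of the read-only sections is that an update keeps the stored values of those sections and ignores whatever the client sends for them. The Python sketch below illustrates this with the standard library's `xml.etree.ElementTree`. The set of read-only element names is hypothetical, namespaces are ignored for brevity, and `apply_update` is not a framework method.

```python
import xml.etree.ElementTree as ET

# Hypothetical read-only element names; the minutes do not list the real ones.
READ_ONLY = {"creation-date", "created-by"}


def apply_update(stored_xml: str, submitted_xml: str) -> str:
    """Merge an update: read-only children keep their stored values,
    every other child is taken from the submitted document."""
    stored = ET.fromstring(stored_xml)
    submitted = ET.fromstring(submitted_xml)
    result = ET.Element(stored.tag, submitted.attrib)
    # Read-only sections come from the stored document, unchecked and unchanged.
    for child in stored:
        if child.tag in READ_ONLY:
            result.append(child)
    # Writable sections come from the client; its read-only content is ignored.
    for child in submitted:
        if child.tag not in READ_ONLY:
            result.append(child)
    return ET.tostring(result, encoding="unicode")
```

A client can therefore send back a retrieved representation unchanged, including stale read-only values, without the update being rejected.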

XSD

(Prio 1, specific, Bulatovic, Müller, Franke)

XSD consolidation

  • Proposal for all non-versioned objects:
    • no element last-modification-date, only attribute in root element
    • sequence of elements is the same in all objects:
      • creation-date (first element after root element)
      • created-by (renamed from creator)
      • modified-by
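
The proposed layout can be checked mechanically. The Python sketch below tests a document against the proposal: `last-modification-date` only as a root attribute, followed by the three leading elements in the given order. It ignores XML namespaces for brevity, and `check_layout` is a hypothetical helper, not part of any schema tooling.

```python
import xml.etree.ElementTree as ET

# Element order taken from the proposal.
LEADING_ELEMENTS = ["creation-date", "created-by", "modified-by"]


def check_layout(xml_text: str) -> list:
    """Return a list of problems; an empty list means the layout matches."""
    root = ET.fromstring(xml_text)
    problems = []
    if "last-modification-date" not in root.attrib:
        problems.append("root element lacks the last-modification-date attribute")
    tags = [child.tag for child in root]
    if "last-modification-date" in tags:
        problems.append("last-modification-date must not be an element")
    if tags[: len(LEADING_ELEMENTS)] != LEADING_ELEMENTS:
        problems.append(
            f"expected leading elements {LEADING_ELEMENTS}, found {tags[:3]}"
        )
    return problems
```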

Outcome: Okay.

  • done (25.09.2007) Changes will be done at once, announced by FIZ.
  • done (25.09.2007) FIZ checks the costs of implementing an ontology for property structure.

XSD versioning

  • FIZ object/resource schemas and MPDL metadata schemas will have different version numbers.
  • Proposal: MPDL metadata schemas are stored in the metadata folder, which leads to the following structure:

[Figure: ESciDoc Metadata Structure.png]

Outcome:

  • MPDL-Task MPDL has to manage metadata schemas and versions and has to make them available online.
  • done (25.09.2007) FIZ won't deliver metadata schemas anymore.

XSD coupling

  • Since framework version 0.8.0034 of 21.05.2007, FIZ object/resource schemas and MPDL metadata schemas are not coupled anymore.
  • Question: How will the coupling of different schemas happen in the future? Will the "external" (from FIZ's point of view) schemas be stored in the framework?

Outcome:

  • will be done later In the future, the content type object will hold metadata version information.

Status User Management

(Prio 2, specific, Bulatovic, Stancheva, Overkamp)

  • How to create new users
  • How to create new user roles
  • Set up a Shibboleth mock federation with two IdPs (MPDL/FIZ)

Outcome:

  • MPDL-Task MPDL will check FIZ concepts and implementation for further discussion.

Validation Service

(Prio 2, specific, Bulatovic, Franke)

  • Validation Service (re-use for Metadata Handler?)
  • Repository

Outcome:

  • Validation service cannot be used to transform between different metadata formats.
  • MPDL-Task MPDL will specify the requirements to store schemas in the framework.

Citation Manager

  • General Discussion

Outcome:

  • Item handler service will be used to store citation styles under a new context and content-type.
  • MPDL-Task Requirements will be provided by MPDL.

Unification of server components

(Prio 2, specific, Bulatovic, Müller, Franke)

  • Running Framework and PubMan together in one JBoss container
    • Known issues:
      • Name of property-file has to be changed
      • Name of EAR file has to be changed. MuJ: Neither FIZ nor MPDL is developing eSciDoc alone. The EAR name of the framework implementation could be changed to framework.ear, the name of the PubMan EAR could be changed to pubman.ear.
      • Port has to be unified; solution: it will be specified in the properties

Outcome: Agreed.

  • Started, will be an ongoing task Unification of server components: After upgrading JBoss Server (4.0.5) and PostgreSQL (8.2) it should now be possible to deploy the Framework and the PubMan application together on one server. Should there be a control mechanism to keep these technologies in sync? What other components can be identified? Datasource names, PostgreSQL database and table names...

Outcome:

  • done (25.09.2007) A colab Wiki page will be created that describes all version, modification and naming information for all components.


  • Production environment
    • Persistent identification
    • Timeline, Test environment
    • Source code access

Outcome:

  • done (25.09.2007) MPDL/FIZ SVNs will be opened to each other.
  • MPDL/FIZ will be done later A public SVN will require a clean cut to not deliver code without licence and copyright statement.
  • MPDL/FIZ will be done later An operational concept for updating, tagging and branching of the public SVN should be developed.
  • MPDL/FIZ will be done later Also a concept for releases and patches has to be developed. At first, FIZ and MPDL will discuss this internally.

PIDs

(Prio 1, specific, Bulatovic)

  • Framework supported functionality

Outcome:

  • MPDL-Task MPDL will test the current PID assignment implementation and possibly develop new requirements.

Release Q3/2007

(Prio 3, common)

  • Features
    • Digilib / Image Scaling
  • Authority Files
  • Statistics
  • Modifications/Changes
  • How can we handle further org units, collections, users, roles

Outcome:

  • see September Workshop Authority file handling and Statistics will possibly be discussed in September as there is further specification needed.
  • MPDL will ingest further needed objects themselves, FIZ will still deliver the actual set of persistent objects.
  • done (25.09.2007) Digilib: FIZ will set up a simple digilib installation and test functionalities and possibilities of framework/Fedora integration.
  • MPDL-Task Digilib: MPDL will check functional and legal requirements for "Faces".
  • done (25.09.2007) Digilib: Results will be written down in a colab Wiki page: Digilib.

Content-types

(Prio 3, specific, Bulatovic, Franke)

  • Additional content-types; general approach for content types
  • Content Type service
  • Content Type definition (what comes from Fedora?)

Outcome:

  • done (25.09.2007) Rename every occurrence of Content-Type to Content-Model.
  • FIZ-Task, will be done later Use of validation service in addition or as alternative to Content-Model definition is evaluated by FIZ

follow up

(Prio 1, common)

Outcome:

  • Update from last workshop
  • Digilib
  • Statistics
  • Authority files
  • Performance
  • Production environment
  • Other eSciDoc solutions
  • Status and Planning