ESciDoc Committer Meeting 2009-10-13

Date: 13.10.2009 Start time: 14:30

Location: Karlsruhe, München (Video conference or TelCo)

Participants MPDL: Natasa Bulatovic, Michael Franke

Participants FIZ: Steffen Wagner, Frank Schwichtenberg, Michael Hoppe, Matthias Razum, Harald Kappus

Previous committer meeting
 * ESciDoc_Committer_Meeting_2009-10-06

Next committer meetings
 * ESciDoc_Committer_Meeting_2009-10-20
 * ESciDoc_Committer_Meeting_2009-10-27

=Topics=

ingests

 * upcoming ingests (using standard resource handlers) of ca. 100000 objects
 * further performance impact?
 * measurements?

Outcome

 * there is currently a repository with approx. 400,000 objects
 * the status of the DB cache and the index is not clear
 * the DB cache might prove to be critical; the creation of some additional indexes was tested to see how it scales
 * indexing takes about 1 sec per object
 * FIZ is currently checking this
 * the scalability problem is not with Lucene but more or less with the DB cache; possible optimizations are to be checked


 * MPDL will provide information on first tests
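
As a rough back-of-the-envelope reading of the figures above, the sketch below estimates total indexing time. It assumes strictly sequential indexing at a constant 1 sec per object, which the minutes do not confirm; the function name `indexing_hours` is purely illustrative.

```python
# Reported in the minutes: about 1 second to index one object.
SECONDS_PER_OBJECT = 1.0


def indexing_hours(object_count, seconds_per_object=SECONDS_PER_OBJECT):
    """Estimate sequential indexing time in hours (assumes a constant rate)."""
    return object_count * seconds_per_object / 3600.0


if __name__ == "__main__":
    # Object counts taken from the minutes: the upcoming ingest and the
    # existing repository.
    for label, count in [("upcoming ingest", 100_000),
                         ("existing repository", 400_000)]:
        print(f"{label}: {count} objects -> about {indexing_hours(count):.1f} h")
    # prints:
    # upcoming ingest: 100000 objects -> about 27.8 h
    # existing repository: 400000 objects -> about 111.1 h
```

If the per-object time grows with repository size, as the DB cache concern suggests it might, these numbers are a lower bound rather than a forecast.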

merging the REST and SOAP representations

 * discussion
 * search works only with the SOAP representation
 * the cache needs to store both the SOAP and the REST representation
 * the first step to reduce cache time would be to unify the SOAP and REST representations
 * still an ongoing discussion in the team
 * the new representation would always contain both the object ID and xlinks
 * either of the two can be provided; if both are provided, they must not contradict each other
 * check the handling of xlink attributes (was an SMC problem; currently we see no big trouble)
 * changes to the schemas would be required in this case
 * loosen some constraints in the schemas, e.g.
 * xlink is mandatory in the REST representation (should be relaxed)
 * MPDL still sees dropping the SOAP interfaces completely as a problem
 * they are used in development
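
The rule that either identifier may be given, but that the two must agree when both are present, could be sketched as follows. This is a minimal illustration and not the eSciDoc implementation: the function name `resolve_object_id` is hypothetical, and the sketch assumes that the object ID is the last path segment of the xlink href.

```python
def resolve_object_id(objid=None, href=None):
    """Return the object ID from an objid and/or an xlink href.

    Assumption for this sketch: the href ends with the object ID,
    e.g. "/ir/item/escidoc:123" refers to "escidoc:123".
    """
    # Take the last path segment of the href, if an href was given.
    href_id = href.rstrip("/").rsplit("/", 1)[-1] if href else None

    # Both provided: they must not contradict each other.
    if objid and href_id and objid != href_id:
        raise ValueError(
            f"objid {objid!r} contradicts xlink href {href!r}"
        )

    # Either one is sufficient on its own.
    result = objid or href_id
    if result is None:
        raise ValueError("neither objid nor xlink href was provided")
    return result
```

With this shape, a client that sends only the object ID and one that sends only the xlink would both be accepted, and a contradictory pair would be rejected early instead of being resolved silently in favor of one side.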

Outcome

 * dropping the SOAP interfaces is no longer under discussion (they will not be dropped)
 * an example schema for the unified representation is to be sent to MPDL for running JiBX tests

drop latest version from representation

 * the latest version in the representation causes trouble with the DB cache when both the latest release and the latest version should be part of the DB cache

Outcome

 * related to the DB cache
 * how to do it with the DB cache (to be resolved with the DB cache)
 * in any case, the item representation should not be changed if possible

set component title directly

 * derived from the DC metadata datastream
 * used for the title and filename of the component
 * it was hard to understand why the title cannot be set

Outcome

 * should be handled together with the unification of the SOAP and REST representations
 * document the current state
 * see how things could be done then
 * if the title is to be set directly for components, then, to keep all interfaces unified, this would probably affect how titles are set for all objects

=Follow-up Topics of Meeting München=

http://colab.mpdl.mpg.de/mediawiki/ESciDoc_Developer_Workshop_2009-07-29/30

Fundamental changes

 * 1) Replace atomistic model for Items/Components with compound model and RELS-INT
 * 2) dropping SOAP?
 * 3) Drop latest-version section from object representation
 * 4) set title directly
 * Replace DB-Cache with asynchronous Lucene Index and/or Object Database
 * synchronous Lucene Index
 * Persistent data objects in rel. DB
 * Remove mapping of "escidoc" MD-record to DC record in Components
 * Get rid of content-model-specific properties

Specifics

 * Search and administrative search
 * Admin Tools development
 * Large sets of data ingest
 * how to avoid downtime for recaching and reindexing
 * trying to add/remove members to/from a very large container fails with 500 Internal eSciDoc System Error
 * in progress

others

 * Alignment of tools and processes (e.g., Maven)
 * Improved and harmonized communication of eSciDoc
 * eSciDoc Blog
 * service names and classification
 * service-architecture board
 * documentation of services
 * installation guides
 * eSciDoc Lab: Colab page gathering experimental modules
 * Exchange of staff members for specific developments or share development

Planning

 * short-term: 6 months

=Long term issues=

Release 1.2

 * a first release candidate is scheduled for mid-October

Outcome

 * most features are finalized
 * content models are in tests
 * content relations + filters have to be tested

PubMan clean-up

 * from the beginning of November, the MPDL solutions (such as PubMan, Faces, ...) will run in the same JBoss as the core services

Outcome

 * Info: the next PubMan release (beginning of November) will run in the same JBoss as the core services

Topics for joint development

 * start the two groups

eSciDoc Colab

 * domain redirection for the eSciDoc Colab
 * set up the Colab and move the eSciDoc pages from the MPDL Colab