ESciDoc Committer Meeting 2009-10-13

Date: 13.10.2009 Start time: 14:30

Location: Karlsruhe, München (Video conference or TelCo)

Participants MPDL: Natasa Bulatovic, Michael Franke

Participants FIZ: Steffen Wagner, Frank Schwichtenberg, Michael Hoppe, Matthias Razum, Harald Kappus


Topics

ingests

  • upcoming ingests (using standard resource handlers) of ca. 100000 objects
    • further performance impact?
    • measurements?

Outcome

  • there is currently a repository with approx. 400,000 objects
    • status of the DB cache and index is not clear
    • the DB cache might prove critical; creation of some additional indexes is being tested to see how it scales
    • indexing takes about 1 sec per object
    • FIZ is currently checking this
    • the scalability problem is not in Lucene but mostly in the DB cache; possible optimizations are to be checked
  • MPDL will provide information on first tests
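
For orientation, the indexing figure above can be turned into a rough estimate for the upcoming ingest. This is only a back-of-the-envelope sketch: it assumes that the 1 sec per object holds constant and that indexing runs strictly sequentially, neither of which was confirmed in the meeting. The function name is made up for illustration.

```python
def estimated_indexing_hours(object_count: int, seconds_per_object: float = 1.0) -> float:
    """Rough sequential indexing time in hours (assumes constant cost per object)."""
    return object_count * seconds_per_object / 3600.0


# ca. 100,000 objects at about 1 sec per object
print(f"{estimated_indexing_hours(100_000):.1f} h")  # prints "27.8 h"
```

Under these assumptions the planned ingest alone would keep the indexer busy for more than a day, which is why measurements from the first tests matter.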

merging the REST and SOAP representations

  • discussion
    • search works only with the SOAP representation
    • the cache needs to store both the SOAP and the REST representation
    • the first step to reduce cache time would be to unify the SOAP/REST representations
    • still an ongoing discussion in the team
      • the new representation would always contain both object ID and xlinks
      • whichever is provided would be used; if both are provided, they must not contradict each other
      • check the handling of xlink attributes (was an SMC problem; currently we see no big trouble)
    • changes to the schemas would be required in this case
    • some constraints in the schemas should be loosened, e.g.
      • xlink is mandatory in the REST representation (should be relaxed)
    • MPDL still sees dropping the SOAP interfaces completely as a problem
      • they are used in development
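
The rule discussed above (use whichever reference is provided, reject contradictions) could be sketched as follows. This is a minimal illustration, not the agreed implementation: the function name is hypothetical, and it assumes that the last path segment of the xlink:href is the object ID, which the minutes do not state.

```python
from typing import Optional


def resolve_object_id(objid: Optional[str], href: Optional[str]) -> str:
    """Return the object ID named by objid and/or xlink:href.

    Assumption (not from the minutes): the last path segment of the
    href is the object ID, e.g. "/ir/item/escidoc:123".
    """
    id_from_href = None
    if href:
        # take the last path segment, ignoring a trailing slash
        id_from_href = href.rstrip("/").rsplit("/", 1)[-1]
    if objid and id_from_href and objid != id_from_href:
        raise ValueError(f"objid {objid!r} contradicts xlink:href {href!r}")
    resolved = objid or id_from_href
    if not resolved:
        raise ValueError("neither objid nor xlink:href was provided")
    return resolved
```

With such a check, a client could send either style of reference against the unified representation, and mismatched pairs would be rejected early instead of being stored in the cache.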

Outcome

  • dropping the SOAP interfaces is off the table (they will not be dropped)
  • an example schema for the unified representation is to be sent to MPDL for running JiBX tests

drop latest version from representation

  • the latest version in the representation causes trouble with the DB cache when both the latest release and the latest version should be part of the DB cache

Outcome

  • related to the DB cache
  • how to do this is to be resolved together with the DB cache
  • in any case, the item representation should not be changed if possible

set component title directly

  • the title is derived from the DC metadata datastream
  • it is used for the title and filename of the component
  • it was hard to understand why the title cannot be set

Outcome

  • should be tied to the unification of the SOAP/REST representations
  • document the current state
  • then see how this could be done
  • if the title is to be set directly for components, then, to keep all interfaces uniform, this will probably affect how titles are set for all objects

Follow-up topics of the München meeting

http://colab.mpdl.mpg.de/mediawiki/ESciDoc_Developer_Workshop_2009-07-29/30

Fundamental changes

  1. Replace atomistic model for Items/Components with compound model and RELS-INT
  2. dropping SOAP?
  3. Drop latest-version section from object representation
  4. set title directly
  • Replace DB-Cache with asynchronous Lucene Index and/or Object Database
  • synchronous Lucene Index
  • Persistent data objects in a relational DB
  • Remove mapping of "escidoc" MD-record to DC record in Components
  • Get rid of content-model-specific properties

Specifics

  • Search and administrative search
  • Admin Tools development
  • Ingest of large data sets
    • how to avoid downtime for recaching and reindexing
  • Trying to add/remove members to a very large container fails with 500 Internal eSciDoc System Error
    • in progress

others

  • Alignment of tools and processes (e.g., Maven)
  • Improved and harmonized communication of eSciDoc
  • eSciDoc Blog
  • service names and classification
  • service-architecture board
  • documentation of services
  • installation guides
  • eSciDoc Lab: Colab page gathering experimental modules
  • Exchange of staff members for specific developments or shared development

Planning

  • short-term 6 months

Long-term issues

Release 1.2

  • a first release candidate is scheduled for mid-October

Outcome

  • most features are finalized
  • content models are in tests
  • content relations + filters have to be tested

PubMan clean-up

  • from the beginning of November, the MPDL solutions (such as PubMan, Faces, ...) will run in the same JBoss as the core services

Outcome

  • INFO MPDL:
    • the next PubMan release (beginning of November) will run in the same JBoss as the core services

Topics for joint development

  • start the two groups

eSciDoc Colab

  • domain redirection for the eSciDoc colab
  • set up the colab and move the eSciDoc pages from the MPDL colab