Developer Workshop, September 22/23, 2010 in Karlsruhe

Developer Workshop

(September 22/23, 2010 in Karlsruhe)

Participants MPDL

  • Natasa Bulatovic
  • Wilhelm Frank
  • Michael Franke
  • Vladislav Makarenko

Participants FIZ

  • Steffen Wagner
  • Michael Hoppe
  • André Schenk
  • Mario Rehkop (on Wednesday)
  • Eduard Hildebrandt (on Wednesday)
  • Marko Voß
  • Alexander Spetko (on Wednesday)
  • Matthias Razum
  • Frank Schwichtenberg (on Thursday)
  • Harald Kappus

Results of Day 1

  • Administrative search:
    • filters vs Lucene
    • Solr interfaces
  • >>>>> Outcome
    • MPDL wants a method to flush the SearchConfig to be able to activate a new configuration without stopping JBoss.
  • features of next releases
    • Presentation by Matthias
  • >>>>> Outcome
    • Removal of the SOAP interface is possible
      • if a WADL for REST is available
      • or if use of the Client-Lib is possible (MPDL will check this)
  • Scalability
    • Presentation by Steffen and Matthias
  • Performance
    • Presentation by Steffen and Matthias

Results of Day 2

  • Content Relations
  • >>>>> Outcome
    • Only the "old" Content Relations are in use at the moment
    • The new ones can be marked as deprecated since the Fedora object isn't needed in that way.
  • RDF Support
  • >>>>> Outcome
    • Metadata delivered as RDF should be stored in the TripleStore
    • But no real use cases at the moment
    • Will be discussed later when it's needed in FACES
  • Content Models
  • >>>>> Outcome

Possible enhancements

    • Status transitions
    • Cascade information
    • PID assignment
    • DC Mapping (exists) should use the complete item (incl. components), not only some data from the item
    • Define Name of Main Metadata record
    • Versioning on/off
    • Datastream handling as internally/externally managed (for metadata DS) because of Batch Updates
    • Validation of object structure (mandatory, optional, which MD-DS, ...) [Priority 1]
    • Validation of MD-records using a schema

to be defined: Priorities, in which release

  • Relational Database and PubMan
  • >>>>> Outcome
    • PubMan doesn't use Postgres
    • Only cohen uses Postgres
    • Not much work to replace Postgres with Oracle
  • Transformation Service (instead of SOAP-Interface / Client Lib)
  • >>>>> Outcome
    • Presentation by Michael
    • It's a framework for several transformations
  • MDStore-batch update service
  • >>>>> Outcome
  • System integrity checks
  • >>>>> Outcome
    • before version upgrade
    • check some areas before start-up of the Infrastructure
    • check data before data migration
    • check data integrity by request
    • Re-indexing
      • should be split into groups, e.g. re-index OUs only or all objects that belong to a specific context
      • check the number of objects to be re-indexed
    • would be nice to have a data integrity check from PubMan
    • monitor information like used/free space for Fedora, DB, Lucene on request or at start-up
  • Read-only mode for backups, migration of coreservice
  • >>>>> Outcome
    • Login should be allowed but no change of data
    • Statistics should be gathered
    • Switch to read-only and allow applications to ask for the status
    • There might be a delay (some minutes) between the request to switch to read-only and the switch actually taking effect
    • The requirements for "what to back up" and "which parts belong to one backup" have to be defined, e.g. statistics could be backed up separately
    • Split statistics into a different database
    • Split the gathering of raw statistics into a different schema
    • Or switch to a different storage system (name-value storage), data source
  • System/Data migration
  • >>>>> Outcome
    • FOXML
    • Fedora-Rebuild
    • DB-Migration
    • Recache
    • Re-index
    • should be smarter and faster, e.g. start all steps one after the other automatically, reduce the need for some steps
  • eSciDoc Days: Developer Workshops
  • >>>>> Outcome
    • installation of eSciDoc should be done before the start of the tracks
    • and at least one server should be available for testing purposes
  • Fedora - datastream management
  • >>>>> Outcome
    • All datastreams can be treated as internally or externally managed
    • Will be used as best fits eSciDoc's needs
  • Wrap-up
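
The read-only mode outcome above (logins allowed, no data changes, a status that applications can ask for, and a possible delay between requesting the switch and performing it) could be sketched roughly as follows. This is only an illustration and not the eSciDoc implementation: the names ModeSwitch, Mode, ReadOnlyError, begin_write and end_write are hypothetical, and the assumption that the delay comes from waiting for writes already in progress is ours, not a workshop decision.

```python
import threading
from enum import Enum


class Mode(Enum):
    READ_WRITE = "read-write"
    SWITCHING = "switching-to-read-only"  # switch requested, writes still finishing
    READ_ONLY = "read-only"


class ReadOnlyError(Exception):
    """Raised when a write is attempted while new writes are not accepted."""


class ModeSwitch:
    """Hypothetical gatekeeper that write operations would have to pass through."""

    def __init__(self):
        self._lock = threading.Lock()
        self._mode = Mode.READ_WRITE
        self._writes_in_flight = 0

    def status(self):
        # Applications can ask for the current mode at any time.
        with self._lock:
            return self._mode

    def request_read_only(self):
        # New writes are refused immediately; the switch completes
        # once all writes that were already running have ended.
        with self._lock:
            if self._mode is Mode.READ_WRITE:
                self._mode = Mode.SWITCHING
            self._finish_switch_if_idle()

    def resume_read_write(self):
        with self._lock:
            self._mode = Mode.READ_WRITE

    def begin_write(self):
        with self._lock:
            if self._mode is not Mode.READ_WRITE:
                raise ReadOnlyError("system is " + self._mode.value)
            self._writes_in_flight += 1

    def end_write(self):
        with self._lock:
            self._writes_in_flight -= 1
            self._finish_switch_if_idle()

    def _finish_switch_if_idle(self):
        # Caller must hold the lock.
        if self._mode is Mode.SWITCHING and self._writes_in_flight == 0:
            self._mode = Mode.READ_ONLY
```

In this sketch, logins and other reads never call begin_write, so they keep working in every mode. Statistics gathering would have to bypass the gate as well, which fits the idea of moving raw statistics into a separate schema or storage system.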