Developer Workshop (September 22/23, 2010 in Karlsruhe)

Participants MPDL

 * Natasa Bulatovic
 * Wilhelm Frank
 * Michael Franke
 * Vladislav Makarenko

Participants FIZ

 * Steffen Wagner
 * Michael Hoppe
 * André Schenk
 * Mario Rehkop (on Wednesday)
 * Eduard Hildebrandt (on Wednesday)
 * Marko Voß
 * Alexander Spetko (on Wednesday)
 * Matthias Razum
 * Frank Schwichtenberg (on Thursday)
 * Harald Kappus

Results of Day 1

 * Administrative search:
 * filters vs Lucene
 * Solr interfaces
 * >>>>> Outcome
 * MPDL wants a method to flush the SearchConfig to be able to activate a new configuration without stopping JBoss.
 * features of next releases
 * Presentation by Matthias
 * >>>>> Outcome
 * Removal of the SOAP interface is possible
 * if WADL for REST is available
 * or if usage of the Client-Lib is possible (MPDL will check it)
 * Scalability
 * Presentation by Steffen and Matthias
 * Performance
 * Presentation by Steffen and Matthias

Results of Day 2
Possible enhancements to be defined: Priorities, in which release
 * Content Relations
 * >>>>> Outcome
 * Only the "old" Content Relations are in use at the moment
 * The new ones can be marked as deprecated since the Fedora object isn't needed in that way.
 * RDF Support
 * >>>>> Outcome
 * Metadata delivered as RDF should be stored in the TripleStore
 * But no real use cases at the moment
 * Will be discussed later when it's needed in FACES
 * Content Models
 * >>>>> Outcome
 * Status transitions
 * Cascade information
 * PID assignment
 * DC Mapping (exists) should use the complete item (incl. components), not only some data from the item
 * Define Name of Main Metadata record
 * Versioning on/off
 * Datastream handled as internally/externally managed (for metadata DS) because of batch updates
 * Validation of object structure (mandatory, optional, which MD-DS, ...) [Priority 1]
 * Validation of MD-records using a schema
 * Relational Database and PubMan
 * >>>>> Outcome
 * PubMan doesn't use Postgres
 * Only cohen uses Postgres
 * Not much work to replace Postgres with Oracle
 * Transformation Service (instead of SOAP-Interface / Client Lib)
 * >>>>> Outcome
 * Presentation by Michael
 * It's a framework for several transformations
 * MDStore-batch update service
 * >>>>> Outcome
 * Presentation by Natasa
 * It's a demonstrator / a study on manipulating MD-records quickly
 * See http://colab.mpdl.mpg.de/mediawiki/MD_Store and http://colab.mpdl.mpg.de/mediawiki/MD_Store/Architecture
 * Includes usage of AA
 * Converts MD-XML into MD-RDF and stores it in a triple store
 * Index contains only the first version of the MD but not the updated MD
 * 2 minutes to update 2000 MD-records using "Jena TDB"
 * Is there a chance to include this Service as a core service?
 * System integrity checks
 * >>>>> Outcome
 * before version upgrade
 * check some areas before startup of the infrastructure
 * check data before data migration
 * check data integrity by request
 * Re-indexing
 * should be split into groups, e.g. re-index OU only or all objects which belong to a specific context
 * check the number of objects to be re-indexed
 * would be nice to have a data integrity check from PubMan
 * monitor information like used/free space for Fedora, DB, Lucene on request or at startup
 * Read-only mode for backups, migration of coreservice
 * >>>>> Outcome
 * Login should be allowed but no change of data
 * Statistics should be gathered
 * Switch to read-only and allow applications to ask for the status
 * There might be a delay (some minutes) between starting the switch to read-only and performing it
 * The requirements for "what to back up" and "which parts belong to one backup" have to be defined, e.g. statistics could be backed up separately
 * Split statistics into a different database
 * Split of gathering raw statistics into a different schema
 * Or switch to a different storage system (name-value storage), data source
 * System/Data migration
 * >>>>> Outcome
 * FOXML
 * Fedora-Rebuild
 * DB-Migration
 * Recache
 * Re-index
 * should be smarter and faster, e.g. run all steps one after another automatically, reduce the need for some steps
 * eSciDoc Days: Developer Workshops
 * >>>>> Outcome
 * installation of eSciDoc should be done before start of the tracks
 * and at least one server for testing purposes
 * Fedora - datastream management
 * >>>>> Outcome
 * All datastreams can be treated as internally or externally managed
 * Will be used as best fits eSciDoc's needs
 * Wrap-up
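
The read-only mode outcome above (no change of data, applications can ask for the status, a possible delay between starting and performing the switch) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the eSciDoc implementation: the names `Mode`, `ReadOnlyGate`, `request_read_only`, `begin_write` and `end_write` are hypothetical, and the delay is modelled simply as waiting for writes that are already running to finish.

```python
import threading
from enum import Enum


class Mode(Enum):
    READ_WRITE = "read-write"
    SWITCHING = "switching"   # switch requested, running writes still finishing
    READ_ONLY = "read-only"


class ReadOnlyGate:
    """Hypothetical gate that tracks running writes and the current mode."""

    def __init__(self):
        self._lock = threading.Lock()
        self._mode = Mode.READ_WRITE
        self._active_writes = 0

    def status(self):
        """Let applications ask for the current mode."""
        with self._lock:
            return self._mode

    def request_read_only(self):
        """Start the switch; it completes once running writes have finished."""
        with self._lock:
            if self._mode is Mode.READ_WRITE:
                self._mode = Mode.SWITCHING
            self._finish_switch_if_idle()

    def begin_write(self):
        """Return True if a new write may start, False if it is rejected."""
        with self._lock:
            if self._mode is not Mode.READ_WRITE:
                return False
            self._active_writes += 1
            return True

    def end_write(self):
        """Mark a write as finished and complete a pending switch if possible."""
        with self._lock:
            self._active_writes -= 1
            self._finish_switch_if_idle()

    def _finish_switch_if_idle(self):
        # Caller must hold the lock.
        if self._mode is Mode.SWITCHING and self._active_writes == 0:
            self._mode = Mode.READ_ONLY
```

In this sketch only write operations pass through the gate; logins, reads and the gathering of statistics are assumed to bypass it and therefore keep working while the system is read-only.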