Developer Workshop September 22 23 2010 in Karlsruhe

Developer Workshop

(September 22/23, 2010 in Karlsruhe)

Participants MPDL

Natasa Bulatovic
Wilhelm Frank
Michael Franke
Vladislav Makarenko

Participants FIZ

Steffen Wagner
Michael Hoppe
André Schenk
Mario Rehkop (on Wednesday)
Eduard Hildebrandt (on Wednesday)
Marko Voß
Alexander Spetko (on Wednesday)
Matthias Razum
Frank Schwichtenberg (on Thursday)
Harald Kappus

Results of Day 1

Administrative search:
- filters vs Lucene
- Solr interfaces
>>>>> Outcome
- MPDL wants a method to flush the SearchConfig to be able to activate a new configuration without stopping JBoss.
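
The flush request above can be pictured as swapping the active configuration while the server keeps running. A minimal sketch, assuming a hypothetical SearchConfigHolder class and a load callback; neither name comes from eSciDoc, and the real SearchConfig handling may differ:

```python
import threading
from typing import Callable, Dict


class SearchConfigHolder:
    """Hypothetical holder for the active search configuration (not an eSciDoc class)."""

    def __init__(self, load: Callable[[], Dict[str, str]]):
        self._load = load              # assumed callback that reads the configuration
        self._lock = threading.Lock()
        self._config = load()

    def get(self) -> Dict[str, str]:
        # Readers always see a complete configuration object.
        return self._config

    def flush(self) -> None:
        # Build the new configuration first, then swap the reference,
        # so a failing load leaves the old configuration active.
        new_config = self._load()
        with self._lock:
            self._config = new_config
```

Calling flush() after editing the configuration source activates the new settings without a restart; requests already holding the old object finish with the old settings.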
Features of next releases
- Presentation by Matthias
>>>>> Outcome
- Removal of the SOAP interface is possible
  - if a WADL for REST is available
  - or if usage of the Client-Lib is possible (MPDL will check it)
Scalability
- Presentation by Steffen and Matthias
Performance
- Presentation by Steffen and Matthias

Results of Day 2

Content Relations
>>>>> Outcome
- Only the "old" Content Relations are in use at the moment
- The new ones can be marked as deprecated since the Fedora object isn't needed in that way.
RDF Support
>>>>> Outcome
- Metadata delivered as RDF should be stored in the TripleStore
- But no real use cases at the moment
- Will be discussed later when it's needed in FACES
Content Models
>>>>> Outcome

Possible enhancements

- Status transitions
- Cascade information
- PID assignment
- DC Mapping (already exists) should use the complete item (incl. components), not only some data from the item
- Define Name of Main Metadata record
- Versioning on/off
- Datastreams handled as internally/externally managed (for metadata DS) because of Batch Updates
- Validation of object structure (mandatory, optional, which MD-DS, ...) [Priority 1]
- Validation of MD-records using a schema

to be defined: Priorities, in which release
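
The priority-1 item above, validation of object structure, could look roughly like the following sketch. ContentModelRule and validate_structure are hypothetical names, and the rule format (sets of mandatory and optional MD record names) is an assumption, not something defined at the workshop:

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class ContentModelRule:
    """Hypothetical structure rule attached to a content model."""
    mandatory: FrozenSet[str]
    optional: FrozenSet[str] = frozenset()


def validate_structure(rule: ContentModelRule, md_records: Iterable[str]) -> List[str]:
    """Return a list of problems; an empty list means the structure is valid."""
    present = set(md_records)
    errors = [f"missing mandatory MD record: {name}"
              for name in sorted(rule.mandatory - present)]
    errors += [f"unexpected MD record: {name}"
               for name in sorted(present - rule.mandatory - rule.optional)]
    return errors
```

Validation of the MD records themselves against a schema would be a separate step on top of this structural check.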

Relational Database and PubMan
>>>>> Outcome
- PubMan doesn't use Postgres
- Only cohen uses Postgres
- Not much work to replace Postgres with Oracle
Transformation Service (instead of SOAP-Interface / Client Lib)
>>>>> Outcome
- Presentation by Michael
- It's a framework for several transformations
MDStore-batch update service
>>>>> Outcome
- Presentation by Natasa
- It's a demonstrator / a study on how to manipulate MD-records quickly
- See http://colab.mpdl.mpg.de/mediawiki/MD_Store and http://colab.mpdl.mpg.de/mediawiki/MD_Store/Architecture
- Includes usage of AA
- Converts MD-XML into MD-RDF and stores it in a TripleStore
- Index contains only the first version of MD but not the updated MD
- 2 minutes to update 2000 MD using "Jena TDB"
- Is there a chance to include this Service as a core service?
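
The MD-XML to MD-RDF conversion mentioned above can be illustrated with a small sketch. It is not the MD Store implementation: the function name, the subject identifier and the flat one-triple-per-child mapping are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

Triple = Tuple[str, str, str]


def md_xml_to_triples(subject: str, md_xml: str) -> List[Triple]:
    """Flatten the direct children of a metadata record into triples."""
    root = ET.fromstring(md_xml)
    triples = []
    for child in root:
        text = (child.text or "").strip()
        if not text:
            continue  # skip empty and purely structural elements
        # ElementTree spells a qualified tag as "{namespace}local"; joining
        # both parts gives a predicate URI for namespaces ending in "/" or "#".
        predicate = child.tag.replace("{", "").replace("}", "")
        triples.append((subject, predicate, text))
    return triples
```

For scale, the reported 2 minutes for 2000 records corresponds to 120 s / 2000 = 0.06 s, i.e. about 60 ms per record.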
System integrity checks
>>>>> Outcome
- before version upgrade
- check some areas before start up of Infrastructure
- check data before data migration
- check data integrity by request
- Re-indexing
  - should be split into groups, e.g. re-index OUs only or all objects which belong to a specific context
  - check the number of objects to be re-indexed
- would be nice to have a data integrity check from PubMan
- monitor information like used/free space for Fedora, DB, Lucene on request or at start-up
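
The grouped re-indexing described above (by object type or by context, with a count beforehand) might be sketched as follows. The function name and the dictionary keys "type" and "context" are hypothetical and do not reflect the eSciDoc data model:

```python
from typing import Dict, Iterable, List, Optional


def select_for_reindex(objects: Iterable[Dict[str, str]],
                       object_type: Optional[str] = None,
                       context: Optional[str] = None) -> List[Dict[str, str]]:
    """Return the objects matching the requested type and/or context.

    With no filter given, every object is selected (a full re-index).
    """
    return [o for o in objects
            if (object_type is None or o["type"] == object_type)
            and (context is None or o.get("context") == context)]
```

Taking len() of the result gives the number of objects to be re-indexed, which can be checked before the run is started.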
Read-only mode for backups, migration of coreservice
>>>>> Outcome
- Login should be allowed but no change of data
- Statistics should be gathered
- Switch to read-only and allow applications to ask for the status
- There might be a delay (some minutes) between requesting the switch to read-only and the switch taking effect
- The requirements for "what to back up", "which parts belong to one backup" have to be defined, e.g. statistics could be backed up separately
- Split statistics into a different database
- Move the gathering of raw statistics into a different schema
- Or switch to a different storage system (name-value storage), data source
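
The switch to read-only with a status query and a delay before it takes effect could be sketched like this. ReadOnlySwitch and its three status strings are hypothetical; the sketch assumes the delay comes from waiting for running write operations to finish, which the notes do not state explicitly:

```python
import threading
from contextlib import contextmanager


class ReadOnlySwitch:
    """Hypothetical gate in front of all write operations."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active_writes = 0
        self._status = "read-write"

    def status(self) -> str:
        # Applications can ask for the current mode at any time.
        with self._cond:
            return self._status

    @contextmanager
    def write(self):
        with self._cond:
            if self._status != "read-write":
                raise PermissionError("system is " + self._status)
            self._active_writes += 1
        try:
            yield
        finally:
            with self._cond:
                self._active_writes -= 1
                self._cond.notify_all()

    def switch_to_read_only(self) -> None:
        with self._cond:
            self._status = "switching"   # new writes are rejected from here on
            while self._active_writes:   # assumed source of the delay
                self._cond.wait()
            self._status = "read-only"
```

Logins and reads never pass through write(), so they stay available in every mode.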
System/Data migration
>>>>> Outcome
- FOXML
- Fedora-Rebuild
- DB-Migration
- Recache
- Re-index
- should be smarter and faster, e.g. start each step automatically after the previous one, reduce the need for some steps
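
Running the migration steps automatically one after the other could be as simple as the sketch below. run_migration is a hypothetical helper, and the step order is assumed to be the order in which the steps are listed above:

```python
from typing import Callable, List, Sequence, Tuple

Step = Tuple[str, Callable[[], None]]


def run_migration(steps: Sequence[Step]) -> List[str]:
    """Run the steps in order and stop at the first failure."""
    completed: List[str] = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            # Report which steps already ran so the operator knows where to resume.
            raise RuntimeError(f"step '{name}' failed after {completed}") from exc
        completed.append(name)
    return completed
```

Each action would wrap one of the existing tools (FOXML, Fedora-Rebuild, DB-Migration, Recache, Re-index); skipping steps that are not needed for a given upgrade would be a further refinement.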
eSciDoc Days: Developer Workshops
>>>>> Outcome
- Installation of eSciDoc should be done before the tracks start
- At least one server should be available for testing purposes
Fedora - datastream management
>>>>> Outcome
- All datastreams can be treated as internally or externally managed
- Will be used as best fits eSciDoc's needs
Wrap-up
