Difference between revisions of "Imeji Performance eSciDoc"

From MPDLMediaWiki
Line 23:
|-  
|'''eSciDoc ContentRelation''' ||-- || --||--|| CR is not under version control || Cannot be updated once it has been released (the documentation says the public-status of a CR must not be "released"). Thus, CRs are '''not feasible''' for this purpose || [http://jira.mpdl.mpg.de/browse/FACESBUG-434 Tested]
|-
|'''eSciDoc Item with 1000+ components''' ||?? ||0.9 sec ||--|| faster ingest compared to single item ingest || Retrieval time for an item with 1000 components: > 33 sec<br/> Initial file size: 0.6 MB (will increase with each version)<br/> Failed to ingest an item with 10000 components<br/> Initial file size: > 5 MB || [http://jira.mpdl.mpg.de/browse/FACESBUG-434 Tested]
|-  
|'''eSciDoc as archive, MD in Triple Store'''||-- || --||Very fast|| synchronization issues <br/> Possibly redundant data <br/> AA has to be implemented  || How do we perform status updates? (eSciDoc has to know the status, not only the triple store) <br/> This alternative may be acceptable in a decoupled scenario, e.g. ingests/updates are done directly on the triple store and stored with a delay in the eSciDoc core; in this case, AA must be taken seriously as well <br/> see also [[Image:Batch_metadata_update.pptx]] || [http://jira.mpdl.mpg.de/browse/FACESBUG-435 Tested]

Revision as of 07:15, 16 June 2010


This page compares the technology options for implementing Faces 4.0. To meet the requirements, the performance of each option is of primary interest.

{|
! Technology !! Time to update one item (*) !! Time to ingest one item (**) !! Pro !! Con !! Open Questions !! Status
|-
|'''eSciDoc Item Retrieval (SOAP)''' || 0.6 sec || 2.65 sec || Fast development, as already implemented in other solutions <br/> All eSciDoc services can be used (versioning, statistics, AA, etc.) || Very slow <br/> Extra release, PID assignment, etc. are necessary || -- || Tested
|-
|'''eSciDoc Item Retrieval (REST)''' || 0.5 sec || 2.2 sec || All eSciDoc services can be used (versioning, statistics, AA, etc.) <br/> Retrieve operation is faster (approx. half a second per item) || Slow <br/> Extra release, PID assignment, etc. are necessary || -- || Tested
|-
|'''eSciDoc IngestHandler''' || -- || 0.4 sec || -- || No PID assigned <br/> User needs special role: ingester <br/> Items seem not to be indexed: blocker! || -- || Tested
|-
|'''eSciDoc ContentRelation''' || -- || -- || -- || CR is not under version control || Cannot be updated once it has been released (the documentation says the public-status of a CR must not be "released"). Thus, CRs are '''not feasible''' for this purpose || [http://jira.mpdl.mpg.de/browse/FACESBUG-434 Tested]
|-
|'''eSciDoc Item with 1000+ components''' || ?? || 0.9 sec || -- || Faster ingest compared to single item ingest || Retrieval time for an item with 1000 components: > 33 sec <br/> Initial file size: 0.6 MB (will increase with each version) <br/> Failed to ingest an item with 10000 components <br/> Initial file size: > 5 MB || [http://jira.mpdl.mpg.de/browse/FACESBUG-434 Tested]
|-
|'''eSciDoc as archive, MD in Triple Store''' || -- || -- || Very fast || Synchronization issues <br/> Possibly redundant data <br/> AA has to be implemented || How do we perform status updates? (eSciDoc has to know the status, not only the triple store) <br/> This alternative may be acceptable in a decoupled scenario, e.g. ingests/updates are done directly on the triple store and stored with a delay in the eSciDoc core; in this case, AA must be taken seriously as well <br/> see also [[Image:Batch_metadata_update.pptx]] || [http://jira.mpdl.mpg.de/browse/FACESBUG-435 Tested]
|-
|'''eSciDoc Core Performance Tuning''' || -- || -- || All solutions could profit from this || Development has to be done together with FIZ, so that we do not develop our own eSciDoc which we have to adapt with every FW release <br/> Development process can be very long <br/> Code seems to be complex to understand || Would FIZ be willing to provide development resources to perform this task? || In consideration
|-
|'''No eSciDoc''' || -- || -- || Can be much faster || Services cannot be reused <br/> High development effort || What to use as storage? Fedora, DB? || Discarded
|}

(*) Update operation only

(**) Whole process from create to release, including any necessary retrieves, PID assignment, submit, etc.
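
To put the per-item ingest times into perspective, the following sketch extrapolates them to a batch of items. It is only a rough illustration: the three timings are copied from the table, while the function names, the batch size of 1000 and the assumption of strictly sequential processing with constant per-item cost are not part of the tests and are introduced here for the example. The "Item with 1000+ components" row is left out because its 0.9 sec figure is not directly comparable to a single-item ingest.

```python
# Per-item ingest times in seconds, copied from the comparison table.
INGEST_SECONDS_PER_ITEM = {
    "eSciDoc Item Retrieval (SOAP)": 2.65,
    "eSciDoc Item Retrieval (REST)": 2.2,
    "eSciDoc IngestHandler": 0.4,
}


def estimated_batch_minutes(seconds_per_item: float, item_count: int) -> float:
    """Linear extrapolation of a per-item time to a whole batch.

    Assumption (not measured): items are processed one after another and
    the per-item cost stays constant regardless of batch size.
    """
    return seconds_per_item * item_count / 60.0


def batch_estimates(item_count: int) -> dict:
    """Return the estimated batch duration in minutes for each technology."""
    return {
        name: round(estimated_batch_minutes(seconds, item_count), 1)
        for name, seconds in INGEST_SECONDS_PER_ITEM.items()
    }


if __name__ == "__main__":
    # 1000 is a hypothetical batch size chosen for illustration.
    for name, minutes in batch_estimates(1000).items():
        print(f"{name}: about {minutes} min for 1000 items")
```

Under these assumptions, 1000 items would take roughly 44 minutes via SOAP, 37 minutes via REST and 7 minutes via the IngestHandler, which illustrates why the per-item differences matter for batch ingests.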