Difference between revisions of "Imeji Performance eSciDoc"
Revision by Kleinfercher (minor edit: added results from rest testing)
Revision as of 07:09, 10 June 2010
eSciDoc ItemHandler (SOAP)
All metadata are stored in an eSciDoc item. The item is updated etc. via the eSciDoc item handler.
Pro: Fast development, as already implemented in other solutions
- All eSciDoc services can be used (versioning, statistics, AA etc.)
Con: Very slow (approx. 2.65 s for one item (update, submit, PID, release))
- Extra release, PID assignment etc. are necessary
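The per-item figure above covers four steps (update, submit, PID assignment, release). A minimal sketch of how the time could be broken down per step is shown here. The step functions are hypothetical placeholders, not eSciDoc API calls; they would have to be replaced by the real ItemHandler calls.

```python
import time
from typing import Callable, Dict


def time_steps(steps: Dict[str, Callable[[], None]]) -> Dict[str, float]:
    """Run each step once, in insertion order, and return elapsed seconds per step."""
    timings: Dict[str, float] = {}
    for name, step in steps.items():
        start = time.perf_counter()
        step()
        timings[name] = time.perf_counter() - start
    return timings


def _placeholder() -> None:
    """Hypothetical stand-in for one real call against eSciDoc."""
    pass


# Step names follow the wording of this page; the callables are assumptions.
steps = {
    "update": _placeholder,
    "submit": _placeholder,
    "assign_pid": _placeholder,
    "release": _placeholder,
}

timings = time_steps(steps)
total = sum(timings.values())
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
print(f"total: {total:.4f} s")
```

Measuring each step separately would show whether one step dominates the total or whether the cost is spread evenly.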
Open Questions:
eSciDoc Item retrieval via REST Interface
All metadata are stored in an eSciDoc item. The item is updated etc. via the eSciDoc REST interface.
Pro: All eSciDoc services can be used (versioning, statistics, AA etc.)
- Retrieve operation is faster (approx. half a second per item)
Con: Slow (approx. 2.2 s for one item (update, submit, PID, release))
- Extra release, PID assignment etc. are necessary
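To put the two measured per-item figures side by side, the following sketch extrapolates them to a batch. The values 2.65 s (ItemHandler) and 2.2 s (REST) are taken from this page; the batch size of 1000 items is only an assumed example, and the calculation assumes items are processed strictly one after another.

```python
# Measured seconds per item (update, submit, PID, release), as reported on this page.
SECONDS_PER_ITEM = {
    "ItemHandler (SOAP)": 2.65,
    "REST": 2.2,
}

# Hypothetical batch size, chosen only for illustration.
BATCH_SIZE = 1000


def batch_minutes(seconds_per_item: float, item_count: int) -> float:
    """Total minutes for item_count items processed sequentially."""
    return seconds_per_item * item_count / 60


for name, seconds in SECONDS_PER_ITEM.items():
    minutes = batch_minutes(seconds, BATCH_SIZE)
    print(f"{name}: {minutes:.1f} min for {BATCH_SIZE} items")
```

Under these assumptions the REST interface saves roughly 7.5 minutes per 1000 items, so both variants stay in the same order of magnitude.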
Open Questions:
eSciDoc IngestHandler
TODO
Open Questions:
eSciDoc ContentRelation
All metadata are stored in a content relation object, which is related to the item (image).
For a metadata change, only the content relation needs to be updated (no extra release, PID assignment etc.)
- Unfortunately, this is wrong: a content relation also has to be submitted and released. Additionally, it seems it can no longer be updated once it has been released (the documentation says the public-status of a CR must not be "released"). Thus, CRs are not feasible for this purpose. --MarkusH 14:05, 1 June 2010 (UTC)
- As submit and release are still necessary, there is not much difference to an item. And update is not possible, see comment above. --MarkusH 14:05, 1 June 2010 (UTC)
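The constraint described in the comments can be illustrated with a small state model. This is only a sketch under assumptions: it uses a simplified three-state lifecycle (pending, submitted, released), and the class and method names are illustrative, not part of eSciDoc.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    SUBMITTED = "submitted"
    RELEASED = "released"


class ContentRelationSketch:
    """Illustrative model only; not the eSciDoc implementation."""

    def __init__(self, metadata: dict) -> None:
        self.status = Status.PENDING
        self.metadata = dict(metadata)

    def update(self, metadata: dict) -> None:
        # The behaviour reported above: no update once released.
        if self.status is Status.RELEASED:
            raise ValueError("a released content relation cannot be updated")
        self.metadata = dict(metadata)

    def submit(self) -> None:
        if self.status is not Status.PENDING:
            raise ValueError("only a pending content relation can be submitted")
        self.status = Status.SUBMITTED

    def release(self) -> None:
        if self.status is not Status.SUBMITTED:
            raise ValueError("only a submitted content relation can be released")
        self.status = Status.RELEASED


cr = ContentRelationSketch({"title": "Sunset"})
cr.update({"title": "Sunset over the harbour"})  # allowed before release
cr.submit()
cr.release()

update_rejected = False
try:
    cr.update({"title": "Changed after release"})
except ValueError:
    update_rejected = True
print("update after release rejected:", update_rejected)
```

In this model, any metadata change after release would require a new object, which is why the approach offers no advantage over storing the metadata in the item itself.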
Open Questions:
- Is a content relation under version control?
- No, it is not under version control
- How does AA work for content relations?
- Same principle as for items, excluding versions
eSciDoc only as archive
The item (image) itself will be stored in eSciDoc, together with the technical metadata.
Open Questions:
- Is there a set of 'core' metadata we have to / should store with the item for LTA reasons?
- All metadata are subject to LTA; in addition, the PREMIS event history is tracked for items/containers
MD in Triple Store
All metadata are stored in a triple store, all operations (search, update etc.) can take place here.
The triple store would know the eSciDoc id of the image item, but the image item would not know its metadata in the triple store.
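The one-way reference described above can be sketched with a tiny in-memory triple store. This is an assumption-laden illustration, not a real triple store: the class, the identifier "escidoc:example-1" and the predicate names are all hypothetical.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TinyTripleStore:
    """In-memory stand-in for a triple store, for illustration only."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def replace(self, subject: str, predicate: str, obj: str) -> None:
        """Update: drop all existing values for (subject, predicate), then add the new one."""
        self._triples = {
            t for t in self._triples if not (t[0] == subject and t[1] == predicate)
        }
        self.add(subject, predicate, obj)

    def objects(self, subject: str, predicate: str) -> List[str]:
        return sorted(o for s, p, o in self._triples if s == subject and p == predicate)

    def subjects(self, predicate: str, obj: str) -> List[str]:
        """Search: all subjects that carry the given predicate/object pair."""
        return sorted(s for s, p, o in self._triples if p == predicate and o == obj)


# The eSciDoc id of the image item is the subject of every metadata triple.
ITEM_ID = "escidoc:example-1"  # hypothetical identifier

store = TinyTripleStore()
store.add(ITEM_ID, "dc:title", "Sunset")
store.add(ITEM_ID, "dc:creator", "Example Author")
store.replace(ITEM_ID, "dc:title", "Sunset over the harbour")

print(store.objects(ITEM_ID, "dc:title"))
print(store.subjects("dc:creator", "Example Author"))
```

The item id only appears inside the triples, so nothing on the eSciDoc side has to change when metadata is updated.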
Very fast (30,000 items in 2 seconds)
- items or triples?
- redundant data
- AA has to be implemented
Open Questions:
- How can we synchronize the two systems? (Do we have to synchronize them at all, or is it sufficient to store the MD only in the triple store?)
- How do we perform status updates? (eSciDoc has to know the status, not only the triple store)
- Maybe this alternative is acceptable in a decoupled scenario, e.g. ingest/updates are done directly on the triple store and are stored in the eSciDoc core with a delay. In this case, AA must be taken seriously as well.
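The decoupled scenario could look roughly like the following write-behind sketch: updates are applied to the fast store immediately and queued for a delayed write to the archive. Everything here is an assumption for illustration; the dictionary stands in for the triple store, and archive_writer is a hypothetical callback for whatever eSciDoc write mechanism would be used.

```python
from collections import deque
from typing import Callable, Deque, Dict


class WriteBehindSync:
    """Apply updates to a fast store at once; push them to the archive later."""

    def __init__(self, archive_writer: Callable[[str, dict], None]) -> None:
        self.metadata: Dict[str, dict] = {}  # stands in for the triple store
        self._pending: Deque[str] = deque()  # item ids waiting for archiving
        self._archive_writer = archive_writer

    def update(self, item_id: str, metadata: dict) -> None:
        self.metadata[item_id] = dict(metadata)
        if item_id not in self._pending:  # queue each item only once
            self._pending.append(item_id)

    def flush(self) -> int:
        """Write the latest state of every pending item; return how many were written."""
        written = 0
        while self._pending:
            item_id = self._pending[0]
            self._archive_writer(item_id, self.metadata[item_id])
            self._pending.popleft()  # dequeue only after a successful write
            written += 1
        return written


archived: Dict[str, dict] = {}


def fake_archive_writer(item_id: str, metadata: dict) -> None:
    """Hypothetical stand-in for the delayed write to eSciDoc."""
    archived[item_id] = dict(metadata)


sync = WriteBehindSync(fake_archive_writer)
sync.update("escidoc:example-1", {"title": "Sunset"})
sync.update("escidoc:example-1", {"title": "Sunset over the harbour"})
written = sync.flush()
print(written, archived)
```

Several quick edits to the same item collapse into one archive write, which is where the time saving would come from. The open questions about status updates and AA are not addressed by this sketch.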
eSciDoc Core Performance Tuning
Update the eSciDoc core so that retrieval of items is faster.
All solutions could profit from this
Development has to be done together with FIZ, so that we do not end up developing our own eSciDoc, which we would have to adapt with every FW release.
- Development process can be very long
- Code seems to be complex to understand
Open Questions:
- Would FIZ be willing to provide development resources to perform this task?
No eSciDoc
We do not use eSciDoc at all.
- High development effort
Open Questions:
- What to use as storage? Fedora, DB?