Talk:PubMan Func Spec Ingestion

From MPDLMediaWiki
Revision as of 12:39, 22 January 2008 by Robert (talk | contribs) (→‎Phase 2)

work in progress

Implementation approach:

Note: this approach currently assumes that no workflow engine will be implemented to support more complex ingestion tasks or processing of items.

Phase 1

  • provide multiple item submission (batch import) for local EndNote files and WoS records
  • eDoc format? eSciDoc XML?
  • ingestion is done by the depositor for any collection where he has depositing rights
  • simple workflow: submit and immediate release
  • no duplicate checking (?, maybe identification?)
  • supported EndNote versions: up to 6, 6.x
  • validation rules?
  • check PubMed as possible provider?
  • no mapping of customizable EndNote fields to eSciDoc is provided

Phase 2

Phase 3

duplicate identification, duplicate handling, workflow-based ingestion, incl. task manager and processing of ingested items

Comments on Functional specification

in progress!

UC_PM_ING_01 upload file in structured format

Status/Schedule

  • Status: in specification
  • Schedule: R3

Motivation

  • The user wants to upload a locally created EndNote file or a reference file from Web of Science, containing one or more references.

Expected outcome

References are batch uploaded to a collection on PubMan and are immediately released.

The items created on PubMan can be modified via the standard modification workflow.

Actors

  • Depositor

Pre-Condition

  • The user has depositor privileges for at least one collection in status "opened" in which multiple file upload is allowed.

Steps

  • 1. The user chooses to upload a file in structured format.
  • 2. The system offers a list of supported structured formats for upload. (No need for this step, as EndNote versions 1-7 are one structured format in the list and EndNote version 8 will be another. --Natasa 12:54, 22 January 2008 (CET))
    • 2.1. The user selects the structured format for data upload.
  • 3. The user uploads the file.
  • 4. The system displays a list of all collections in state "opened" for which the user has Depositor privileges and where the collection settings allow multiple upload of items (?).
    • 4.1. (Optionally) The user chooses to view the collection description.
      • 4.1.1. The system displays the collection description.
    • 4.2. The user selects a collection of her choice.
    • 4.3. (Optionally) The user decides to de-activate the validation rules for the upload (i.e. the validation rules defined for the validation point "submission" for the collection). [Nicole]: I would suggest having one collection per institute for ingests. This collection will have no validation point for submit, but one for modify item. [Natasa]: Agreed, see also the modification in the second step for this issue. I would remove this step: it makes no sense to let the user decide whether to deactivate validation rules, because we do not offer the user the possibility to edit or save the items in "pending" status. --Natasa 12:47, 22 January 2008 (CET)
  • 5. The user starts the ingestion.
  • 6. The system processes the uploaded file, checks for completeness, creates items and releases them immediately. The use case ends successfully. --Natasa 12:47, 22 January 2008 (CET)

Alternatives

4.a. If the user has a privilege to upload files in only one collection the collection is selected by the system automatically.
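As an illustration of step 4 and alternative 4.a, the collection filter and the automatic selection could be sketched as follows. This is only a sketch under assumptions: the `Collection` class, its field names and both function names are hypothetical and not part of PubMan or eSciDoc.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Collection:
    # Hypothetical stand-in for a PubMan collection; all field names are assumptions.
    name: str
    state: str                     # e.g. "opened" or "closed"
    allows_multiple_upload: bool   # collection setting for batch upload
    depositors: FrozenSet[str]     # ids of users holding the Depositor privilege


def eligible_collections(user: str, collections: List[Collection]) -> List[Collection]:
    """Step 4: collections the user may batch-upload into."""
    return [
        c for c in collections
        if c.state == "opened" and c.allows_multiple_upload and user in c.depositors
    ]


def preselected_collection(eligible: List[Collection]) -> Optional[Collection]:
    """Alternative 4.a: auto-select when exactly one collection qualifies."""
    return eligible[0] if len(eligible) == 1 else None
```

If `preselected_collection` returns `None`, the user interface would fall back to showing the list from step 4.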

5.a. The user cancels the ingestion.

5.a.1. The system removes the uploaded file and the use case ends without success.

6.a. The user gets an error message indicating the type of error (time out during upload, invalid file, validation rules not met).

6.a.1. The user tries the upload again. Continue with step 3.

6.a.2. The user cancels the upload procedure.
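The error alternative for step 6 could be modelled roughly as below. The three error kinds are taken from the text above; the enum, the message wording and the function names are assumptions made for illustration only.

```python
from enum import Enum
from typing import Optional


class UploadErrorKind(Enum):
    # The three error types named in the alternative flow for step 6.
    TIMEOUT = "time out during upload"
    INVALID_FILE = "invalid file"
    VALIDATION_FAILED = "validation rules not met"


def error_message(kind: UploadErrorKind) -> str:
    """Build the message shown to the user (wording is hypothetical)."""
    return f"Upload failed: {kind.value}."


def step_after_error(retry: bool) -> Optional[int]:
    """Return the step to resume at, or None if the use case ends."""
    # Retrying continues with step 3 (upload the file); cancelling ends the use case.
    return 3 if retry else None
```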

Data involved

EndNote files from EndNote versions 1.x to 7

EndNote files from EndNote version 8.x

Reference files from Web of Science

=> All files are in a structured format, delivered as .txt files.

Actors involved

user with depositing rights for at least one collection

Constraints

  • The encoding of files depends on the EndNote version:
    • 1.x to 7 support ASCII
    • 8.x supports UTF-8
  • The mapping to PubMan genres depends on the EndNote version (different mappings needed).
  • The file upload is only successful if all references have been uploaded. No partial upload is possible.
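The encoding and the all-or-nothing constraints could be sketched as follows. This is a minimal sketch, not a proposed implementation: the function names, the `IngestionAborted` exception and the `validate`/`create_item` callbacks are hypothetical, and the sketch assumes that validating every reference before creating any item is an acceptable way to rule out partial uploads.

```python
from typing import Callable, Iterable, List, Tuple


def encoding_for_endnote(major_version: int) -> str:
    """Pick the text encoding from the EndNote major version (per the constraint above)."""
    if 1 <= major_version <= 7:
        return "ascii"
    if major_version == 8:
        return "utf-8"
    raise ValueError(f"unsupported EndNote version: {major_version}")


class IngestionAborted(Exception):
    """Raised when at least one reference fails, so that nothing is created."""

    def __init__(self, problems: List[Tuple[int, str]]):
        super().__init__(f"{len(problems)} problem(s) found, nothing was ingested")
        self.problems = problems


def ingest_all_or_nothing(
    records: Iterable[dict],
    validate: Callable[[dict], List[str]],
    create_item: Callable[[dict], object],
) -> List[object]:
    """Validate every record first; create items only if all of them pass."""
    records = list(records)
    problems = [(i, msg) for i, rec in enumerate(records) for msg in validate(rec)]
    if problems:
        raise IngestionAborted(problems)
    # Assumption: item creation itself does not fail halfway; a real
    # implementation would also need a rollback for that case.
    return [create_item(rec) for rec in records]
```

The raw upload would be decoded with something like `raw_bytes.decode(encoding_for_endnote(7))`, which raises `UnicodeDecodeError` if a file declared as version 1.x to 7 contains non-ASCII bytes.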

Comments DEV Team

  • Natasa: I made some modifications, please check them; otherwise the use-case description itself is fine. The mappings from EndNote and WoS are still missing. --Natasa 13:15, 22 January 2008 (CET)

Comments on Abstract Prototype

In general, some more information about the options would be good, e.g. is the list of options final, or will it change/grow significantly in forthcoming releases?

A little more background on the options/criteria would help in deciding on suitable controls (probably not needed in this prototype, because it is pretty clear here),

e.g.

  • Is only one option possible/necessary, or more?
  • Estimates on the options would be good: one is important, one might be rarely used, one depends on ...
  • Is it mandatory to choose explicitly here, or is it rather optional (pass this with a good default)?

Rupert 17:35, 20 December 2007 (CET)