Talk:ESciDoc Item Container Version History

From MPDLMediaWiki
Revision as of 13:46, 8 March 2010 by Frank (talk | contribs) (→‎Identifier types)

General

In general, since eSciDoc records only PREMIS event information, it would be good to focus initially on the event entity of PREMIS v2 only. It is not clear whether further PREMIS data should be considered when exporting to LTA, but this would be a minimal start.


   *  2.1 eventIdentifier (M, NR)
         o 2.1.1 eventIdentifierType (M, NR) --Natasa 16:26, 22 December 2009 (UTC)no changes
         o 2.1.2 eventIdentifierValue (M, NR)--Natasa 16:26, 22 December 2009 (UTC)no changes
   * 2.2 eventType (M, NR) --Natasa 16:26, 22 December 2009 (UTC) change: values from a controlled vocabulary (to be developed)
   * 2.3 eventDateTime (M, NR) --Natasa 16:26, 22 December 2009 (UTC) no changes
   * 2.4 eventDetail (O, NR) --Natasa 16:26, 22 December 2009 (UTC) change: to be provided by the end user
                               with the operation (additionally for update, create)
   * 2.5 eventOutcomeInformation (O, R)
         o 2.5.1 eventOutcome (O, NR) --Natasa 16:26, 22 December 2009 (UTC) change: system-generated messages go here, e.g. itemhandler.update
         o 2.5.2 eventOutcomeDetail (O, R) 
               + 2.5.2.1 eventOutcomeDetailNote (O, NR) --Natasa 16:26, 22 December 2009 (UTC) more details,
                                                         e.g. itemhandler.addcontentrelations,
                                                          containerhandler.addcontentrelations (more details on the event
                                                          that actually happened within the update operation); valid for
                                                          the update operation
               + 2.5.2.2 eventOutcomeDetailExtension (O, R)
   * 2.6 linkingAgentIdentifier (O, R) 
        o 2.6.1  linkingAgentIdentifierType (M, NR) --Natasa 16:26, 22 December 2009 (UTC) proposal: use
                                                       "escidoc.user.id" or something better
        o 2.6.2  linkingAgentIdentifierValue (M, NR)--Natasa 16:26, 22 December 2009 (UTC)no change
        o 2.6.3  linkingAgentRole (O, R) --Natasa 16:26, 22 December 2009 (UTC) two possible values (user, system)
   * 2.7 linkingObjectIdentifier (O, R) --Natasa 16:26, 22 December 2009 (UTC) in the eSciDoc repository this information
                                         should be populated only in particular cases, for example:
                                         when a handle is assigned, it holds the handle; when an item relation is created
                                         in the item, it identifies the item that has been related; when a member is added
                                         to a container, it identifies the item added as a member. The object role is then
                                         "identifier", "member", etc.; for relations, I would suggest that the role is the
                                         relation predicate itself.
         o 2.7.1 linkingObjectIdentifierType (M, NR) --Natasa 16:26, 22 December 2009 (UTC) proposal: use
                                                              "escidoc.item.id", "escidoc.container.id" or something better
         o 2.7.2 linkingObjectIdentifierValue (M, NR)--Natasa 16:26, 22 December 2009 (UTC)no change
         o 2.7.3 linkingObjectRole (O, R) --Natasa 16:26, 22 December 2009 (UTC)see comment above
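
To make the outline above more concrete, here is a minimal sketch of how such an event record could be serialized. It is only an illustration under stated assumptions: the namespace is taken to be the one declared by the PREMIS 2.0 schema, `build_event` is a hypothetical helper, the identifier type "URL" and "escidoc.user.id" are the values proposed in this discussion, and the identifier values in the usage lines are made up.

```python
import xml.etree.ElementTree as ET

# Assumption: target namespace of the PREMIS 2.0 schema.
PREMIS_NS = "info:lc/xmlns/premis-v2"


def build_event(event_id, event_type, date_time, agent_id, agent_role="user"):
    """Build a minimal PREMIS v2 <event> element (hypothetical helper)."""

    def sub(parent, name, text=None):
        # Create a namespaced child element, optionally with text content.
        element = ET.SubElement(parent, f"{{{PREMIS_NS}}}{name}")
        if text is not None:
            element.text = text
        return element

    event = ET.Element(f"{{{PREMIS_NS}}}event")
    # 2.1 eventIdentifier (mandatory, not repeatable)
    identifier = sub(event, "eventIdentifier")
    sub(identifier, "eventIdentifierType", "URL")
    sub(identifier, "eventIdentifierValue", event_id)
    # 2.2 eventType and 2.3 eventDateTime (both mandatory)
    sub(event, "eventType", event_type)
    sub(event, "eventDateTime", date_time)
    # 2.6 linkingAgentIdentifier (optional); type string as proposed above
    agent = sub(event, "linkingAgentIdentifier")
    sub(agent, "linkingAgentIdentifierType", "escidoc.user.id")
    sub(agent, "linkingAgentIdentifierValue", agent_id)
    sub(agent, "linkingAgentRole", agent_role)
    return event


# Usage with made-up identifier values:
example = build_event(
    "/ir/item/example/version-history#v1",
    "update",
    "2009-11-04T12:30:40.064Z",
    "escidoc:exampleuser",
)
print(ET.tostring(example, encoding="unicode"))
```

The optional units 2.4, 2.5 and 2.7 are left out here to keep the sketch short; in the schema they would sit between eventDateTime and linkingAgentIdentifier (2.4, 2.5) and after it (2.7).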

Schema

http://www.loc.gov/standards/premis/v2/premis-v2-0.xsd

eSciDoc event types

Note: such event types can be used in both "eventType" and "eventOutcomeDetailNote"

Isn't the detail note more an explanation or further detailed information? (http://www.loc.gov/standards/premis/v2/premis-2-0.pdf, P. 141) I expected the comment to be placed therein. Frank 13:02, 8 March 2010 (UTC)

Identifier types

  • to be checked: what exactly the namespace (e.g. a controlled vocabulary) should look like (this may also be created in CONE)


The PREMIS Data Dictionary seems to mean a string that denotes the context from which the identifier comes, which primarily means the domain within which the identifier is unique. One example is just "local". But the identifier type should be as specific as possible; e.g. using "URL" is not recommended. Frank 13:16, 8 March 2010 (UTC)

The above example might indicate that there is an identifier of a user, item, etc. of the eSciDoc Infrastructure. On the one hand, that seems more specific, as it is an identifier in the eSciDoc Infrastructure; on the other hand, it is not very specific, because the identifier is not unique across the eSciDoc Infrastructure in general but only within one specific infrastructure. But the latter seems to apply to "local" too. Frank 13:16, 8 March 2010 (UTC)

See http://www.loc.gov/standards/premis/v2/premis-2-0.pdf#page=17, P. 10, 12. Frank 13:16, 8 March 2010 (UTC)

In another example "URI" -- which is less specific than URL -- is used as identifier type. (http://www.loc.gov/standards/premis/v2/premis-2-0.pdf#page=160, P. 153) Frank 13:41, 8 March 2010 (UTC)

Object Role

  • URIs can also be provided for object roles (as these already have property namespaces)

core-service 1.3 potential implementation

  • migrate version-history data
  • enable item.xml with operation comments by the user, related to changes proposed at unification of the presentation
    • alternative1: provide operation-comment as an optional attribute within the <item> element (no change to the external interfaces is needed in this case)
    • alternative2: provide operation-comment in <version><comment> element
    • other?

In any case, these operation-comments would have to be ignored when task-oriented methods are performed.

Does that mean, for the optional attribute within the <item> element, that if such an attribute appears in the task param of a task-oriented method it should be ignored, because there is already the possibility to send an operation-comment in a <comment> element? Frank 08:47, 21 January 2010 (UTC)
Alternatively, the functionality for providing the comment can be aligned for both resource-oriented and task-oriented methods!? Frank 08:47, 21 January 2010 (UTC)
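
The two alternatives can be compared with a small sketch. Everything here is hypothetical: the documents are simplified (no eSciDoc namespaces), the attribute name `operation-comment` and the `version/comment` path simply follow the wording of the alternatives above, and `extract_operation_comment` is not an existing core-service function.

```python
import xml.etree.ElementTree as ET


def extract_operation_comment(item_xml, task_oriented=False):
    """Return the operation comment from a simplified item document, or None."""
    if task_oriented:
        # As noted above, operation comments are ignored for task-oriented methods.
        return None
    item = ET.fromstring(item_xml)
    # Alternative 1: optional attribute on the <item> element.
    comment = item.get("operation-comment")
    if comment is not None:
        return comment
    # Alternative 2: <version><comment> element inside the item.
    return item.findtext("version/comment")


# Usage with made-up documents:
print(extract_operation_comment('<item operation-comment="fixed title"/>'))
print(extract_operation_comment(
    "<item><version><comment>fixed title</comment></version></item>"
))
```

If the functionality were aligned for both kinds of methods, the `task_oriented` branch would simply disappear.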

PREMIS data dictionary

Semantic unit: 2.1 eventIdentifier
Semantic components: 2.1.1 eventIdentifierType, 2.1.2 eventIdentifierValue
Definition: A designation used to uniquely identify the event within the preservation repository system.
Rationale: Each event recorded by the preservation archive must have a unique identifier to allow it to be related to objects, agents, and other events.
Data constraint: Container
Repeatability: Not repeatable
Obligation: Mandatory
Creation / Maintenance notes: The eventIdentifier is likely to be system generated. There is no global scheme or standard for these identifiers. The identifier is therefore not repeatable.
eSciDoc: Each event recorded by the preservation archive must have a unique identifier to allow it to be related to objects, agents, and other events.


Semantic unit: 2.1.1 eventIdentifierType
Semantic components: None
Definition: A designation of the domain within which the event identifier is unique.
Data constraint: None
Examples:
  FDA
  Stanford Repository Event ID
  UUID
Repeatability: Not repeatable
Obligation: Mandatory
Creation / Maintenance notes: For most preservation repositories, the eventIdentifierType will be its own internal numbering system. It can be implicit within the system and provided explicitly only if the data is exported.
eSciDoc: URL


Semantic unit: 2.1.2 eventIdentifierValue
Semantic components: None
Definition: The value of the 2.1 eventIdentifier
Data constraint: None
Examples:
  [a binary integer]
  E-2004-11-13-000119
  58f202ac-22cf-11d1-b12d-002035b29092
Repeatability: Not repeatable
Obligation: Mandatory
eSciDoc: Value of the URL
--Makarenko 14:19, 5 January 2010 (UTC): For the moment values like
  /ir/container//version-history#v40e1257337841231
  /ir/container/escidoc:29780:39version-history#v39e1257337708130
To be consolidated.


Semantic unit: 2.2 eventType
Semantic components: None
Definition: A categorization of the nature of the event.
Rationale: Categorizing events will aid the preservation repository in machine processing of event information, particularly in reporting.
Data constraint: Value should be taken from a controlled vocabulary.
Examples:
  E77 [a code used within a repository for a particular event type]
  Ingest
Repeatability: Not repeatable
Obligation: Mandatory
Usage Notes: Each repository should define its own controlled vocabulary of eventType values. A suggested starter list for consideration (see also the Glossary for more detailed definitions):
  capture = the process whereby a repository actively obtains an object
  compression = the process of coding data to save storage space or transmission time
  creation = the act of creating a new object
  deaccession = the process of removing an object from the inventory of a repository
  decompression = the process of reversing the effects of compression
  decryption = the process of converting encrypted data to plaintext
  deletion = the process of removing an object from repository storage
  digital signature validation = the process of determining that a decrypted digital signature matches an expected value
  dissemination = the process of retrieving an object from repository storage and making it available to users
  fixity check = the process of verifying that an object has not been changed in a given period
  ingestion = the process of adding objects to a preservation repository
  message digest calculation = the process by which a message digest (“hash”) is created
  migration = a transformation of an object creating a version in a more contemporary format.
  normalization = a transformation of an object creating a version more conducive to preservation
  replication = the process of creating a copy of an object that is, bitwise, identical to the original
  validation = the process of comparing an object with a standard and noting compliance or exceptions
  virus check = the process of scanning a file for malicious programs.
Note that migration, normalization, and replication are more precise subtypes of the creation event. “Creation” can be used when more precise terms do not apply, for example, when a digital object was first created by scanning from paper.
In general, the level of specificity in recording the type of event (e.g., whether the eventType indicates a transformation, a migration or a particular method of migration) is implementation specific and will depend upon how reporting and processing is done. Recommended practice is to record detailed information about the event itself in eventDetail rather than using a very granular value for eventType.
eSciDoc: --Makarenko 14:43, 5 January 2010 (UTC) For the moment the following values are presented:
  create
  update
  submitted
  assignVersionPid
  released
  ...

Should be replaced with the list of eventType values from a controlled vocabulary.
NS suggestion: http://purl.org/escidoc/versionhistory/ves/0.1/
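
A possible reading of the namespace suggestion is that each eventType value becomes a term under that namespace. The sketch below assumes exactly that; the URI pattern (namespace plus value) and the helper `event_type_uri` are assumptions, and the vocabulary only contains the values listed above, which are explicitly incomplete.

```python
# Namespace suggested above; appending the value to it is an assumption.
EVENT_TYPE_NS = "http://purl.org/escidoc/versionhistory/ves/0.1/"

# Values currently presented by eSciDoc, as listed above (not a complete list).
CURRENT_EVENT_TYPES = {
    "create",
    "update",
    "submitted",
    "assignVersionPid",
    "released",
}


def event_type_uri(value):
    """Return the namespaced term for an eventType value, or raise ValueError."""
    if value not in CURRENT_EVENT_TYPES:
        raise ValueError(f"eventType not in controlled vocabulary: {value!r}")
    return EVENT_TYPE_NS + value


# Usage:
print(event_type_uri("released"))
```

Rejecting unknown values early is one way to enforce the data constraint that eventType should come from a controlled vocabulary.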


Semantic unit: 2.3 eventDateTime
Semantic components: None
Definition: The single date and time, or date and time range, at or during which the event occurred.
Data constraint: To aid machine processing, value should use a structured form. To facilitate exchange of PREMIS-conformant metadata, use of standard conventions, for instance as used in the date elements in the PREMIS schema, is recommended.
Examples:
  20050704T071530-0500 [July 4, 2005 at 7:15:30 a.m. EST]
  2006-07-16T19:20:30+01:00
  20050705T0715-0500/20050705T0720-0500 [from 7:15 a.m. EST to 7:20 a.m. EST on July 4, 2005]
  2004-03-17 [March 17, 2004, only the date is known]
Repeatability: Not repeatable
Obligation: Mandatory
Usage Notes: Recommended practice is to record the most specific time possible and to designate the time zone.
eSciDoc: Example:
  2009-11-04T12:30:40.064Z
  Controlled by PREMIS type edtfSimpleType
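
For consumers of the version history, the eSciDoc timestamp form shown above can be read with the Python standard library. This is only a sketch: `parse_event_datetime` is a hypothetical helper, and it covers the extended format with an explicit offset or a trailing "Z" only, not the basic-format or range examples from the data dictionary.

```python
from datetime import datetime, timedelta


def parse_event_datetime(value):
    """Parse an extended-format timestamp such as 2009-11-04T12:30:40.064Z."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat(),
    # so it is rewritten as an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


# Usage with the eSciDoc example value:
parsed = parse_event_datetime("2009-11-04T12:30:40.064Z")
print(parsed.isoformat())
print(parsed.utcoffset() == timedelta(0))
```

Keeping the time zone in the parsed value follows the usage note that the time zone should be designated.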


Semantic unit: 2.4 eventDetail
Semantic components: None
Definition: Additional information about the event.
Data constraint: None
Examples:
  Object permanently withdrawn by request of Caroline Hunt.
  Program=“MIGJP2JP2K”; version=“2.2”
Repeatability: Not repeatable
Obligation: Optional
Usage Notes: eventDetail is not intended to be processed by machine. It may record any information about an event and/or point to information stored elsewhere.
eSciDoc: Currently populated by machine. Examples:
  Object escidoc:29780 created.
  Container.addContentRelations
  ContainerHandler.release()
  objectPid assigned
  ContainerHandler.update()
  no comment
  ...
--Natasa 16:26, 22 December 2009 (UTC) change: to be provided by the end user with the operation (additionally for update, create)
--Makarenko 16:17, 5 January 2010 (UTC) My suggestion: eventDetail can be edited only by the user; machine-generated messages are written into eventOutcome and eventOutcomeDetail
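
The split suggested in the two comments above could look roughly like this. It is a sketch only: `split_event_texts` is a hypothetical helper, the nested dictionary merely mirrors the PREMIS unit names, and the user comment in the usage lines is made up.

```python
def split_event_texts(user_comment=None, system_message=None, detail_note=None):
    """Place user text and machine text into separate PREMIS semantic units."""
    event = {}
    if user_comment:
        # 2.4 eventDetail: only text entered by the end user.
        event["eventDetail"] = user_comment
    outcome = {}
    if system_message:
        # 2.5.1 eventOutcome: system-generated message.
        outcome["eventOutcome"] = system_message
    if detail_note:
        # 2.5.2.1 eventOutcomeDetailNote: finer-grained machine detail.
        outcome["eventOutcomeDetail"] = {"eventOutcomeDetailNote": detail_note}
    if outcome:
        event["eventOutcomeInformation"] = outcome
    return event


# Usage with a made-up user comment and messages of the kind listed above:
print(split_event_texts(
    user_comment="Corrected the title",
    system_message="ContainerHandler.update()",
    detail_note="Container.addContentRelations",
))
```

With this split, eventDetail stays empty unless the user actually supplies a comment, instead of carrying placeholder text such as "no comment".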