ESciDoc Metadata Records Manipulation

From MPDLMediaWiki

Introduction

This discussion concerns the problem that retrieval methods deliver all metadata records of a resource.

For example:

  • An Item resource has several metadata records (one of them marked as the default, i.e. labeled "ESciDoc (Enhanced Scientific Documentation)")
    • the retrieveItem method delivers all metadata records of this item within the Item.xml
  • An Item resource has one or more metadata records (one of them marked as the default, i.e. labeled "ESciDoc (Enhanced Scientific Documentation)") and several associated components. Each component has at least one descriptive metadata record and one technical metadata record (generated by the JHove service)
    • the retrieveItem method delivers, within the Item.xml, all metadata records of this item, information on all components of this item, and all metadata records of all components

Proposal

  • Item.xml (Container.xml) should fully embed only the default metadata records of the item (and possibly of its components) or container
  • Item.xml should provide references, i.e. IDs/links, to the other metadata records (format to be defined)
  • Resource handlers should provide a method for retrieving a particular metadata record, i.e. retrieveMetadataRecord(MdRecordID), or retrieveMetadataRecords(ResourceID), which returns a list of metadata records (as references in the same manner as in Item.xml, not the full metadata records)
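
The proposal above can be illustrated with a minimal in-memory sketch. Everything in it is an assumption made for illustration: the class names, the dictionary layout standing in for Item.xml, and the Python spellings of retrieveMetadataRecord and retrieveMetadataRecords. The reference format is still to be defined, so the `id`/`name` pairs are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class MdRecord:
    """One metadata record; all field names are hypothetical."""
    md_record_id: str
    name: str
    content: str
    is_default: bool = False


@dataclass
class Item:
    item_id: str
    md_records: list = field(default_factory=list)


class ItemHandler:
    """Hypothetical in-memory stand-in for a resource handler."""

    def __init__(self, items):
        self._items = {item.item_id: item for item in items}
        # Assumes metadata record IDs are unique across all items.
        self._records = {
            r.md_record_id: r for item in items for r in item.md_records
        }

    def retrieve_item(self, item_id):
        """Embed default records in full; list the others as references."""
        item = self._items[item_id]
        return {
            "id": item.item_id,
            "md_records": [r.content for r in item.md_records if r.is_default],
            "md_record_refs": [
                {"id": r.md_record_id, "name": r.name}
                for r in item.md_records
                if not r.is_default
            ],
        }

    def retrieve_metadata_records(self, item_id):
        """List all records of the item as references, without content."""
        return [
            {"id": r.md_record_id, "name": r.name, "default": r.is_default}
            for r in self._items[item_id].md_records
        ]

    def retrieve_metadata_record(self, md_record_id):
        """Return one full record; this is the client's extra request."""
        return self._records[md_record_id].content
```

In this sketch a client that only needs the default record calls `retrieve_item` once; a client that needs a technical record follows one of the entries in `md_record_refs` and issues a second call to `retrieve_metadata_record`.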

Advantages

  • defining default metadata records is a conscious decision by users (most of the time they will not need the other metadata records, except on special occasions)
  • in most use cases the other metadata records are the original records taken over during, e.g., ingestion of external objects into the system
  • delivery of technical metadata records for components, in particular, is not needed by default
  • less network traffic (especially in item lists)
  • faster (smaller) cache (only default metadata records are cached)
  • only properties/default metadata records are filterable
  • users/clients are spared additional processing of the XML in situations such as export
  • client code and interfaces can be more stable, as adding new metadata records would not affect the client code
  • a separate retrieve<Resource>Complete method can still deliver the complete metadata records (if users need them, e.g. for posting data to an LTA (Long-term Archiving) interface)

Disadvantages

  • not many seen, except if users would like to filter by non-default metadata records (I would assume this is not a primary use case, and it is not a blocking factor either, since filtering on non-default metadata records can be enabled on request in the future)
  • an extra request to the core services must be issued in order to retrieve a non-default metadata record
  • see the retrieve<Resource>Complete method proposed above

Searching

  • Searching can be made configurable
    • by default: only default metadata records are indexed for searching
    • if the user explicitly asks: other metadata records can be indexed for searching, but in a separate index database
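
The configurable indexing described above could look roughly like the following sketch. The function name, the `index_non_default` switch, and the tuple layout of the input are all assumptions for illustration; a plain dictionary of terms stands in for a real index database.

```python
def build_indexes(records, index_non_default=False):
    """Build a default index and, on request, a separate non-default index.

    `records` is an iterable of (record_id, is_default, text) tuples.
    Each index maps a lowercased term to the set of record IDs containing it.
    """
    default_index = {}
    extra_index = {}
    for record_id, is_default, text in records:
        if is_default:
            target = default_index
        elif index_non_default:
            target = extra_index  # kept apart from the default index
        else:
            continue  # non-default records are skipped unless requested
        for term in text.lower().split():
            target.setdefault(term, set()).add(record_id)
    return default_index, extra_index
```

Calling `build_indexes(records)` fills only the default index; passing `index_non_default=True` additionally fills the second index, so searches over default metadata records are unaffected by the extra records.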