Difference between revisions of "MD Store"

From MPDLMediaWiki
 
(36 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{ESciDoc Solutions}}
= Introduction =
= Introduction =


The MD Store is a triple store for medatada storage and management.  
The MD Store is a triple store for metadata storage and management.  
The MD Store manages resources which are items or containers in an eSciDoc Repository. The MDStore is designed to be used independently from eSciDoc Repository (in case such a need arises) or in relation with another, non-eSciDoc repository.
The MD Store manages resources which are items or containers in an eSciDoc Repository. The MDStore is designed to be used independently from eSciDoc Repository (in case such a need arises) or in relation with another, non-eSciDoc repository.
The following scenarios are covered with the MD Store:
The following scenarios are covered with the MD Store:
Line 7: Line 9:
*batch metadata update
*batch metadata update
*linked data publishing
*linked data publishing
*see also [[ESciDoc_Services_Annotator|Annotator service]]


= Technology =
= Technology =
* The triple store technology is Jena SDB (TDB doesn't support transactions)
* The triple store technology is planned to be Jena SDB (TDB doesn't support transactions)
**note: current imeji demonstrator implements Jena TDB
* be aware of [http://jena.sourceforge.net/how-to/concurrency.html Concurrency issues in Jena]


= Data model =
= Data model =
This data model refers only to the MDStore internal data model. As the MDStore will be used in Faces, more information for Faces data model can be found [[Faces_Data_Model|here]].


Draft visualization of how resources would be stored in the MDStore is given in the mage below:<br/>
Draft visualization of how resources would be stored in the MDStore is given in the mage below:<br/>
Line 17: Line 24:
[[Image:MDStoreDataModel.png|700x700px|Center|Draft MDstore data model visualization]]
[[Image:MDStoreDataModel.png|700x700px|Center|Draft MDstore data model visualization]]


* The MD Store defines 2 rdf graphs:
** Metadata graphs, where metadata triples are stored
** Property graphs with following properties:
***context-id
***public-status
***lock-status
***content-model-id
***last-modification-date
***created-by
***modified-by
***version-status
***collection-id


==Mapping to eSciDoc resources==
==Mapping to eSciDoc resources==
Line 22: Line 43:
* '''Metadata Record'''  is the full content of a metadata record that is stored externally from eSciDoc Items and Containers. The eSciDoc items/containers would only reference the metadata record stored in the MDStore. For this purpose for each metadata record of an Item or a Container there is a special (RDF/XML) metadata profile that links further to the metadata record stored in the MDStore.
* '''Metadata Record'''  is the full content of a metadata record that is stored externally from eSciDoc Items and Containers. The eSciDoc items/containers would only reference the metadata record stored in the MDStore. For this purpose for each metadata record of an Item or a Container there is a special (RDF/XML) metadata profile that links further to the metadata record stored in the MDStore.


= Interface =
**Note: the names of the properties are same as in eSciDoc core. However, in case when eSciDoc Core is not used, these may be set-up by the external system
 
* '''Searching from eSciDoc''' - the eSciDoc indexing component is extended with the possibility to fetch external XML content and index it
 
= Interfaces =
 
For now we assume that the MD Store will implement 3 interfaces:
* MDStore Basic interface - to manage the resource updates and synchronize with metadata from the origin repository
* MDStore search interface - SPARQL endpoint
* MDStore linked data publishing interface


For now we assume that the MD Store will implement 2 interfaces: one to manage the metadata updates and synchronize with metadata from the origin repository, and another to publish linked data.
During implementation this may possibly change, e.g. make linked data interface as a separate service. At present we see no need for it.
During implementation this may possibly change, e.g. make linked data interface as a separate service. At present we see no need for it.


==MDStore basic interface==
==MDStore basic interface==
The MD Store implements a REST interface. The REST interface methods can be grouped as:
The MD Store implements a REST interface. The REST interface methods are:
 
*CRUD-based methods
*Task-oriented methods


*CRUD-based methods
** GET: retrieves the requested resource
** GET: retrieves the requested resource
** POST: create a new resource
** POST: create a new resource
Line 39: Line 64:
** DELETE: Deletes a resource
** DELETE: Deletes a resource


*Task-oriented methods (to be discussed)
==MDStore search interface==
**findDifference  - to check the difference between the properties graph in MD-Store and origin repository ?
*is the SPARQL endpoint coming from [http://joseki.sourceforge.net/ Jena Joseki]
**synchronize - to update the properties graph from the origin (input: single resource), shall it also try to e.g. delete a metadata graph in case when this is removed from the repository origin ?
==MDStore linked data publishing interface==
**CQL-enabled (SPARQL enabled) - search ?
*TBD
 


==URL definition==
==URL definition==
Line 56: Line 80:
     http://coreservice.mpdl.mpg.de/md-store
     http://coreservice.mpdl.mpg.de/md-store


===Interface Methods URL===
 
*Note:  at present MDStore requires Java 1.6 which is not available at dev-coreservice-url
 
===MDStore Basic Interface Methods URL===


*For retrieval of the complete resource from the MD store (properties + all metadata record graphs)
*For retrieval of the complete resource from the MD store (properties + all metadata record graphs)
Line 92: Line 119:
       <base-url>/escidoc:1234/properties
       <base-url>/escidoc:1234/properties


==MDStore linked data interface==
= Architecture =
*TBD
 
 
* eSciDoc: The MD Store uses eSciDoc security management (AA)
** It should be generic to enable other services to log in into MDstore


= Data Model=
*see [[MD_Store/Architecture|MDStore Architecture]]
* The MD Store defines 2 rdf graphs:
** Metadata graphs, where metadada triples are stored
** Property graphs with following properties:
***context-id
***public-status
***lock-status
***content-model-id
***last-modification-date
***created-by
***modified-by
***version-status
**Note: the names of the properties are same as in eSciDoc core. However, in case when eSciDoc Core is not used, these may be set-up by the external system


=Questions=
=Questions=
Line 122: Line 140:
**MDStpore would contain additional descriptive metadata of a CoNE Resource (provided by the end user) and a link to CoNE Resource?
**MDStpore would contain additional descriptive metadata of a CoNE Resource (provided by the end user) and a link to CoNE Resource?


= Architecture =
[[Category:MD Store]]
 
[[Category:ESciDoc]]
 
* eSciDoc: The MD Store uses eSciDoc security management (AA)
** It should be generic to enable other services to log in into MDstore
 
 
[[Category:Faces_4.0]]

Latest revision as of 10:14, 19 August 2013



Introduction

The MD Store is a triple store for metadata storage and management. It manages resources, which are items or containers in an eSciDoc repository. The MD Store is designed so that it can also be used independently of an eSciDoc repository (should the need arise) or together with another, non-eSciDoc repository. The scenarios covered by the MD Store include:

  • batch metadata update
  • linked data publishing
  • see also Annotator service

Technology

  • The planned triple store technology is Jena SDB (Jena TDB does not support transactions)
    • Note: the current imeji demonstrator uses Jena TDB
  • Be aware of concurrency issues in Jena

Data model

This data model refers only to the internal data model of the MDStore. As the MDStore will be used in Faces, more information on the Faces data model can be found on the Faces Data Model page.


A draft visualization of how resources would be stored in the MDStore is given in the image below:

(Image: Draft MDStore data model visualization)


  • The MD Store defines two kinds of RDF graphs:
    • Metadata graphs, where metadata triples are stored
    • Property graphs, with the following properties:
      • context-id
      • public-status
      • lock-status
      • content-model-id
      • last-modification-date
      • created-by
      • modified-by
      • version-status
      • collection-id
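
As a rough illustration of the property graph, the sketch below represents it as a list of (subject, predicate, object) triples in plain Python. The property names are the ones listed above; the namespace, the example values, and the helper name `property_graph` are assumptions made for this example and are not part of the MD Store design.

```python
from typing import Dict, List, Tuple

# Property names as listed in the data model above.
PROPERTY_NAMES = (
    "context-id", "public-status", "lock-status", "content-model-id",
    "last-modification-date", "created-by", "modified-by",
    "version-status", "collection-id",
)

# Hypothetical namespace, used only to turn property names into predicate IRIs.
HYPOTHETICAL_NS = "http://example.org/md-store/properties/"

Triple = Tuple[str, str, str]


def property_graph(resource_id: str, values: Dict[str, str]) -> List[Triple]:
    """Build the property triples of one resource, rejecting unknown properties."""
    unknown = set(values) - set(PROPERTY_NAMES)
    if unknown:
        raise ValueError(f"unknown properties: {sorted(unknown)}")
    # Keep the order of PROPERTY_NAMES so the output is deterministic.
    return [
        (resource_id, HYPOTHETICAL_NS + name, values[name])
        for name in PROPERTY_NAMES
        if name in values
    ]


# Example values are illustrative only.
triples = property_graph(
    "escidoc:1234",
    {"public-status": "released", "context-id": "escidoc:5678"},
)
for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```

A real implementation would store these triples in the chosen Jena store; the list of tuples only shows the shape of the data.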

Mapping to eSciDoc resources

  • A Resource can be any eSciDoc resource of type Item or Container.
  • A Metadata Record is the full content of a metadata record that is stored outside eSciDoc Items and Containers. The eSciDoc Items/Containers only reference the metadata record stored in the MDStore. For this purpose, each metadata record of an Item or a Container has a special (RDF/XML) metadata profile that links to the metadata record stored in the MDStore.
    • Note: the names of the properties are the same as in eSciDoc Core. When eSciDoc Core is not used, they may be set up by the external system.
  • Searching from eSciDoc - the eSciDoc indexing component is extended so that it can fetch external XML content and index it

Interfaces

For now we assume that the MD Store will implement three interfaces:

  • MDStore basic interface - manages resource updates and synchronizes with metadata from the origin repository
  • MDStore search interface - SPARQL endpoint
  • MDStore linked data publishing interface

This may change during implementation, e.g. the linked data interface could become a separate service. At present we see no need for that.

MDStore basic interface

The MD Store implements a REST interface. The REST interface methods are:

    • GET: retrieves the requested resource
    • POST: creates a new resource
    • PUT: updates a resource
    • DELETE: deletes a resource
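
The four methods could be exercised from a client roughly as sketched below. This is a minimal sketch that only constructs the HTTP requests and does not send them. The RDF/XML media type, the rule that POST and PUT carry a body, and the helper name `build_request` are assumptions for illustration; the page does not specify them.

```python
from typing import Optional
from urllib.request import Request

# Assumption: resources are exchanged as RDF/XML.
RDF_XML = "application/rdf+xml"

METHODS_WITH_BODY = ("POST", "PUT")
METHODS_WITHOUT_BODY = ("GET", "DELETE")


def build_request(method: str, url: str, body: Optional[bytes] = None) -> Request:
    """Build (but do not send) a request for one of the four REST methods."""
    if method in METHODS_WITH_BODY:
        if body is None:
            raise ValueError(f"{method} needs a resource representation")
    elif method in METHODS_WITHOUT_BODY:
        if body is not None:
            raise ValueError(f"{method} must not carry a body")
    else:
        raise ValueError(f"unsupported method: {method}")

    headers = {"Accept": RDF_XML}
    if body is not None:
        headers["Content-Type"] = RDF_XML
    return Request(url, data=body, headers=headers, method=method)


# Illustrative resource URL; the real URL layout is defined by the service.
url = "http://coreservice.mpdl.mpg.de/md-store/escidoc:1234"
get_req = build_request("GET", url)
put_req = build_request("PUT", url, b"<rdf:RDF/>")
print(get_req.get_method(), get_req.full_url)
print(put_req.get_method(), put_req.full_url)
```

Passing such a request to `urllib.request.urlopen` would perform the call; authentication is left out of the sketch.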

MDStore search interface

  • The search interface is the SPARQL endpoint provided by Jena Joseki
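
A query against a SPARQL endpoint is normally sent as the `query` parameter of an HTTP GET request, as defined by the SPARQL protocol. The sketch below builds such a URL. The endpoint path and the resource IRI are hypothetical, since the page does not define them.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical endpoint location; the page does not state the real path.
HYPOTHETICAL_ENDPOINT = "http://coreservice.mpdl.mpg.de/md-store/sparql"

# Hypothetical resource IRI; lists up to ten predicate/object pairs of it.
QUERY = (
    "SELECT ?p ?o "
    "WHERE { <http://example.org/resource/escidoc:1234> ?p ?o } "
    "LIMIT 10"
)


def sparql_query_url(endpoint: str, query: str) -> str:
    """Encode a SPARQL query as a GET URL using the standard 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


query_url = sparql_query_url(HYPOTHETICAL_ENDPOINT, QUERY)
print(query_url)

# Decoding the URL again gives back the original query text.
decoded = parse_qs(urlparse(query_url).query)["query"][0]
print(decoded == QUERY)
```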

MDStore linked data publishing interface

  • TBD

URL definition

Base URL

  • The base URL is the URL of the core-service instance with the path of the new service appended:
    <core-service-url>/md-store 

e.g.

    http://coreservice.mpdl.mpg.de/md-store


  • Note: at present the MDStore requires Java 1.6, which is not available at dev-coreservice-url

MDStore Basic Interface Methods URL

  • To retrieve the complete resource from the MD Store (properties + all metadata record graphs):
    <base-url>/<resource-id>
    e.g. <base-url>/escidoc:1234

  • To retrieve all metadata record graphs (note: several metadata records can be managed in the MD Store for one resource):
    <base-url>/<resource-id>/md-records
    e.g. <base-url>/escidoc:1234/md-records

  • To retrieve a single metadata record graph:
    <base-url>/<resource-id>/md-records/md-record/<md-record-id>
    e.g. <base-url>/escidoc:1234/md-records/md-record/escidoc

  • To retrieve the properties graph:
    <base-url>/<resource-id>/properties
    e.g. <base-url>/escidoc:1234/properties
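
The four URL patterns above can be captured in a few small helper functions. This is a sketch only: the function names are made up for this example, and keeping the colon of identifiers such as `escidoc:1234` unescaped is an assumption based on the examples shown.

```python
from urllib.parse import quote

# Example base URL as given in the "Base URL" section.
BASE_URL = "http://coreservice.mpdl.mpg.de/md-store"


def _segment(value: str) -> str:
    """Escape one path segment, leaving ':' as in the examples (escidoc:1234)."""
    return quote(value, safe=":")


def resource_url(base: str, resource_id: str) -> str:
    """Complete resource: properties plus all metadata record graphs."""
    return f"{base}/{_segment(resource_id)}"


def md_records_url(base: str, resource_id: str) -> str:
    """All metadata record graphs of a resource."""
    return f"{resource_url(base, resource_id)}/md-records"


def md_record_url(base: str, resource_id: str, md_record_id: str) -> str:
    """A single metadata record graph."""
    return f"{md_records_url(base, resource_id)}/md-record/{_segment(md_record_id)}"


def properties_url(base: str, resource_id: str) -> str:
    """The properties graph of a resource."""
    return f"{resource_url(base, resource_id)}/properties"


print(resource_url(BASE_URL, "escidoc:1234"))
print(md_records_url(BASE_URL, "escidoc:1234"))
print(md_record_url(BASE_URL, "escidoc:1234", "escidoc"))
print(properties_url(BASE_URL, "escidoc:1234"))
```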

Architecture

  • eSciDoc: The MD Store uses eSciDoc security management (AA)
    • It should be generic enough to let other services log in to the MDStore

Questions

  • Shall the interface methods understand the resource version-id?
  • Three alternatives could be implemented:
    • content-stream
      • is indeed for binary content only (together with the item object, must be base-64 encoded; most probably not applicable)
    • md-record - good, as in this case we could deal with items, containers ...; although originally considered cumbersome, it may be a good approach, e.g.
      • create new MD-Profile (Externally-reference
    • externally-referenced component
  • The model itself: when things come from CoNE, do we treat them as CoNE resources or not?
    • The MDStore would not contain any additional descriptive metadata from CoNE, only a link to the CoNE resource? (cleaner)
      • CoNE changes?
    • The MDStore would contain additional descriptive metadata of a CoNE resource (provided by the end user) and a link to the CoNE resource?