ESciDoc Content Relations Concept

Revision as of 12:25, 28 September 2009 by ROF

Concept: Content Relation Linking and Tagging with eSciDoc

Requirements (see also ESciDoc_Content_Relations)

It should be possible to link two resources, and the link itself has to carry metadata. Furthermore, a tagging mechanism is required to add tags to an existing resource. It is important that this functionality is not restricted by the owner/modifier rules of the related resources (with the exception of the relation resource itself). Relations can be set to whole resources and to specific versions.

Functional Overview

Linking and tagging within eSciDoc Core can be provided by Content Relations. An eSciDoc Content Relation is a resource that creates a relation between two resources. The relation between the resources is expressed through source and target elements.

The types of resources that can be set into relation are not restricted. Each Content Relation involves three resources. Following RDF terminology, the two related resources are called subject and object. The Content Relation resource is the third resource. It defines, besides other values, the predicate that expresses the relation in SPO/RDF.

- include figure 1: Linking

- include figure 2: Tagging

The difference between linking and tagging lies in the type parameter of the Content Relation, which is a fixed predicate of a system-internal ontology.

Restrictions for a Content Relation should be expressed through a Content Model. A Content Relation has (for now) no Content Model. The creator of a Content Relation dictates which ontology to use via the content relation predicate. Multiple ontologies can be used, stored and extended through the infrastructure, but each Content Relation uses exactly one ontology.

The direction of a relation is defined through RDF, from the left side (subject) to the right side (object). An inverse relation is not created automatically. This restriction does not exclude effective search requests such as “Which resources point to Resource Y with the following relationName …?”.
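The inverse lookup described above can be illustrated with a minimal Python sketch. It assumes an in-memory list of (subject, predicate, object) tuples standing in for the TripleStore; the function name `find_sources`, the sample objids and the `isPartOf` predicate are hypothetical and not part of the eSciDoc API.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

IS_TRANSLATION_OF = "http://my.ontology/content-relation#isTranslationOf"
IS_PART_OF = "http://my.ontology/content-relation#isPartOf"  # hypothetical predicate

# Hypothetical sample data: only the subject -> object direction is stored.
TRIPLES: List[Triple] = [
    ("escidoc:124", IS_TRANSLATION_OF, "escidoc:127"),
    ("escidoc:130", IS_TRANSLATION_OF, "escidoc:127"),
    ("escidoc:124", IS_PART_OF, "escidoc:140"),
]


def find_sources(triples: List[Triple], target: str, predicate: str) -> List[str]:
    """Return every subject that points to `target` via `predicate`."""
    return [s for (s, p, o) in triples if o == target and p == predicate]
```

Calling `find_sources(TRIPLES, "escidoc:127", IS_TRANSLATION_OF)` answers the "which resources point to Y" question without any stored inverse relation.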

Content Relations are not part of the resources they link; they have to be discovered by a separate request. Furthermore, they are not related to any Context and can therefore relate resources across context borders.

Versions and Relation Types

Content Relations are aware of the versions of source and target. This means that a link or tag can be set to a certain version. It does not mean that older versions of a modified Content Relation can be retrieved: the Content Relation resource itself is non-versioned. In line with the requirements, Content Relations extend the two well-known framework-internal reference kinds (fixed and floating). References can be set for one certain version, or from and to certain versions of the resources.

A fixed relation is a reference to or from a certain version number of a resource. Example: a relation between version 3 of Item X and version 1 of Item Y, where both sides of the relation are fixed references.

A floating relation is a reference that has the whole resource (with all versions) as source and/or target. The resource(s) can be developed further and the relation still references the newest version. Example: a relation between version 2 of Item X and the resource Item Y. The reference to Item Y is a floating reference. Version 2 of Item X continuously references the newest version of Item Y, even if Item Y is updated in the meantime.
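The distinction can be sketched in Python as follows. The sketch assumes that object identifiers have the form `prefix:number` with an optional `:version` suffix, as in the XML example of this page (`escidoc:124` versus `escidoc:127:3`); the names `Reference` and `parse_reference` are hypothetical.

```python
from typing import NamedTuple, Optional


class Reference(NamedTuple):
    objid: str               # identifier without version suffix
    version: Optional[int]   # None means floating reference


def parse_reference(objid: str) -> Reference:
    """Classify an objid as a fixed (versioned) or floating reference."""
    parts = objid.split(":")
    if len(parts) == 3 and parts[2].isdigit():
        # Version suffix present: fixed reference to that version.
        return Reference(":".join(parts[:2]), int(parts[2]))
    if len(parts) == 2:
        # No version suffix: floating reference to the whole resource.
        return Reference(objid, None)
    raise ValueError(f"unexpected objid format: {objid!r}")


def is_floating(ref: Reference) -> bool:
    return ref.version is None
```

Under these assumptions, `parse_reference("escidoc:127:3")` yields a fixed reference to version 3, while `parse_reference("escidoc:124")` yields a floating one.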

Searching for Content Relations

Content Relations are not part of the tagged or linked resource. A search on a resource can therefore not be expected to return all relations to it. Depending on the search index, the Content Relations themselves can be found via search.

Searching for Content Relations will be handled through a filter interface of the Content Relations handler. This search index is kept in sync with the resource status and is based on the TripleStore. Solutions/users are not given direct access to the TripleStore. The handler provides methods that accept selected filter parameters and return lists of resource references.

A Content Relation has the same states and public-status workflow as an Item or Container. The public-status rules have to be the same as for Item and Container.
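A minimal Python sketch of such a public-status workflow is given below. The action names follow the handler methods listed on this page (submit, release, revise, withdraw), and the status values `submitted` and `released` appear in the examples here. The remaining status names (`pending`, `in-revision`, `withdrawn`) and the exact transition table are assumptions for illustration and are not confirmed by this page.

```python
# Assumed transition table: (current status, action) -> next status.
TRANSITIONS = {
    ("pending", "submit"): "submitted",
    ("submitted", "release"): "released",
    ("submitted", "revise"): "in-revision",
    ("in-revision", "submit"): "submitted",
    ("released", "withdraw"): "withdrawn",
}


def apply_action(status: str, action: str) -> str:
    """Return the status after `action`, or raise if the action is not allowed."""
    try:
        return TRANSITIONS[(status, action)]
    except KeyError:
        raise ValueError(
            f"action {action!r} is not allowed in status {status!r}"
        ) from None
```

Keeping the rules in a single table makes it straightforward to share them between Item, Container and Content Relation if the rules are indeed identical.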

Metadata sets are handled differently from Item and Container. A Content Relation does not require a metadata section; md-records are completely optional. If an md-record with the name of the default md-record is set, the core attempts an automatic mapping to DC.

Content Relation Resource XML Representation

Three major sections structure the Content Relation resource. The first section contains the resource properties, which describe, as in all other resources, the resource itself. Values of this section are: creator, modifier, creation-date, content-model, … The second section contains the relations to resources, with optional elements for version restrictions. The third section describes the relation between the specified resources. This includes, besides the type of the Content Relation, a general description element and an optional set of metadata.

<?xml version="1.0" encoding="UTF-8"?>
<escidoc:content-relation
       xmlns:escidoc="http://www.escidoc.de/schemas/content-relation/0.1"
       xmlns:prop="http://escidoc.de/core/01/properties/"
       xmlns:srel="http://escidoc.de/core/01/structural-relations/"
       xmlns:xlink="http://www.w3.org/1999/xlink"
       xmlns:xml="http://www.w3.org/XML/1998/namespace"
       objid="escidoc:123" last-modification-date="2009-07-21T08:36:49.626Z">

<prop:properties>
      <prop:creation-date>2009-07-21T08:36:49.625Z</prop:creation-date>
      <srel:created-by objid="escidoc:exuser1"/>
      <srel:modified-by objid="escidoc:exuser1"/>
      <prop:public-status>released</prop:public-status>
      <prop:public-status-comment>Changed to released.</prop:public-status-comment>
      <prop:visibility>public/private/organizational-unit</prop:visibility>
      <prop:pid>hdl:12345/6789</prop:pid>
</prop:properties>

 <!-- The actual relation section.
      If objid (or href) is given without a version number, a floating reference is selected.
      If objid has a version suffix, a fixed reference is selected. -->
<escidoc:subject objid="escidoc:124" />
<escidoc:object objid="escidoc:127:3" />

 <!-- The type of the relation is defined through an ontology. -->
<escidoc:type>http://my.ontology/content-relation#isTranslationOf</escidoc:type>
<escidoc:description>Description for the relation.</escidoc:description>

<escidoc:md-records>
   <escidoc:md-record name="escidoc"/>
</escidoc:md-records>

</escidoc:content-relation>
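As an illustration of how a client might read such a representation, the following Python sketch extracts the subject, object and type from a reduced Content Relation document using the standard library. It assumes that the relation elements live in the content-relation namespace of the root element; the function name `read_relation` and the reduced sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the content-relation schema as used in the example on this page.
NS = {"cr": "http://www.escidoc.de/schemas/content-relation/0.1"}

# Reduced sample document (properties and md-records omitted).
SAMPLE = """\
<escidoc:content-relation
    xmlns:escidoc="http://www.escidoc.de/schemas/content-relation/0.1"
    objid="escidoc:123">
  <escidoc:subject objid="escidoc:124"/>
  <escidoc:object objid="escidoc:127:3"/>
  <escidoc:type>http://my.ontology/content-relation#isTranslationOf</escidoc:type>
</escidoc:content-relation>
"""


def read_relation(xml_text: str) -> dict:
    """Return objid, subject, object and type of a Content Relation document."""
    root = ET.fromstring(xml_text)
    return {
        "objid": root.get("objid"),
        "subject": root.find("cr:subject", NS).get("objid"),
        "object": root.find("cr:object", NS).get("objid"),
        "type": root.findtext("cr:type", namespaces=NS),
    }
```

The returned `object` value keeps its version suffix, so the caller can still decide whether the reference is fixed or floating.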

Content Relation Interface

Content Relations are supported through the Content Relations Handler. Besides the standard create, update, retrieve and delete methods, methods to find Content Relations are supported.

  • create (relation.xml) (REST: PUT /content-relation/)

creates a Content Relation resource; checks whether structure and values conform to the content model and whether the related resources exist

  • retrieve(relation-objid) (REST: GET /content-relation/<objid>)

checks permissions and delivers representation

  • update(relation-objid, relation.xml) (REST: PUT /content-relation/<objid>)

checks permissions, whether structure and values conform to the content model, and whether the related resources exist

  • delete (relation-objid) (REST: DELETE /content-relation/<objid>)

marks resource as deleted

  • submit(objid, taskParam)
  • release(objid, taskParam)
  • revise(objid, taskParam)
  • withdraw(objid, taskParam)
  • assignObjectPid(objid, taskParam)
  • retrieveRelatedResourcesRefs(..)

These methods translate the method parameters into a TripleStore request. How the result is rendered as resource references still has to be specified (REST: POST method like filter, or GET method like SRW?).

  • ..

Response messages from methods that resolve resource relations return only resource references, not full representations. The solution/user has to issue a second request to retrieve the representations it needs.

<?xml version="1.0" encoding="UTF-8"?>
<escidoc:content-relation-list 
       xmlns:escidoc="http://www.escidoc.de/schemas/content-relation-list/0.1"
	xmlns:srel="http://escidoc.de/core/01/structural-relations/"
	xmlns:xlink="http://www.w3.org/1999/xlink"
       xmlns:xml="http://www.w3.org/XML/1998/namespace"
       last-modification-date="2009-07-21T08:36:49.626Z">

    <srel:item objid="escidoc:201" />
    <srel:container objid="escidoc:202" />
    <srel:context objid="escidoc:203" />
    <srel:organizational-unit objid="escidoc:204" />

</escidoc:content-relation-list>
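A client would typically turn such a list into (resource type, objid) pairs and then issue a second request per reference. The Python sketch below shows only the first step, on a reduced sample list; the function name `list_references` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Reduced sample response with two references.
SAMPLE_LIST = """\
<escidoc:content-relation-list
    xmlns:escidoc="http://www.escidoc.de/schemas/content-relation-list/0.1"
    xmlns:srel="http://escidoc.de/core/01/structural-relations/">
  <srel:item objid="escidoc:201"/>
  <srel:container objid="escidoc:202"/>
</escidoc:content-relation-list>
"""


def list_references(xml_text: str) -> list:
    """Return (resource type, objid) pairs for every reference in the list."""
    root = ET.fromstring(xml_text)
    refs = []
    for child in root:
        # child.tag looks like '{namespace}item'; keep only the local name.
        local_name = child.tag.rsplit("}", 1)[-1]
        refs.append((local_name, child.get("objid")))
    return refs
```

The resource type tells the caller which handler to address in the follow-up request for the full representation.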

Ontology Handler

The concept is based on ontologies. These ontologies are either built-in or created by users. It is expected that the set of attributes of an ontology has to grow with the repository. This means that the ontologies in use have to be updated and extended. Altering ontologies is therefore a foundation of the Content Relation concept. The concept of ontologies is so generic and so useful for other resources that it is worth a handler of its own. An ontology handler has to provide methods to create, update, delete and retrieve whole ontologies. In addition, a filter for certain attributes is a basic requirement.

Ontology Methods

  • createRelationOntology(ontology.rdf)
  • retrieveRelationOntology(ontology-id ??)
  • updateRelationOntology(ontology-id ??, ontology.rdf)


  Release 1.2 does not offer the ontology methods. The eSciDoc Core framework supports only one ontology: either a default ontology or an ontology configured by specifying its URL in the escidoc config file.

Core Internal Representation

Internally, each Content Relation corresponds to one Fedora object. Properties, related resource values and the ontology used are written to RELS-EXT using the reification method. Each md-record is kept as a Fedora datastream. Version number restrictions are implemented as triples for start and end.

  RELS-EXT of a Fedora Object representing a Statement (aka Content Relation):


<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:prop="http://escidoc.de/core/01/properties/"
  xmlns:srel="http://escidoc.de/core/01/structural-relations/"
  xmlns:crel="http://escidoc.de/core/01/content-relations/">
 <rdf:Description rdf:about="info:fedora/escidoc:1234">
   <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
   <rdf:subject rdf:resource="info:fedora/escidoc:persistent11"/>
   <rdf:predicate rdf:resource="http://mpdl.escidoc-project.de/controlled/relations/isPredecessorOf"/>
   <rdf:object rdf:resource="info:fedora/escidoc:persistent1"/>
   <crel:subject-version-number>2</crel:subject-version-number>
   <crel:object-version-number>5</crel:object-version-number>
   <srel:created-by rdf:resource="info:fedora/escidoc:systemadministrator"/>
   <prop:created-by-title>Test System Administrator User</prop:created-by-title>
   <srel:modified-by rdf:resource="info:fedora/escidoc:user42"/>
   <prop:modified-by-title>Peter</prop:modified-by-title>
   <prop:public-status>submitted</prop:public-status>
 </rdf:Description>
</rdf:RDF>
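To show how the reified statement can be read back, the following Python sketch parses a reduced RELS-EXT document with the standard library and returns subject, predicate, object and the two version numbers. The function name `read_statement` is hypothetical, and treating a missing version element as a floating reference is an assumption.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "crel": "http://escidoc.de/core/01/content-relations/",
}
RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"

# Reduced sample (creator, modifier and status triples omitted).
SAMPLE_RELS_EXT = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:crel="http://escidoc.de/core/01/content-relations/">
 <rdf:Description rdf:about="info:fedora/escidoc:1234">
   <rdf:subject rdf:resource="info:fedora/escidoc:persistent11"/>
   <rdf:predicate rdf:resource="http://mpdl.escidoc-project.de/controlled/relations/isPredecessorOf"/>
   <rdf:object rdf:resource="info:fedora/escidoc:persistent1"/>
   <crel:subject-version-number>2</crel:subject-version-number>
   <crel:object-version-number>5</crel:object-version-number>
 </rdf:Description>
</rdf:RDF>
"""


def read_statement(xml_text: str) -> dict:
    """Extract the reified triple and version numbers from a RELS-EXT document."""
    desc = ET.fromstring(xml_text).find("rdf:Description", NS)

    def resource(tag: str) -> str:
        return desc.find(tag, NS).get(RDF_RESOURCE)

    def version(tag: str):
        # Assumption: a missing version element means a floating reference.
        text = desc.findtext(tag, default=None, namespaces=NS)
        return int(text) if text else None

    return {
        "subject": resource("rdf:subject"),
        "predicate": resource("rdf:predicate"),
        "object": resource("rdf:object"),
        "subject_version": version("crel:subject-version-number"),
        "object_version": version("crel:object-version-number"),
    }
```
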

Concept limitations

  • Non-versioned
  • Automatic bi-directional relations are not supported
  • Transition/migration from Tag to Link (and vice versa)
  • CQL as filter language (translation to TripleStore requests unclear)

Changes

21. Aug:

  • Content Relations between more than two resources were rejected during the video conference on 18 August
  • Rework Tag: Tags are now nearly identical to links. The type of the Content Relation and the Content Model of the subject resource qualify a tag