Talk:ESciDoc Content Relations

From MPDLMediaWiki
Revision as of 13:47, 17 December 2008 by Frank (talk | contribs) (→‎MPDL input after VidCo 16.12.2008)

Intro

The complete previous content of the article page was moved to this discussion page for better reference. --Natasa 12:39, 10 November 2008 (UTC)

Definition

A content relation in eSciDoc is a resource that relates two other resources. A content relation can relate:

  • Content item with another content item
  • Content container with another content container
  • Content item with a content container


Note: content relations are binary and bi-directional relations, i.e.

  • a content relation can be established only between two content objects
  • a content relation a->b also assumes a content relation b->a
  • directions a->b and b->a may have different labels (e.g. isRevisionOf, hasRevisions) that are expressed with a specific relation ontology

What content relations are not

Content relations should in general not be confused with structural relations (isMemberOf, hasMembers) and administrative relations (isContextFor, hasContext), which are reserved in the present architecture for relations that describe:

  • relations between Items or Containers and a Context
  • relations between Containers and members of Containers
  • relations between parent and child organizational units
Why is "isParentOf" a structural relation and "isPredecessorOf" a content relation? The nature of these relations seems quite similar to me --Inga 22:06, 5 December 2007 (CET)
"hasParent" is a relation/predicate that is used in eSciDoc to state the hierarchy of OUs and can not be explicitly set by the solution. Note: That is not true for "isParentOf". Frank 08:38, 6 December 2007 (CET)
Isn't "isParentOf" the opposite direction of "hasParent"? We need to think of how to make certain that it is not reused for other purposes, i.e. for "content relations" --Natasa 18:33, 7 December 2007 (CET)
I think that is done by defining an ontology (explicit or implicit). The relation "hasParent" has a particular "namespace". If the inverse relation "isParentOf" is valuable/important it should be defined (in the same namespace) and marked as inverse of "hasParent". Someone else may define a relation "isParentOf" for another purpose in another namespace and we really don't care. For example: The usage of <http://example.de/ontologies/escidoc-content-relations/isParentOf> does not collide with the usage of <http://escidoc.de/ontologies/structural-relations/isParentOf>. Maybe we should be more formal in the usage of "relation-names"; if someone says "has parent" and later "is parent of" I recognize it as different directions of the same thing, but that depends on the spoken words. In a formal system the semantics should not depend on words (better: not on the semantics of words as they are used in a natural language) but on formal definitions. Therefore "isParentOf" is NOT the opposite of the structural relation "hasParent", because it is not defined formally. As I understand, in your proposal such a definition for content relations may be done by the relation-type-object. Frank 19:12, 7 December 2007 (CET)
i really don't get, why one would need a relation "isParentOf" - which should be "inverse" of "hasParent", if the "hasParent" relation is already defined. that seems simply redundant. the only additional information is the naming, which may simply be handled by a presentation layer. Robert 20:22, 7 December 2007 (CET)
As I understand, the inverse is primarily meant to make it possible to communicate a relation in the other direction. E.g. in the item xml all relations are listed in which the item is "source". If the inverse of "isRevisionOf" is defined as "hasRevision" and there is a relation <x> <isRevisionOf> <y>, it would be possible to list <hasRevision target="x"/> in the xml representation of y. For query purposes the inverse is indeed not necessary. Frank 11:29, 10 December 2007 (CET)
To avoid more discussions my understanding (and the concept given on relations) is that <http://example.de/ontologies/escidoc-content-relations/isParentOf> is different from <http://example.de/ontologies/external-content-relations/isParentOf>; in addition <http://example.de/ontologies/escidoc-content-relations/isParentOf> as a relation type has the label "isParentOf" and the "inverse direction" label "hasParent" (nice to have for display to the end user - that is what is actually described with Relation type - the query still uses $subject "isParentof" <thisobject> - to get the information from the data). So yes - it is a single relation type for relations which state A is parent of B (and in case when relations are not separate resources)! --Natasa 15:22, 10 December 2007 (CET)
We need to understand better what actually happens when relations are separate resources and act as objects on their own (which is actually the reason for the relation handler proposed here) :). But maybe then we really need to use a more comprehensive example - let's take for example the use case of creating a new revision for an item. The user starts from Item1 and would like to create a new revision of Item1. That would mean that the user creates Item2 and creates a relation object (containing the revision comment and the date when this relation is created). The relation object contains the link to Item1 and the link to Item2. The relation object in this case represents the relation "isRevisionOf-hasRevision". The relation object is the source for both directions and Item1 and Item2 are targets. Now, if we do not want to lose the semantics of the relation then we really need to name both directions as they are labeled respectively. --Natasa 15:30, 10 December 2007 (CET)

Relation types

  • defines the meaning of how two resources (e.g. items, container) are related. This meaning is expressed with an ontology (such as Fedora relation ontology, eSciDoc relation ontology) and labels for both directions of the relation
  • defines the versioning rules for relation i.e. floating or static relation type
    • floating-versions relation type is valid for all versions of two related resources which were created after the time the relation was created
    • exact-versions relation type is valid only for explicitly related versions of two related resources. Previous and later versions than those related with the relation are not considered as participants in the relation.
  • (optionally) relation type can define which resources can be related (by use of cmodel property) and in which cardinality they can be related e.g.

Examples for content relation types

  • Annotations (isAnnotationOf, hasAnnotations)
  • Revisions (isRevisionOf, hasRevisions)
  • History (isPredecessorOf, isSuccessorOf)
  • Translation (isTranslationOf, hasTranslation) [ulla]
  • WALS Project:
    • hasFeature/hasLanguage (between Language, LanguageFeature)
  • Format (isFormatOf, hasFormat) [ulla..still not sure on the semantics. what i mean is eg the scenario, that an article is related to a talk at event, where the same content is presented, but in different media/format/mimetype and maybe for different target group]
This requirement is not clear for me, i.e. how to distinguish this from "isRevisionOf". Could we agree to postpone this content relation type? --Inga 22:06, 5 December 2007 (CET)

Example definitions of a relation type

  • Type (SourceToTarget label): isRevisionOf
  • TargetToSource label: hasRevisions
  • Versioning rule: floating

(Optionally can be defined)

  • Source CModel: PubItem
  • SourceToTarget cardinality: 0..M
  • Target CModel: PubItem
  • TargetToSourceCardinality: 0..M
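The example definition above could be modelled roughly as follows. This is only a minimal Python sketch: the class name RelationType, its field names and the methods label_for and allows are hypothetical and not part of any eSciDoc interface, and the cardinalities are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RelationType:
    """One relation type with a label for each direction (hypothetical model)."""
    source_to_target: str              # e.g. "isRevisionOf"
    target_to_source: str              # e.g. "hasRevisions"
    versioning_rule: str               # "floating" or "exact"
    source_cmodel: Optional[str] = None
    target_cmodel: Optional[str] = None

    def label_for(self, seen_from_source: bool) -> str:
        """Return the display label for the requested reading direction."""
        return self.source_to_target if seen_from_source else self.target_to_source

    def allows(self, source_cmodel: str, target_cmodel: str) -> bool:
        """Check the optional content-model restrictions; None means unrestricted."""
        source_ok = self.source_cmodel in (None, source_cmodel)
        target_ok = self.target_cmodel in (None, target_cmodel)
        return source_ok and target_ok


# The definition from the list above, expressed with the hypothetical class.
revision = RelationType("isRevisionOf", "hasRevisions", "floating",
                        source_cmodel="PubItem", target_cmodel="PubItem")
```

The point of the sketch is that there is a single relation type and that the second label is only a display aid for reading the same relation from the target side.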

Relation

A relation (object) is a resource with the following structure:

  • Relation properties (mandatory)
    • relation type
    • date of creation
    • date of last modification
    • user who created the relation
    • status (pending, submitted?, released) (TBD)
    • visibility (public, organizational unit, private) (TBD)
  • Relation metadata (TBD) (optional)
    • relation comment
    • persistent identifier

Note: relation metadata are dependent on the relation type in general

  • Component(TBD) (optional)

Note: not yet clear if one would like to enrich the relation also with a component content. Annotations are such resources

Unlike content items and containers, the relation resources are not bound to a context; their virtual "context" is rather the associated relation type and the resources they relate.

Resource handler interface

Below is a proposal for the resource handler interface of the content relations. It is not yet certain whether the relation ontologies need another interface within the same resource handler or a separate handler. My vote is for simply having a separate interface for relation ontologies within the same handler.

Relation ontology methods

  • createRelationOntology(relation-ontology.xml*)
    • Description: creates a relation ontology (together with all types of relations)
  • retrieveRelationOntology(relation-ontology-id*)
    • Description: retrieves a relation ontology (together with all types of relations)
  • updateRelationOntology(relation-ontology.xml*)
    • Description: updates a relation ontology (together with all types of relations) (in case when the relation-ontology.xml is released, this should actually mean only allow for creation of new types of relations)
  • deleteRelationOntology(relation-ontology-id*)
    • Description: deletes a relation ontology if not in status "released"
  • releaseRelationOntology(relation-ontology-id*)
    • Description: releases a relation ontology if in status "created" - After a release a relation ontology and its relation types can be used to relate items in the system
  • withdrawRelationOntology(relation-ontology-id*)
    • Description: withdraws a relation ontology if in status "released" - After a withdrawal a relation ontology and its relation types can not be used to relate items in the system (existing relations are however still valid?).
  • retrieveRelationOntologies()
    • Description: retrieves a short list of all relation ontologies available in the system (short list to be defined)
  • retrieveRelationTypes (relation-ontology*)
    • Description: retrieves a list of relation types (and their definition) available for a specified ontology
  • retrieveRelationTypesForModel(cmodel-id*)
    • Description: retrieves a list of relation types (and their definition) available for a specified content model (if any)
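The status rules in the method descriptions above (created, released, withdrawn) can be read as a small state machine. The following Python sketch is only an illustration under assumptions: the class and exception names are hypothetical, and the behaviour of update on a withdrawn ontology is not specified in the proposal, so it is left unrestricted here.

```python
class OntologyStateError(Exception):
    """Raised when a method is not allowed in the current status."""


class RelationOntology:
    """Hypothetical in-memory stand-in for a stored relation ontology."""

    def __init__(self, ontology_id, relation_types):
        self.ontology_id = ontology_id
        self.relation_types = set(relation_types)
        self.status = "created"
        self.deleted = False

    def release(self):
        if self.status != "created":
            raise OntologyStateError("only a created ontology can be released")
        self.status = "released"

    def withdraw(self):
        if self.status != "released":
            raise OntologyStateError("only a released ontology can be withdrawn")
        self.status = "withdrawn"

    def update(self, relation_types):
        new_types = set(relation_types)
        # Once released, existing types must stay; only additions are accepted.
        if self.status == "released" and not self.relation_types <= new_types:
            raise OntologyStateError("released ontology: types may only be added")
        self.relation_types = new_types

    def delete(self):
        if self.status == "released":
            raise OntologyStateError("a released ontology cannot be deleted")
        self.deleted = True

    def usable(self):
        """Relation types can relate items only while the ontology is released."""
        return self.status == "released"
```

Whether existing relations stay valid after a withdrawal is the open question from the list above and is not modelled here.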

Content relation methods

Note: As this is only conceptual work, it is subject to change until agreed (* for mandatory parameters, ? for optional parameters)

  • create (relation.xml*)
  • retrieve(relation-id*)
  • update(relation-id*, relation.xml*)
  • delete (relation-id*)

To discuss(1)

Task methods below to be discussed (otherwise, standard task-params)

  • submit(relation-id*)
  • release(relation-id*)
  • withdraw(relation-id*)

To discuss(2)

  • retrieveRelatedResourceRefs(resource-id*, relation-ontology*, relation-type?, direction<target, both>?)
    • Description: retrieves a list of IDs of resources related to the specified resource-id within the specified relation-ontology, (optionally) restricted to the specified relation-type and the specified direction.
      • If a relation-type is not specified, it retrieves a list of IDs of all resources related to the specified resource-id within the given relation-ontology (i.e. the resource-id is either target or source of the relation).
      • If direction is specified as "target", it retrieves relations of the (optionally) specified relation-type where the resource-id is the source
      • If direction is specified as "both", it retrieves relations of the (optionally) specified relation-type where the resource-id is either the source or the target
      • The output should contain the relation type for each related resource within a relation-ontology and the exact direction
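A minimal Python sketch of the method described above, working on an in-memory list of (source, relation, target) triples. It assumes that relation-ontology is a namespace prefix of the relation URI and that relation-type is the local name; the function name, the default direction and the example namespace are hypothetical.

```python
def retrieve_related_resource_refs(triples, resource_id, relation_ontology,
                                   relation_type=None, direction="both"):
    """Return (related-id, relation-uri, role) tuples for resource_id.

    triples: iterable of (source, relation-uri, target).
    role says whether resource_id was the "source" or the "target".
    """
    if direction not in ("target", "both"):
        raise ValueError("direction must be 'target' or 'both'")
    wanted = relation_ontology + relation_type if relation_type else None
    result = []
    for source, relation, target in triples:
        if not relation.startswith(relation_ontology):
            continue
        if wanted is not None and relation != wanted:
            continue
        if source == resource_id:
            result.append((target, relation, "source"))
        elif target == resource_id and direction == "both":
            result.append((source, relation, "target"))
    return result


NS = "http://example.org/relations/"   # hypothetical ontology namespace
TRIPLES = [
    ("escidoc:2", NS + "isRevisionOf", "escidoc:1"),
    ("escidoc:3", NS + "isRevisionOf", "escidoc:2"),
    ("escidoc:9", NS + "isAnnotationOf", "escidoc:2"),
]

refs = retrieve_related_resource_refs(TRIPLES, "escidoc:2", NS, direction="target")
```

Each result carries the relation URI and the role of the queried resource, which covers the requirement that the output states the relation type and the exact direction.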


MPDL input after VidCo 16.12.2008

  • we realize that there is probably no need to have relation handler as separate handler:
    • item handler can be used to define relation objects
    • item handler must make certain to validate the following properly (agreed):
      • a relation object can only be modified in its metadata and possibly in its components
      • if the target or source of a relation object had to be modified, this would not be allowed, as it would have to be a new relation object
      • item handler must make sure that it does not allow target and source properties
      • Question: shall item.xml again be modified to allow for target/source properties? How will item.xml look like? --Natasa 15:15, 16 December 2008 (UTC)
    • having relation objects via item handler has several advantages: components, member of containers (e.g. my collection of annotations)
    • method such as: retrieveRelatedResourceRefs can be basically a search on items with cmodel of the relation object
  • relation ontology - will be created in special ontology context as item and it will contain all relation types
    • properly structuring it will enable nice searches
Kind and procedure of storing and managing ontologies may be further discussed. FIZ will check different approaches. Frank 11:47, 17 December 2008 (UTC)
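The rule agreed above, that source and target of a relation object may not be modified, could be checked roughly as follows. This is a Python sketch under assumptions: plain dicts stand in for the relation object, and the function name and the field names "source" and "target" are hypothetical, since the shape of item.xml for relation objects is still open.

```python
IMMUTABLE_FIELDS = ("source", "target")


def validate_relation_update(stored, proposed):
    """Reject an update that would change the related resources.

    Both arguments are plain dicts standing in for the relation object.
    """
    for name in IMMUTABLE_FIELDS:
        if stored.get(name) != proposed.get(name):
            raise ValueError(f"{name} cannot be changed; create a new relation object")
    return proposed


stored = {"source": "escidoc:77", "target": "escidoc:88", "comment": "first draft"}
updated = validate_relation_update(stored, dict(stored, comment="revised comment"))
```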

Use cases


Bi-directional / inverse Relations

  • From Definition: "a content relation a->b also assumes a content relation b->a" and "directions a->b and b->a may have different labels (e.g. isRevisionOf, hasRevisions) that are expressed with a specific relation ontology"
    • Does that mean, we have to store two relations for every added relation and that we have to check provided ontologies for completeness in the sense that every predicate must have a counterpart? Frank 17:52, 5 December 2007 (CET)
    • I would assume that only one relation is created (either isRevisionOf or hasRevisions). Anyway, the addition of an entry in the RELS-EXT datastream may require access rights to the source object. Therefore, both "directions" might be required - even if they are equal from a functional point of view --Inga 22:06, 5 December 2007 (CET)

The latter example shows: a relation has a relation type, and this type should state the name and the inverse name of a relation. If we think about defining ontologies to describe possible content relations and to store relations as triples, there may be some drawbacks. On the other hand, semantic technologies may help.

  • An ontology may define
    • the inverse of a predicate (relation),
    • which type/entity has a predicate (more in the sense of which predicates have a type)(here an instance of the type/the entity is the subject in a statement/triple with that predicate),
    • of which type is the "value" (the object of the statement/triple) of the predicate and
    • whether the predicate may occur repeatedly in a (subject) entity (cf. owl:FunctionalProperty).

Content relations are defined by the solution - they are not predefined in the infrastructure. Therefore we decided to allow the "registration" of ontologies in the infrastructure. Those ontologies describe the content relations (of a particular solution/application) and the infrastructure is able to verify concrete relations on creation or update.

If it is a precondition of a content relation to be bi-directional, we must refuse ontologies that do not define the inverse of a predicate. On the other hand: If an inverse is defined in the ontology and we have stored nice triples in a nice triple store, we may retrieve the inverse of a statement by inferring it. Frank 10:00, 6 December 2007 (CET)

As far as I remember, binary relations may be symmetric or not, but "bi-directional" does not seem to be a valid property of a relation. Of course, if a is related to b, b is related to a, that's the very meaning of "relation". So i guess, the whole discussion is one of labels. 1 < 2 means: 1 is less than 2, i.e. 1 and 2 are related via the "less than" relation, no matter whether we find a nicer way to describe that 2 in this case is "on the right side" of the relation or not. Robert 11:27, 6 December 2007 (CET)

I absolutely agree. I understand from Natasa's description that "bi-directional" means: If there is a relation with a "source" and a "target", there is ANOTHER one where the "source" is the "target" and the "target" is the "source". In an ontology the direction is clear from the definition of the relation as property of a resource. Even if the property does not define a domain and/or a range, a statement does NOT indicate the existence of an inverse property. There is a special case of symmetric property. A relation R is symmetric if for any x,y, R(x,y) iff R(y,x). A "bi-directional content relation" seems to consist of two different relations. Frank 13:57, 6 December 2007 (CET)

This "inverse relation" thing would mean a big limitation, right? If the "inverse predicate" is nothing but the passive voice, i.e. "a beats b" and "b is beaten by a", it doesn't add much; if not, it possibly shouldn't be anything automatic - in particular if creating a relation requires privileges to update the source object. Robert 15:10, 7 December 2007 (CET)

I think there is too much discussion about "inverse relation" => by carefully reading the proposal (which probably was not entirely clear): There is a relation type, and all else are "labels" -> these do not mean inverse relations, these only mean comfort features. That's how I understood it in fact.--Natasa 12:38, 10 November 2008 (UTC)

TODO: With Mulgara: store an ontology with the definition that property A is inverseTo B, store a statement <x> <A> <y>, query for <B> and get <y> <B> <x> as answer.
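Before trying the TODO above against a triple store, the expected behaviour can be sketched in plain Python. This is not Mulgara code and makes no claim about how Mulgara infers inverses; the function names and the dict of inverse declarations are assumptions for illustration.

```python
def with_inverses(statements, inverse_of):
    """Return statements plus those implied by inverse declarations.

    inverse_of maps a predicate to its declared inverse; declarations are
    treated as symmetric, so {"A": "B"} also covers B -> A.
    """
    both_ways = dict(inverse_of)
    both_ways.update({b: a for a, b in inverse_of.items()})
    inferred = set(statements)
    for subject, predicate, obj in statements:
        if predicate in both_ways:
            inferred.add((obj, both_ways[predicate], subject))
    return inferred


def query(statements, predicate):
    """Return all statements that use the given predicate, sorted."""
    return sorted(s for s in statements if s[1] == predicate)


stored = [("x", "A", "y")]
answer = query(with_inverses(stored, {"A": "B"}), "B")
```

Only the statement <x> <A> <y> is stored; the statement for B exists only in the inferred view, which matches the idea that the inverse is derived and not a second stored relation.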

which resources can be related

The type (or class) of related resources may be defined in the ontology too.

    <rdf:Property rdf:about="isSomethingOf">
        <rdfs:domain rdf:resource="http://www.escidoc.de/ontologies/resources/Item"/>
        <rdfs:range rdf:resource="http://www.escidoc.de/ontologies/resources/Container"/>
        <owl:inverseOf rdf:resource="hasSomething"/>
    </rdf:Property>
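How the infrastructure might verify a concrete relation against such a definition can be sketched in Python. The dictionary below is a hand-written stand-in for the parsed ontology, and the function name is hypothetical; subclass reasoning is not considered.

```python
ITEM = "http://www.escidoc.de/ontologies/resources/Item"
CONTAINER = "http://www.escidoc.de/ontologies/resources/Container"

# Property definition mirroring the rdfs:domain / rdfs:range example above.
PROPERTIES = {
    "isSomethingOf": {"domain": ITEM, "range": CONTAINER},
}


def is_valid_relation(predicate, source_type, target_type, properties=PROPERTIES):
    """True if the predicate is registered and both resource types fit."""
    definition = properties.get(predicate)
    if definition is None:
        return False
    return (definition["domain"] == source_type
            and definition["range"] == target_type)


ok = is_valid_relation("isSomethingOf", ITEM, CONTAINER)
```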

how to store relations as triples

In order to validate a particular relation (statement) against the ontology definition there must be a triple: <source-uri> <relation> <target-uri>. With relation objects I see the following cases of implicit triples:

  • A relation object that holds all information (incl. the name) of a single relation. For example (format is N-Triples):
<http://escidoc.de/content-relation/escidoc:1> <http://escidoc.de/ontologies/properties/name> "isSomethingOf".
<http://escidoc.de/content-relation/escidoc:1> <http://escidoc.de/ontologies/properties/source> <http://escidoc.de/ir/item/escidoc:77>.
<http://escidoc.de/content-relation/escidoc:1> <http://escidoc.de/ontologies/properties/target> <http://escidoc.de/ir/item/escidoc:88>.
  • A relation object that references a relation type object. For example (format is N-Triples):
<http://escidoc.de/content-relation/escidoc:1> <http://escidoc.de/ontologies/relations/type> <http://escidoc.de/relation-type/escidoc:2>.
<http://escidoc.de/content-relation/escidoc:1> <http://escidoc.de/ontologies/properties/source> <http://escidoc.de/ir/item/escidoc:77>.
<http://escidoc.de/content-relation/escidoc:1> <http://escidoc.de/ontologies/properties/target> <http://escidoc.de/ir/item/escidoc:88>.
<http://escidoc.de/relation-type/escidoc:2> <http://escidoc.de/ontologies/properties/name> "isSomethingOf".
<http://escidoc.de/relation-type/escidoc:2> <http://escidoc.de/ontologies/relations/inverse> "hasSomething".

The following triple may be explicitly stored but then must be kept in sync with the relation object(s):

<http://escidoc.de/ir/item/escidoc:77> <isSomethingOf> <http://escidoc.de/ir/item/escidoc:88>.

Note: In the latter example the "relation-name" needs a namespace, which then must occur in the first two examples too. Frank 15:02, 6 December 2007 (CET)
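Instead of storing the explicit triple and keeping it in sync, it could be derived from the relation object when needed. A Python sketch for the first case above, assuming all triples belong to one relation object; the function name and the relation namespace http://example.org/relations/ are hypothetical.

```python
PROPS = "http://escidoc.de/ontologies/properties/"
REL = "http://escidoc.de/content-relation/escidoc:1"

# The relation object from the first case, as (subject, predicate, object).
RELATION_OBJECT = [
    (REL, PROPS + "name", "isSomethingOf"),
    (REL, PROPS + "source", "http://escidoc.de/ir/item/escidoc:77"),
    (REL, PROPS + "target", "http://escidoc.de/ir/item/escidoc:88"),
]


def explicit_triple(relation_triples, relation_namespace):
    """Derive <source> <namespace + name> <target> from a relation object."""
    values = {pred: obj for _, pred, obj in relation_triples}
    return (values[PROPS + "source"],
            relation_namespace + values[PROPS + "name"],
            values[PROPS + "target"])


derived = explicit_triple(RELATION_OBJECT, "http://example.org/relations/")
```

The namespace has to be supplied from outside here, which illustrates the note above: the bare name "isSomethingOf" is not enough to identify the predicate.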

Idea: Reification Based

One may think of a Content Relation as a statement related to two objects. So, Content Relations can just be added by adding a predicate to the RELS-EXT of a Fedora object. In eSciDoc a Content Relation should be able to have metadata (which can be seen as statements about the Content Relation) and should be bound to a controlled vocabulary. That can be achieved by using RDF, RDFS etc.

without reification

Stored:

<info:fedora/escidoc:persistent11> <http://mpdl.escidoc-project.de/controlled/relations/isPredecessorOf> <info:fedora/escidoc:persistent1>

Query:

* <http://mpdl.escidoc-project.de/controlled/relations/isPredecessorOf> <info:fedora/escidoc:persistent1>

with reification

Stored reification:

<rdf:Statement rdf:about="a">
  <rdf:subject rdf:resource="info:fedora/escidoc:persistent11"/>
  <rdf:predicate rdf:resource="http://mpdl.escidoc-project.de/controlled/relations/isPredecessorOf"/>
  <rdf:object rdf:resource="info:fedora/escidoc:persistent1"/>
  <dc:title>Following Persistent 1</dc:title>
  <dc:identifier>a</dc:identifier>
</rdf:Statement>

Reified query:

SELECT $s
WHERE  $Statement <rdf:subject> $s
AND    $Statement <rdf:predicate> <http://mpdl.escidoc-project.de/controlled/relations/isPredecessorOf>
AND    $Statement <rdf:object> <info:fedora/escidoc:persistent1>
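What the query over the reified statement has to do can be shown with a small Python sketch over in-memory triples. It is not a triple store query; the function name is hypothetical and the triples are written out by hand from the rdf:Statement example.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PREDECESSOR = "http://mpdl.escidoc-project.de/controlled/relations/isPredecessorOf"

# Triples produced by the reified statement "a".
TRIPLES = [
    ("a", RDF + "type", RDF + "Statement"),
    ("a", RDF + "subject", "info:fedora/escidoc:persistent11"),
    ("a", RDF + "predicate", PREDECESSOR),
    ("a", RDF + "object", "info:fedora/escidoc:persistent1"),
]


def subjects_of(triples, predicate, obj):
    """Find subjects of reified statements with the given predicate and object."""
    by_statement = {}
    for statement, prop, value in triples:
        by_statement.setdefault(statement, {})[prop] = value
    return [
        props[RDF + "subject"]
        for props in by_statement.values()
        if props.get(RDF + "predicate") == predicate
        and props.get(RDF + "object") == obj
        and RDF + "subject" in props
    ]


found = subjects_of(TRIPLES, PREDECESSOR, "info:fedora/escidoc:persistent1")
```

Compared with the non-reified case, three patterns on the statement node replace the single triple pattern, which is the price for being able to attach metadata to the relation.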

in Fedora

RELS-EXT of a Fedora Object representing a Statement (aka Content Relation):

<rdf:RDF>
  <rdf:Description rdf:about="info:fedora/escidoc:1234">
    <rdf:type rdf:resource="rdf:Statement"/>
    <rdf:subject rdf:resource="info:fedora/escidoc:persistent11"/>
    <rdf:predicate rdf:resource="http://mpdl.escidoc-project.de/controlled/relations/isPredecessorOf"/>
    <rdf:object rdf:resource="info:fedora/escidoc:persistent1"/>
    ...
  </rdf:Description>
</rdf:RDF>

Note: In the above example 'rdf:Statement' is incorrectly used as attribute value instead of the full-qualified URI of RDF Statement.

Ontology

Which Content Relations can be added for specific objects in eSciDoc may be defined by one or more ontologies, which are simply stored in the infrastructure (and are retrievable, updatable, deletable). An ontology is usually identified by a namespace. A predicate (what is here seen as the relation thing between two objects) defined by an ontology is identified by a namespace and a (local) name. In the best case, the ontologies stored in the infrastructure are managed in a kind of triple store to enable CRUD operations on predicates.

An ontology may define one or more predicates. The following example is just the definition of a predicate - and therefore of a Content Relation (Type) - out of an ontology.

<rdf:Property rdf:ID="isSomethingOf">
	<rdfs:label>Is Something Of</rdfs:label>
	<rdfs:comment>A definition of the relationship between an Item and a Container.</rdfs:comment>
	<rdfs:domain rdf:resource="http://escidoc.de/core/01/resources/Item" />
	<rdfs:range rdf:resource="http://escidoc.de/core/01/resources/Container" />
	<rdfs:subPropertyOf rdf:resource="info:fedora/fedora-system:def/relations-external#fedoraRelationship"/>
</rdf:Property>
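Reading such a predicate definition out of a stored ontology can be sketched with the Python standard library. The wrapping rdf:RDF element with its namespace declarations is added here so that the fragment parses on its own; the function name is hypothetical and only label, domain and range are extracted.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

ONTOLOGY_XML = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdf:Property rdf:ID="isSomethingOf">
    <rdfs:label>Is Something Of</rdfs:label>
    <rdfs:domain rdf:resource="http://escidoc.de/core/01/resources/Item"/>
    <rdfs:range rdf:resource="http://escidoc.de/core/01/resources/Container"/>
  </rdf:Property>
</rdf:RDF>
"""


def read_properties(xml_text):
    """Return {property-id: {"label", "domain", "range"}} from RDF/XML."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for prop in root.findall(f"{{{RDF}}}Property"):
        domain = prop.find(f"{{{RDFS}}}domain")
        range_ = prop.find(f"{{{RDFS}}}range")
        result[prop.get(f"{{{RDF}}}ID")] = {
            "label": prop.findtext(f"{{{RDFS}}}label"),
            "domain": domain.get(f"{{{RDF}}}resource") if domain is not None else None,
            "range": range_.get(f"{{{RDF}}}resource") if range_ is not None else None,
        }
    return result


properties = read_properties(ONTOLOGY_XML)
```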

To discuss(2) from above

retrieveRelatedResourceRefs(resource-id*, relation-ontology*, relation-type?, direction<target, both>?)

resource-id: The id of the object involved in the requested relations.

relation-ontology: The part of an ontology describing the requested relations.

relation-type: The type of the requested relations.

direction: Whether the object should be subject of the relation or both (subject or object).

Maybe the following is sufficient:

retrieveRelations(resource-id*, relation-uri?, direction<subject,object>)

resource-id: The id of the object involved in the requested relations.

relation-uri: The fully qualified name of the predicate (which is defined in the ontology to allow that relation).

direction: Whether the object should be subject of the relation or both (subject or object).

Outcome

  • Clarified on the VidConf. It was not clear how the ontology and the relation type will be related. Assuming that each relation type carries the namespace of its ontology in its URI, there will be no need for an additional relation-ontology parameter. --Natasa 15:10, 16 December 2008 (UTC)
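If the relation URI carries the ontology namespace, the ontology can be recovered from the URI alone. A minimal Python sketch, assuming the namespace ends at the last '#' or '/' of the URI; the function name is hypothetical.

```python
def split_relation_uri(relation_uri):
    """Split a relation URI into (ontology namespace, local name).

    The split is made after the last '#' or '/', whichever comes later.
    """
    cut = max(relation_uri.rfind("#"), relation_uri.rfind("/")) + 1
    if cut == 0 or cut == len(relation_uri):
        raise ValueError(f"no local name in {relation_uri!r}")
    return relation_uri[:cut], relation_uri[cut:]


namespace, name = split_relation_uri(
    "http://mpdl.escidoc-project.de/controlled/relations/isPredecessorOf")
```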

Problems/Questions

  • Fixed vs. floating references.
<info:fedora/escidoc:persistent11:7> <http://mpdl.escidoc-project.de/controlled/relations/isPredecessorOf> <info:fedora/escidoc:persistent1>
<info:fedora/escidoc:persistent11> <http://mpdl.escidoc-project.de/controlled/relations/isPredecessorOf> <info:fedora/escidoc:persistent1:7>
<info:fedora/escidoc:persistent11:7> <http://mpdl.escidoc-project.de/controlled/relations/isPredecessorOf> <info:fedora/escidoc:persistent1:7>
  • Versioning
  • Is Content Relation Object an Item of content model relation?
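For the "Fixed vs. floating references" item above, one possible reading is that a reference with a version suffix (such as :7) is fixed and one without is floating. The Python sketch below only illustrates that reading; the regular expression, the function names and the interpretation itself are assumptions and not an agreed rule.

```python
import re

# Assumed shape: info:fedora/escidoc:<object>[:<version>]
REFERENCE = re.compile(r"^(info:fedora/escidoc:[^:]+)(?::(\d+))?$")


def parse_reference(uri):
    """Return (object-uri, version); version is None for a floating reference."""
    match = REFERENCE.match(uri)
    if match is None:
        raise ValueError(f"unrecognized reference: {uri!r}")
    object_uri, version = match.groups()
    return object_uri, int(version) if version is not None else None


def is_fixed(uri):
    """A reference is fixed when it names an explicit version."""
    return parse_reference(uri)[1] is not None


fixed = parse_reference("info:fedora/escidoc:persistent11:7")
floating = parse_reference("info:fedora/escidoc:persistent1")
```

With this reading the three example triples above differ only in which end of the relation is pinned to a version: the subject, the object, or both.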