Generic handling of metadata/Technology basics

From MPDLMediaWiki


Discussion Points

  • exact requirements (known data/unknown data), use cases
  • whether to define structure, screen configuration and vocabulary mapping in one file or in separate files
  • technology decision
  • restrictions on incoming data

Requirements in General

  • view
  • edit
  • search
  • store in repository
  • define metadata
    • define interoperable metadata (resources and statements), i.e. compatibility with RDF and DCAP
    • encoding schemes (controlled vocabulary, syntax restrictions)
    • mapping to standard vocabulary terms
  • screen configuration
    • label
    • positioning
    • occurrence
    • form element
  • validation of data against a schema/DSP
  • for additional requirements, see known data
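
The screen configuration items listed above (label, positioning, occurrence, form element) could be captured per metadata property in a small record. The following is only a sketch under assumptions: the class name `FieldConfig`, its field names, and the `ex:` property names are hypothetical and not part of any decided design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldConfig:
    """Screen configuration for one metadata property (hypothetical shape)."""
    property_uri: str          # metadata property the field edits
    label: str                 # text shown next to the form element
    position: int              # ordering on the screen, lower comes first
    min_occurs: int            # occurrence: minimum number of values
    max_occurs: Optional[int]  # occurrence: None means unbounded
    form_element: str          # e.g. "text", "textarea", "select"


def render_order(fields):
    """Return the field labels in the order they appear on the screen."""
    return [f.label for f in sorted(fields, key=lambda f: f.position)]


# Placeholder configuration for two properties.
fields = [
    FieldConfig("ex:affiliation", "Affiliation", 2, 0, None, "text"),
    FieldConfig("ex:name", "Name", 1, 1, 1, "text"),
]
print(render_order(fields))
```

Whether such records live in the same file as the metadata definition or in a separate one is exactly the open discussion point; the sketch works either way.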

Requirement Matrix

Requirements \ Data                                        Unknown data   Known data
view                                                       +              +
edit                                                       +              +
search                                                     +              +
Screen configuration                                       +              +
store in repository                                        +              +
define Metadata in general                                 ?              +
define interoperable metadata (resources and statements)   ?              +
define encoding schemes                                    ?              +
define Mapping to Standard Vocabulary Terms                ?              +
Validation of data against a schema                        -              +

Data to handle (Known Data/Unknown Data)

  • Definition (Known data):
    • XML format (open question: only well-formed and valid XML?)
    • schema or DSP is available
    • related concept: handling of semistructured data might be relevant [1] (open question)
  • Definition (Unknown data):
    • heterogeneous or unstructured data
    • any format
    • no schema or DSP available
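
The two definitions above can be read as a simple decision rule. The sketch below is one possible reading, not an agreed rule: the helper `classify` is hypothetical, it only tests well-formedness (validity against a schema is left out, since that point is still open), and whether a schema or DSP exists is passed in as a flag.

```python
import xml.etree.ElementTree as ET


def classify(payload: str, schema_available: bool) -> str:
    """Classify incoming data as 'known' or 'unknown' (hypothetical helper).

    Known data: well-formed XML for which a schema or DSP is available.
    Everything else is treated as unknown data.
    """
    try:
        ET.fromstring(payload)  # checks well-formedness only, not validity
    except ET.ParseError:
        return "unknown"
    return "known" if schema_available else "unknown"


print(classify("<record><name>Albert Einstein</name></record>", True))
print(classify("free text without markup", True))
```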

Considered Technologies


RDF Data Model

  • triples consist of resources (literal or non-literal) connected by statements/properties
  • each triple has the form subject predicate object, where subject and object are resources and the predicate is the statement/property
  • described structure is a directed graph
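
To make the triple model concrete, the following sketch stores triples as plain tuples and derives the directed graph from them. It is an illustration only: the `Triple` type, the `to_graph` helper and all `ex:` identifiers are assumptions made for this example, and no RDF library is involved.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # a non-literal resource
    predicate: str  # the property linking subject and object
    obj: str        # a resource: either a URI-like name or a literal


# Hypothetical example data; the ex: names are placeholders.
triples = [
    Triple("ex:einstein", "ex:name", '"Albert Einstein"'),
    Triple("ex:einstein", "ex:affiliation", "ex:mpdl"),
    Triple("ex:mpdl", "ex:name", '"MPDL"'),
]


def to_graph(triples):
    """Build adjacency lists: subject -> [(predicate, object), ...]."""
    graph = defaultdict(list)
    for t in triples:
        graph[t.subject].append((t.predicate, t.obj))
    return dict(graph)


graph = to_graph(triples)
for subject, edges in graph.items():
    for predicate, obj in edges:
        print(subject, predicate, obj)
```

Each tuple is one directed, labelled edge, so nested information such as the name of an affiliation is reached by following two edges.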


  • Data: variants of the RDF triple syntax, RDF/XML syntax
  • Schema: RDFS [2] (XML syntax)

RDF Sample 1: ex:name "Albert Einstein". 

RDF Sample 2: ex:affiliation ex:name "MPDL".


  • RDFS can be extended to Web Ontology Language (OWL)[3]
  • two of the OWL sublanguages:
    • OWL Lite: definition of classes/subclasses for resources and properties, property range and domain, only cardinality 0 and 1 allowed (occurrence)
    • OWL DL: OWL Lite + set operations (intersection, union, complement), unrestricted cardinality, enumeration classes, disjoint classes

XML, XML Schema, Relax NG

Data Model

  • described structure is an ordered tree (a graph is possible with references)
  • basic components: elements and attributes
  • difference from RDF and DSP: the basic components are purely syntactic constructs and carry no information about the meaning of the content
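
The ordered-tree view can be seen by walking a small document with Python's standard `xml.etree.ElementTree` module. This is a sketch for illustration; the record and its element and attribute names (`person`, `name`, `affiliation`, `id`, `ref`) are invented placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; element and attribute names are placeholders.
doc = ET.fromstring(
    '<person id="p1">'
    '<name>Albert Einstein</name>'
    '<affiliation ref="o1">MPDL</affiliation>'
    '</person>'
)


def walk(element, depth=0):
    """Yield (depth, tag, attributes) for every element in document order."""
    yield depth, element.tag, dict(element.attrib)
    for child in element:  # children are kept in document order
        yield from walk(child, depth + 1)


rows = list(walk(doc))
for depth, tag, attrs in rows:
    print("  " * depth + tag, attrs)
```

The parser reports only tags, attributes and nesting. That `ref="o1"` points at another resource is a convention the application has to know, which is the difference from RDF and DSP noted above.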


  • Data: XML
  • Schema: XML Schema, Relax NG, maybe also Schematron

Description Set Profile: DC-DS-XML, DSP

Data Model

  • basic components: resources (or classes), properties and constraints (Vocabulary Encoding Scheme, Syntax Encoding Scheme)


  • for Data (Instances): DC-DS-XML [4], DC-Text Syntax [5]
  • for Definition/Schema: DSP-XML [6], DSP-RDF [6], DSP-WIKI [7], DC-Text Syntax [5]

DSP-XML Sample 1:

<?xml version="1.0" ?>
<DescriptionSetTemplate xmlns="">
  <DescriptionTemplate ID="" minOccurs="1" maxOccurs="1" standalone="yes">
    <StatementTemplate minOccurs="1" maxOccurs="1" type="literal">
    </StatementTemplate>
  </DescriptionTemplate>
</DescriptionSetTemplate>
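
The `minOccurs`/`maxOccurs` attributes in the sample above express occurrence constraints. The sketch below shows how such constraints might be checked against the statements of one description. It rests on several assumptions: the template is simplified and uses no namespace, the `Property` child and the `ex:name` value are filled in for illustration, the defaults (0 and "infinity") are assumed, and `occurrence_errors` is a hypothetical helper, not a DSP validator.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical template: no namespace, one statement template.
TEMPLATE = """
<DescriptionSetTemplate>
  <DescriptionTemplate ID="person" minOccurs="1" maxOccurs="1" standalone="yes">
    <StatementTemplate minOccurs="1" maxOccurs="1" type="literal">
      <Property>ex:name</Property>
    </StatementTemplate>
  </DescriptionTemplate>
</DescriptionSetTemplate>
"""


def occurrence_errors(template_xml, statements):
    """Compare how often each property occurs with minOccurs/maxOccurs.

    `statements` is a list of (property, value) pairs for one description.
    """
    errors = []
    root = ET.fromstring(template_xml)
    for st in root.iter("StatementTemplate"):
        prop = st.findtext("Property")
        lo = int(st.get("minOccurs", "0"))          # assumed default
        hi_raw = st.get("maxOccurs", "infinity")    # assumed default
        hi = None if hi_raw == "infinity" else int(hi_raw)
        count = sum(1 for p, _ in statements if p == prop)
        if count < lo:
            errors.append(f"{prop}: {count} < minOccurs {lo}")
        if hi is not None and count > hi:
            errors.append(f"{prop}: {count} > maxOccurs {hi}")
    return errors


print(occurrence_errors(TEMPLATE, [("ex:name", "Albert Einstein")]))
print(occurrence_errors(TEMPLATE, []))
```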

DC-DS-XML Sample 1:

<?xml version="1.0" encoding="UTF-8" ?>
<dcds:descriptionSet xmlns:dcds="">
  <!-- Description Element -->
  <dcds:description dcds:resourceURI="http://example/einstein">
    <dcds:statement dcds:propertyURI="ex:name">
      <dcds:literalValueString>Albert Einstein</dcds:literalValueString>
    </dcds:statement>
  </dcds:description>
</dcds:descriptionSet>
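
A DC-DS-XML record of this shape maps directly onto resource/property/value statements. The sketch below extracts them with the standard library. Note the assumptions: the sample above leaves `xmlns:dcds` empty, so the namespace URI used here is an assumed value that should be checked against the DC-DS-XML specification [4], and `to_triples` is a hypothetical helper that handles literal statements only.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; the sample above leaves xmlns:dcds empty.
DCDS = "http://purl.org/dc/xmlns/2008/09/01/dc-ds-xml/"

RECORD = f"""
<dcds:descriptionSet xmlns:dcds="{DCDS}">
  <dcds:description dcds:resourceURI="http://example/einstein">
    <dcds:statement dcds:propertyURI="ex:name">
      <dcds:literalValueString>Albert Einstein</dcds:literalValueString>
    </dcds:statement>
  </dcds:description>
</dcds:descriptionSet>
"""


def qname(local):
    """Return the namespace-qualified name in ElementTree's {uri}local form."""
    return f"{{{DCDS}}}{local}"


def to_triples(xml_text):
    """Return (resource, property, literal) for each literal statement."""
    triples = []
    root = ET.fromstring(xml_text)
    for desc in root.iter(qname("description")):
        subject = desc.get(qname("resourceURI"))
        for stmt in desc.iter(qname("statement")):
            literal = stmt.findtext(qname("literalValueString"))
            if literal is not None:
                triples.append((subject, stmt.get(qname("propertyURI")), literal))
    return triples


print(to_triples(RECORD))
```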

Overview (pros and cons)

Requirements \ Technology                                         RDF/RDFS       XML/XSD/RelaxNG                  DSP
describe interoperable Metadata/compatibility with DCAP, RDF      implicit (+)   possible, but not implicit (-)   implicit (+)
Screen config                                                     --             --                               --
Technology/Tool Support                                           --             --                               --
Mapping to Standard Vocabulary Terms                              --             --                               --
Validation mechanism                                              --             --                               --