Metadata Infrastructures Seminar Preparation


MPG (Max-Planck-Gesellschaft) eScience Seminar on Metadata Infrastructures, October 14-15, 2008: Preparation page


Tuesday, October 14th, 2008

11.00 - 11.45
Introduction. All participants present themselves

11.45 - 13.00
MPI (Max-Planck-Institut) presentations:
Frank Toussaint (MPI for Meteorology, Hamburg; World Data Center for Climate): The CERA-2 meta database and needs for a common information model (Slides: 3.9 MB)

Sylvia Kortüm (MPI for Intellectual Property, Competition and Tax Law, Munich): Law-related MPIs: CMS (Content Management System) project (Slides: 3.5 MB)

Wolfgang Voges (MPI for Extraterrestrial Physics, Garching; MPDL, Max Planck Digital Library): Metadata in the case of an astronomical RoR [Registry of Registries]

Peter Wittenburg, Wolfgang Voges: MPG Registry of Registries (Slides: 0.7 MB)

Peter Wittenburg, Daan Broeder (MPI for Psycholinguistics, Nijmegen): ISOCat. Component model (Slides: 0.2 MB)

(13.00 - 14.00)

14.00 - 15.30
Brian Matthews (Rutherford Appleton Laboratory, UK): Practical experiences from e-science applications and related metadata solutions (Slides: 18.8 MB)

(15.30 - 16.00)

16.00 - 17.30
Pete Johnston (Eduserv Foundation, UK): Metadata standard and best practice developments: Dublin Core Abstract Model, SWAP (Scholarly Works Application Profile), OAI-ORE (Open Archives Initiative Object Reuse and Exchange) etc. (Slides: 6.9 MB)

17.40 - 18.15
Tom Baker (DCMI, Dublin Core Metadata Initiative): Metadata engineering methodology (Slides: 5.3 MB)

Wednesday, October 15th, 2008

9.00 - 10.30
Jacco van Ossenbruggen (Centrum voor Wiskunde en Informatica and Vrije Universiteit Amsterdam, Netherlands): Semantic interoperability of data values, use and matching of ontologies and unstructured vocabularies (Slides: 15.9 MB or from CWI: slides)

(10.30 - 11.00)

11.00 - 12.30
Breakout groups (with an introduction to the tasks and a presentation of an MPG-wide metadata registry project).
Topics: Enumeration of metadata-related problems; cooperation options, MPG-wide projects, potential support needed; necessity of common policies and standards

(12.30 - 13.30)

13.30 - 14.10
Reports from breakout groups

14.10 - 14.40
Martin Stricker (Helmholtz-Zentrum fuer Kulturtechnik der Humboldt-Universitaet zu Berlin): Developing an ontology for academic disciplines (Slides: 1.5 MB)

(14.40 - 15.10)

15.10 - 15.45
Tom Baker (DCMI): Recent developments regarding web-enabled vocabularies: SKOS (Simple Knowledge Organization System), tagging, microformats etc. (Slides: 4.4 MB)

15.45 - 16.00

All time slots cover both the presentation and questions/discussion. Exact times may be adapted; the sequence of events, however, will be kept.

How to contribute to this page

This is the preparation page for the metadata seminar. Please add your name and the date to your comments, ideas and questions. Do not remove anybody else's text. Add to and edit this page directly; do not use a separate discussion page. Help pages for MediaWiki editing (the same software as used in Wikipedia) are available from the main page of CoLab. This page will be updated until the event has taken place.

Thanks for your participation.
Traugott Koch (MPDL) and Peter Wittenburg (MPI Nijmegen)

Short description of the event

Responsible for contents: Traugott Koch (MPDL, Max Planck Digital Library) and Peter Wittenburg (MPI Nijmegen)

Background: All Max Planck Institutes have to cope with the management of an increasing amount of data and its storage for at least 10 years. Metadata descriptions are essential to the solution of the management problem. Metadata can also be used to support resource discovery, to perform scientific data-mining and to generate virtual collections.

Goal: The seminar will present and discuss the role of metadata in the context of management, use and reuse of scientific data (e-Science). Presenters from ongoing large international e-Science projects will describe issues, experiences, problems and solutions. International and German experts will talk about standards development efforts regarding necessary infrastructure components and demonstrate feasible methodologies, i.e. metadata application profiles, linking between datasets and publications, and treatment of aggregations of web resources. Presentations from MPIs are intended to document metadata-related needs and experiences and to further the discussion on cooperation and a strategy for future work in the MPG.

Place: Harnack House, Berlin
Date: October 14-15, 2008

Web page of the whole seminar series


This seminar will have the following focus:
1) It deals with metadata in e-science applications only, i.e. metadata related to scientific datasets and their aggregations.
2) It does not focus on metadata needs for internal purposes, but on metadata enabling use and reuse of the data outside limited projects, i.e. on metadata for wider discovery services, aggregation and combined services.
3) A main goal is to start discussing possible cooperation and development efforts and possibly a strategy for future work in the MPG.
Metadata requirements for data (and metadata) preservation are on the agenda of the seminar in June and will not be dealt with in any detail during this October event.


Initially, the organizers have been thinking of the following groups of topics and issues regarding the content of the seminar. At this stage, the list is not meant to describe the structure of the event.

1 Metadata in e-Science applications [external talk]

1.1 Purpose:

  • what is metadata, which kind do we talk about?
  • for which purposes is metadata used in science? What are its functions?
    • description; interoperability; discovery and (cross-) search
  • what granularity of the metadata is needed?

1.2 Experiences and Solutions in ongoing e-science projects

1.3 Tools: [minor topic]

  • how to represent metadata (XML, RDF, ...)?
  • different views for different joint domains?
  • are there automatic support methods?
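
As one possible illustration of the representation question above, here is a minimal sketch of a dataset description serialized as XML with Dublin Core element names. The record values and the wrapper element name `metadata` are made up for illustration; only the Dublin Core element namespace is taken from the standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dataset_to_dc_xml(record: dict) -> str:
    """Serialize a flat {element name: value} record as Dublin Core XML."""
    # "metadata" is a hypothetical wrapper element, not part of Dublin Core.
    root = ET.Element("metadata")
    for element, value in record.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Invented example record.
example_record = {
    "title": "Example climate model run",
    "creator": "Example Institute",
    "date": "2008-10-14",
}
xml_text = dataset_to_dc_xml(example_record)
print(xml_text)
```

The same record could equally be expressed in RDF; the sketch only shows that a shared element vocabulary, not the syntax, carries the interoperability.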

2 MPG needs and experiences [based on a call: several invited talks from MPIs]

  • do MPIs need help and what kind of help is needed?
  • who has (long) practical experience in the MPG?
  • what efforts and approaches exist in MPIs? Are there international activities in certain scientific domains?
  • what standards are used/preferred?

3 Accomplishing improved interoperability

3.1 Goal: [discussion]

  • what is the level of standardization we should achieve?

3.2 Common metadata profile: [external talk]

  • approaches to bibliographic metadata (e.g. the Scholarly Works Application Profile, SWAP)
  • how important are domain dependent and domain independent metadata elements? (domain view vs global view)
  • do we just need to offer metadata elements or also restrict to schemas?

3.3 Necessary infrastructure: [talk]

  • what are the essential components of an infrastructure? Can we use the metadata life-cycle as guidance?
    • metadata schema/vocabulary registry
    • flexible metadata registry
    • portals
    • creating, adapting and storing the metadata incl. workflow mechanisms
    • description and exchange of aggregations of web resources (structure, relationships, semantics of compound information objects; OAI-ORE)
    • searching, browsing and GIS integration
    • bridges to all sorts of service providers (OAI-PMH, ...)
  • how to achieve a joint domain fostering interdisciplinarity? A single joint domain is probably not very useful. Maybe rather consider various (very loosely interrelated) domains of related disciplines. Look for ways to cluster possible partners.
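
For the "bridges to service providers" component listed above, a minimal sketch of how a harvester might build an OAI-PMH `ListRecords` request. The verb and the `metadataPrefix`, `from` and `set` arguments are defined by the protocol; the base URL and the function name are hypothetical.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc",
                           from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for selective harvesting."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Only records created or changed since this date are returned.
        params["from"] = from_date
    if set_spec:
        # Restrict the harvest to one set defined by the repository.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, for illustration only.
url = build_list_records_url("https://repository.example.org/oai",
                             from_date="2008-01-01")
print(url)
```

A real harvester would additionally have to follow the resumption tokens that a repository returns for large result lists.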

3.4 Semantic interoperability of the data values: [external invited talk]

  • which thesauri, classifications, ontologies etc. are used?
  • how to improve semantic interoperability? (Semantic mappings, structural conventions etc?)
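
To make the idea of semantic mappings a little more concrete, a minimal sketch of a crosswalk from local keywords to terms of a shared vocabulary. All keywords and target terms are invented; the relation names follow the SKOS mapping properties `exactMatch` and `closeMatch`.

```python
# Hypothetical crosswalk: local keyword -> (shared vocabulary term, relation).
CROSSWALK = {
    "sea surface temp": ("sea_surface_temperature", "exactMatch"),
    "SST": ("sea_surface_temperature", "exactMatch"),
    "ocean temp": ("sea_water_temperature", "closeMatch"),
}


def map_keywords(keywords, crosswalk):
    """Split keywords into mapped (keyword, term, relation) and unmapped ones."""
    mapped, unmapped = [], []
    for keyword in keywords:
        if keyword in crosswalk:
            term, relation = crosswalk[keyword]
            mapped.append((keyword, term, relation))
        else:
            # Keep unmapped keywords so that a curator can review them.
            unmapped.append(keyword)
    return mapped, unmapped


mapped, unmapped = map_keywords(["SST", "salinity"], CROSSWALK)
print(mapped, unmapped)
```

Exact lookup like this only covers curated mappings; approximate matching of unstructured vocabularies needs more elaborate methods.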

4 Developments: [maybe not a separate topic; invited talk possible]

  • what are the major trends in the library world?

5 MPG strategy, further work and cooperation [discussion]

Breakout groups

We would like to stimulate participation during the seminar and to get more detailed results by reserving some time for breakout groups with a specified task. Comments and ideas regarding such a feature are especially welcome.


Four speakers so far have accepted the invitation to present at and participate in our seminar:

1) Brian Matthews from Rutherford Appleton Laboratory, Harwell Science and Innovation Campus, Didcot, UK. Brian has been involved in exploring and developing tools to support scientific infrastructure, including metadata for data management and digital libraries. He has been involved in a number of JISC (Joint Information Systems Committee, UK) and European projects, such as the JISC Claddier projects. He works in the Science and Technology Facilities Council (STFC) e-Science Centre, leading the Information Management Group.

Brian will be the main speaker on topic 1, Metadata in e-Science applications, being in a position to know the leading e-Science projects in the UK and abroad and having detailed experience of several advanced project implementations in the UK and their metadata solutions. He will concentrate on the motivations and design of scientific metadata as developed and deployed on scientific facilities within STFC.

2) Pete Johnston is a Technical Researcher at Eduserv Foundation in Bath, UK. He is a member of the OAI-ORE (Open Archives Initiative Object Reuse and Exchange) Technical Committee and a co-developer of SWAP (Scholarly Works Application Profile), Collection Description Metadata Solutions, the Dublin Core Abstract Model and related specifications. He is one of the leading metadata experts on the technical, encoding and standards side. His blog, written together with Andy Powell, provides many details and insights relevant to our seminar theme.

Pete will present relevant developments in these contexts from an e-science application perspective, as a main speaker on topic 3.2 and parts of 3.3.

3) Thomas Baker, who has a broad overview of the fields of metadata and Semantic Web applications, will present a metadata engineering methodology and provide an overview of recent developments regarding web-enabled vocabularies. He is Director Specifications and Documentation of the Dublin Core Metadata Initiative, Chair of KIM-AG (Kompetenzzentrum Interoperable Metadaten) and a Co-chair of the W3C Semantic Web Deployment Working Group.

4) Frank van Harmelen is one of the most outstanding Semantic Web experts (cf. his book 'Semantic Web Primer'). As a professor at the Vrije Universiteit Amsterdam, he leads the Knowledge Representation and Reasoning Group at the Artificial Intelligence Department.

As a co-designer of the Web Ontology Language OWL, he will cover our topic 3.4 Semantic interoperability of the data values, drawing on his early experience with expert systems and his recent research on approximate matching of ontologies and unstructured vocabularies using background knowledge, featuring alternative forms of reasoning under 'suboptimal circumstances'.

4) Jacco van Ossenbruggen is a senior researcher with the Semantic Media Interfaces group at the Centrum voor Wiskunde en Informatica (CWI), and affiliated as an assistant professor with the Intelligent systems and services research group at Vrije Universiteit Amsterdam. His research interests include semantic web interfaces (/facet browser), multimedia on the Semantic Web (SMIL), and the automatic generation of user-tailored hypermedia presentations. Jacco is currently active in the MultimediaN E-Culture project and the K-Space European Network of Excellence.

MPI presentations

Initial contacts regarding presentations (cf. topic 2 above) have been made, and discussions are ongoing, with the following institutions and people:

1) MPI for Meteorology, Hamburg; World Data Center for Climate. Frank Toussaint.
The CERA-2 Meta Database and Needs for a Common Information Model.
CERA specifications, data model and output; metadata mapping. Participating in the international BADC effort.

2) MPI for Human Cognitive and Brain Sciences, Leipzig and Hermann-von-Helmholtz-Zentrum für Kulturtechnik der Humboldt-Universität zu Berlin. Martin Stricker and Marion Schmidt.
Project: "Developing an Ontology for Academic Disciplines".

3) MPI for Psycholinguistics, Nijmegen.
Peter Wittenburg: ISOCat.
Daan Broeder: Component model.

4) Law-related MPIs: CMS (Content Management System) project.
Sylvia Kortüm, MPI for Intellectual Property, Competition and Tax Law, Munich.

5) MPI for Extraterrestrial Physics (MPE), Garching. Wolfgang Voges, MPDL/MPE.
Metadata in the case of an astronomical RoR (Registry of Registries).

Probably to be withdrawn: MPI for Molecular Cell Biology and Genetics, Dresden. Jeffrey Oegema.
Potentially representing the MPG Biology and Medicine sections project "MPG-wide archiving system for scientific data" as well.
Potential presentation of ontology-related problems.

Your comments, ideas, proposals

(please add here)

Recommendation by Pete Johnston, 14 Feb 2008, to include a topic/presentation about "Open Data Linking and Publishing" as part of the "Web of Data" and "Linking Open Data" initiatives of the W3C (World Wide Web Consortium). Potential presenters could be recruited from FU Berlin (Freie Universität Berlin) or Leipzig University. Traugott 14:05, 11 March 2008 (CET)

I'd second this recommendation. Someone whom I recently got in contact with: Robert 10:43, 22 April 2008 (CEST)

I would like to propose metadata validation under topic 3.3 ... does reuse of metadata standards also mean metadata quality? --Natasa 00:13, 27 June 2008 (UTC)