Metadata Infrastructures Seminar Preparation

MPG EScience Seminar on Metadata Infrastructures, October 14-15, 2008: Preparation page

Agenda
Tuesday, October 14th, 2008

11.00 - 11.45

Introduction. All participants present themselves

11.45 - 13.00

MPI presentations:

Frank Toussaint (MPI Meteorology, Hamburg; World Data Center for Climate): The CERA-2 meta database and needs for a common information model ([[media:ESci08_Sem_3_CERA-2_Toussaint.pdf|Slides: 3.9 MB]])

Sylvia Kortüm (MPI for Intellectual Property, Competition and Tax Law, Munich): Law-related MPIs: CMS project ([[media:ESci08_Sem 3_Kortuem_CMS_Project.pdf|Slides: 3.5 MB]])

Wolfgang Voges (MPI for Extraterrestrial Physics, Garching; MPDL): Metadata in the case of an astronomical RoR [Registry of Registries]

Peter Wittenburg, Wolfgang Voges: MPG Registry of Registries ([[media:ESci08_Sem_3_RoR_Wittenburg.pdf|Slides: 0.7 MB]])

Peter Wittenburg, Daan Broeder (MPI for Psycholinguistics, Nijmegen): ISOCat. Component model ([[media:ESci08_Sem_3_ComponentModel_Broeder.pdf|Slides: 0.2 MB]])

(13.00 - 14.00: Lunch)

14.00 - 15.30

Brian Matthews (Rutherford Appleton Laboratory, UK): Practical experiences from e-science applications and related metadata solutions ([[media:ESci08_Sem_3_Metadata_for_Information_Management_in_Large-Scale_Science_Matthews.pdf|Slides: 18.8 MB]])

(15.30 - 16.00: Coffee)

16.00 - 17.30

Pete Johnston (Eduserv Foundation, UK): Metadata standard and best practice developments: Dublin Core Abstract Model, SWAP, OAI-ORE etc. ([[media:ESci08_Sem_3_ORE_and_SWAP_Composition_and_Complexity_Johnston.pdf|Slides: 6.9 MB]])

17.40 - 18.15

Tom Baker (DCMI): Metadata engineering methodology ([[media:ESci08_Sem_3_EngineeringMethodology_Baker.pdf|Slides: 5.3 MB]])

Wednesday, October 15th, 2008

9.00 - 10.30

Jacco van Ossenbruggen (Centrum voor Wiskunde en Informatica and Vrije Universiteit Amsterdam, Netherlands): Semantic interoperability of data values, use and matching of ontologies and unstructured vocabularies ([[media:ESci08_Sem_3_SemanticInteroperabilityOfDataValues_van_Ossenbruggen.pdf|Slides: 15.9 MB]] or from CWI: slides)

(10.30 - 11.00: Coffee)

11.00 - 12.30

Breakout groups (with an introduction to the tasks and a presentation of an MPG-wide metadata registry project).

Topics: Enumeration of metadata-related problems; cooperation options, MPG-wide projects, potential support needed; necessity of common policies and standards

(12.30 - 13.30: Lunch)

13.30 - 14.10

Reports from breakout groups

14.10 - 14.40

Martin Stricker (Helmholtz-Zentrum für Kulturtechnik der Humboldt-Universität zu Berlin): Developing an ontology for academic disciplines ([[media:ESci08_Sem_3_Ontology_for_Academic_Disciplines_Stricker.pdf|Slides: 1.5 MB]])

(14.40 - 15.10: Coffee)

15.10 - 15.45

Tom Baker (DCMI): Recent developments regarding web-enabled vocabularies: SKOS, tagging, microformats etc. ([[media:ESci08_Sem_3_VocabularyTrends_Baker.pdf|Slides: 4.4 MB]])

15.45 - 16.00

Conclusion

''All time slots cover both the presentation and questions/discussion. Exact times may be adapted; the sequence of events, however, will be kept.''

How to contribute to this page
This is the preparation page for the metadata seminar. Please add your name and the date to your comments, ideas and questions. Do not remove somebody else's text. Add and edit directly on this page; do not use a separate discussion page. Help pages for MediaWiki editing (the same software as used by Wikipedia) are available from the main page of CoLab. This page will be updated until the event has taken place.

Thanks for your participation. Traugott Koch (traugott.koch@mpdl.mpg.de) and Peter Wittenburg (MPI Nijmegen)

Short description of the event
Responsible for contents: Traugott Koch (MPDL) and Peter Wittenburg (MPI Nijmegen)

Background: All Max Planck Institutes have to cope with the management of an increasing amount of data and its storage for at least 10 years. Metadata descriptions are essential to the solution of the management problem. Metadata can also be used to support resource discovery, to perform scientific data-mining and to generate virtual collections.

Goal: The seminar will present and discuss the role of metadata in the context of management, use and reuse of scientific data (e-Science). Presenters from ongoing large international e-Science projects will describe issues, experiences, problems and solutions. International and German experts will talk about standards development efforts regarding necessary infrastructure components and demonstrate feasible methodologies, e.g. metadata application profiles, linking between datasets and publications, and treatment of aggregations of web resources. Presentations from MPIs are intended to document metadata-related needs and experiences and to further the discussion on cooperation and a strategy for future work in the MPG.

Place: Harnack House, Berlin

Date: October 14-15, 2008

Web page of the whole seminar series Registration

Focus
This seminar will have the following focus: 1) It deals with metadata in e-science applications only, i.e. metadata related to scientific datasets and their aggregations. 2) It does not focus on metadata needs for internal purposes, but on metadata enabling use and reuse of the data outside limited projects, i.e. on metadata for wider discovery services, aggregation and combined services. 3) A main goal is to start discussing possible cooperation and development efforts and possibly a strategy for future work in the MPG. Metadata requirements for data (and metadata) preservation are on the agenda of the seminar in June and will not be dealt with in any detail during this October event.

Topics
Initially, the organizers have been thinking of the following groups of topics and issues regarding the content of the seminar. At this stage, they are not meant to describe the structure of the event.

1 Metadata in e-Science applications  [external talk]

1.1 Purpose:
 * what is metadata, which kind do we talk about?
 * for which purposes is metadata used in science? what are its functions?
 * description; interoperability; discovery and (cross-) search
 * what granularity of the metadata is needed?

1.2 Experiences and Solutions in ongoing e-science projects

1.3 Tools:  [minor topic]
 * how to represent metadata (XML, RDF, ...)?
 * different views for different joint domains?
 * are there automatic support methods?

2 MPG needs and experiences [based on a call: several invited talks from MPIs]
 * do MPIs need help and what kind of help is needed?
 * who has (long) practical experience in the MPG?
 * what efforts and approaches exist in MPIs? Are there international activities in certain scientific domains?
 * what standards are used/preferred?

3 Accomplishing improved interoperability

3.1 Goal:  [discussion]
 * what is the level of standardization we should achieve?

3.2 Common metadata profile:  [external talk]
 * approaches to bibliographic metadata (e.g. Scholarly Works Application Profile SWAP)
 * how important are domain-dependent and domain-independent metadata elements? (domain view vs global view)
 * do we just need to offer metadata elements or also restrict to schemas?

3.3 Necessary infrastructure: [talk]
 * what are the essential components of an infrastructure? Can we use the metadata life-cycle as guidance?
 * metadata schema/vocabulary registry
 * flexible metadata registry
 * portals
 * creating, adapting and storing the metadata incl. workflow mechanisms
 * description and exchange of aggregations of web resources (structure, relationships, semantics of compound information objects; OAI-ORE)
 * searching, browsing and GIS integration
 * bridges to all sorts of service providers (OAI-PMH, ...)
 * how to achieve a joint domain fostering interdisciplinarity? A single joint domain is probably not very useful. Maybe rather consider various (very loosely interrelated) domains of related disciplines. Look for ways to cluster possible partners.

3.4 Semantic interoperability of the data values: [external invited talk]
 * which thesauri, classifications, ontologies etc. are used?
 * how to improve semantic interoperability? (Semantic mappings, structural conventions etc?)

4 Developments:  [maybe not a separate topic; invited talk possible]
 * what are the major trends in the library world?

5 MPG strategy, further work and cooperation  [discussion]

Breakout groups

We would like to stimulate participation during the seminar and to get more detailed results by reserving some time for breakout groups with a specified task. Comments and ideas regarding such a feature are especially welcome.

Speakers
Four speakers so far have accepted the invitation to present at and participate in our seminar:

1) Brian Matthews from Rutherford Appleton Laboratory, Harwell Science and Innovation Campus, Didcot, UK. Brian has been involved in exploring and developing tools to support scientific infrastructure, including metadata for data management and digital libraries. He has been involved in a number of JISC and European projects such as the JISC Claddier projects. He works in the Science and Technology Facilities Council (STFC) e-Science Centre, leading the Information Management Group.

Brian will be the main speaker in topic 1 Metadata in e-Science applications, being in a position to know the leading e-Science projects in the UK and abroad and having detailed experience of several advanced project implementations in the UK and their metadata solutions. He will concentrate on the motivations and design of scientific metadata as developed and deployed on scientific facilities within STFC.

2) Pete Johnston is a Technical Researcher at Eduserv Foundation in Bath, UK. He is a member of the OAI-ORE (Object Reuse and Exchange, an OAI initiative) Technical Committee and a co-developer of SWAP (Scholarly Works Application Profile), Collection Description Metadata Solutions, the Dublin Core Abstract Model and related specifications. He is one of the leading metadata experts on the technical, encoding and standards side. His blog, written together with Andy Powell, provides many details and insights relevant to our seminar theme.

Pete will present relevant developments in these contexts from an e-science application perspective, as a main speaker on topic 3.2 and parts of 3.3.

3) Thomas Baker, who has a broad overview of the fields of metadata and Semantic Web applications, will present a metadata engineering methodology and provide an overview of recent developments regarding web-enabled vocabularies. He is Director of Specifications and Documentation of the Dublin Core Metadata Initiative, Chair of KIM-AG (Kompetenzzentrum Interoperable Metadaten) and a Co-chair of the W3C Semantic Web Deployment Working Group.

4) Frank van Harmelen is one of the most outstanding Semantic Web experts (cf. his book 'A Semantic Web Primer'). As a professor at the Vrije Universiteit Amsterdam, he leads the Knowledge Representation and Reasoning Group at the Artificial Intelligence Department.

As a co-designer of the Web Ontology Language OWL, he will cover our topic 3.4 Semantic interoperability of the data values, drawing on his early experience with expert systems and his recent research on approximate matching of ontologies and unstructured vocabularies using background knowledge, featuring alternative forms of reasoning under 'suboptimal circumstances'.

4) Jacco van Ossenbruggen is a senior researcher with the Semantic Media Interfaces group at the Centrum voor Wiskunde en Informatica (CWI), and affiliated as an assistant professor with the Intelligent systems and services research group at Vrije Universiteit Amsterdam. His research interests include semantic web interfaces (/facet browser), multimedia on the Semantic Web (SMIL), and the automatic generation of user-tailored hypermedia presentations. Jacco is currently active in the MultimediaN E-Culture project and the K-Space European Network of Excellence.

MPI presentations
Initial contacts regarding presentations (cf. topic 2 above) have been made, and discussions are ongoing, with the following institutions and people:

1) MPI Meteorology, Hamburg; World Data Center for Climate. Frank Toussaint. The CERA-2 Meta Database and Needs for a Common Information Model. CERA specifications, data model and output; metadata mapping. Participating in the international BADC effort.

2) MPI for Human Cognitive and Brain Sciences, Leipzig and Hermann-von-Helmholtz-Zentrum für Kulturtechnik der Humboldt-Universität zu Berlin. Martin Stricker and Marion Schmidt. Project: "Developing an Ontology for Academic Disciplines".

3) MPI for Psycholinguistics, Nijmegen. Peter Wittenburg: ISOCat. Daan Broeder: Component model.

4) Law-related MPIs: CMS project. Sylvia Kortüm, MPI for Intellectual Property, Competition and Tax Law, Munich.

5) MPI for Extraterrestrial Physics, Garching. Wolfgang Voges, MPDL/MPE. Metadata in the case of an astronomical RoR (Registry of Registries).

Probably to be withdrawn: MPI Molecular Cell Biology and Genetics, Dresden. Jeffrey Oegema. Potentially also representing the MPG Biology and Medicine Section's project "MPG-wide archiving system for scientific data". Potential presentation of ontology-related problems.

Your comments, ideas, proposals
(please add here)

Recommendation by Pete Johnston 14 Feb 2008 to include a topic/presentation about "Open Data Linking and Publishing" as part of the "Web of Data" and "Linking Open Data" initiatives of W3C. Potential presenters could be recruited from FU Berlin or Leipzig University. Traugott 14:05, 11 March 2008 (CET)

I'd second this recommendation. Someone whom I recently got in contact with: http://www.informatik.uni-leipzig.de/~auer/ Robert 10:43, 22 April 2008 (CEST)

Would like to propose metadata validation under topic 3.3 ... does reuse of metadata standards also mean metadata quality? --Natasa 00:13, 27 June 2008 (UTC)