Digitization Lifecycle Mapping the Landscape of eResearch


Mapping the Landscape of eResearch

Text - Image - Annotation
Harnack Haus, Berlin, Germany, February 22-23, 2012

The workshop at a glance

To deepen insight into current eResearch applications and to stimulate broader discussion, several Max Planck Institutes within the Humanities and Social Sciences Section launched this workshop, organized by the Max Planck Digital Library.

The workshop aims to bring researchers and scholars, IT professionals, "cybrarians" and project staff into contact with each other. Selected professionals will present their projects and findings in the context of text, image and annotation to an expert audience.

It is an explicit goal of the workshop that participants think outside the box and beyond their institutional and technical context. Contributors and participants are asked to report in depth on their topics, solutions and tools, and especially on the problems they encountered and how they addressed them. In addition, the workshop provides an open space for discussion and a platform for networking. Communication, exchange and any resulting collaboration are intended and welcome.

Mapping the Landscape of eResearch addresses core fields, key issues and problems of digitization projects and Virtual Research Environments that recur in very different research contexts when such projects are planned and realized:

  • Status and future development of the Text Encoding Initiative (TEI)
  • Linguistic tools and processes, linguistic computing
  • Images: Viewing specifications, image administration and presentation tools, visualization
  • Referencing: reference data and annotations, layer solutions, markup tools


This workshop was planned and realized on behalf of the MPG project Digitization Lifecycle. For more information about Digitization Lifecycle please visit our project website (text in German) or download the project description in English.


22.02.2012 Topic




Introducing Digitization Lifecycle (DLC)

  • Malte Dreyer (Max Planck Digital Library): Technical Implications
  • Jan Simane (Kunsthistorisches Institut in Florenz, Max-Planck-Institut): Scientific Implications


Introduction: Text in DLC (Moderation: Klaus E. Werner, Bibliotheca Hertziana, Max-Planck-Institut für Kunstgeschichte)


  • Sebastian Rahtz (Oxford University Computing Services): The TEI: private and public concerns.


Coffee Break


Introduction: Images in DLC (Moderation: Anette Creutzburg, Kunsthistorisches Institut in Florenz, Max-Planck-Institut)


  • Ute Dercks (Photo Library of the Kunsthistorisches Institut in Florenz, Max-Planck-Institut): Cutting-edge technology meets the Middle Ages. CENOBIUM - A Project for the Multimedia Representation of Romanesque Cloister Capitals in the Mediterranean Region

  • Martin Warnke (Leuphana University of Lüneburg, Institute for Culture and Aesthetics of Digital Media): Observations on



Conference Dinner (*)

(*) Please pay individually

23.02.2012 Topic


Introduction: Synthesis First Day, Text in DLC (Moderation: Ingo Caesar, Max Planck Institute for European Legal History)


  • Georg Vogeler (University of Graz): Lessons from Monasterium.net: More Efficient Cooperation between Science and Cultural Heritage Institutions through Online Collaboration

  • Christian Thomas (Berlin-Brandenburg Academy of Sciences and Humanities): DTAE: Enlarging the Reference Corpus of the Deutsches Textarchiv (DTA) - Production, Conversion and Interchange of XML/TEI Encoded Full Text


Coffee Break


Introduction: Annotations in DLC (Moderation: Malte Dreyer, Max Planck Digital Library)


  • Carsten Blüm (Goethe University Frankfurt): Sandrart.net: An Enriched Online Edition of a 17th Century Text
  • Erhard Hinrichs, Kathrin Beck (Eberhard Karls University Tübingen): Web-Based Linguistic Annotation: Current Practise and Future Directions




Introduction: Annotations in DLC, Part 2 (Moderation: Andrea Kulas, Max Planck Digital Library)


  • Rainer Simon (Austrian Institute of Technology): Collaborative Media Annotation with YUMA
  • Georg Schelbert (Humboldt-University Berlin): The Topography of Knowledge. On Georeferencing of Cultural History Data


  • Malte Dreyer (Max Planck Digital Library): Final Remarks and Farewell


  • Beck, Kathrin; Hinrichs, Erhard (Eberhard Karls University Tübingen): Web-Based Linguistic Annotation: Current Practise and Future Directions

In this talk, we will discuss the potential and the challenges of web-based linguistic annotation in an eHumanities context. We will introduce the virtual research environment WebLicht as a case study to illustrate the general issues that arise in web-based annotation. WebLicht is available as part of the ESFRI infrastructure project CLARIN, whose mission is to establish an integrated and interoperable research infrastructure of language resources and technology. CLARIN aims to overcome the current fragmentation by offering a stable, persistent, accessible and extendable infrastructure.


  • Blüm, Carsten (Goethe University Frankfurt): Sandrart.net: An enriched online edition of a 17th century text

Sandrart.net is a cooperation project between the Goethe-Universität Frankfurt am Main and the Kunsthistorisches Institut in Florence (Max-Planck-Institut), funded by the Deutsche Forschungsgemeinschaft. The initial goal was a web-based edition of Joachim von Sandrart’s “Teutsche Academie der Bau-, Bild- und Mahlerey-Künste” (1675/1679/1680). It has since evolved into a website where the original content is augmented with metadata, images, translations, annotations and accompanying tools. Technically, the project is a hybrid application backed by TEI and a database, created largely using web-based tools.


  • Dercks, Ute (Photo Library of the Kunsthistorisches Institut in Florenz, Max-Planck-Institut): Cutting-edge technology meets the Middle Ages. CENOBIUM - A Project for the Multimedia Representation of Romanesque Cloister Capitals in the Mediterranean Region

The use of new technologies in the documentation and study of Cultural Heritage sites has been an important issue since the 19th century. The invention and diffusion of new means to acquire and visualize information have brought revolutionary changes in the way objects and monuments are analyzed in art history and other contexts. The CENOBIUM project combines new techniques of visual representation with web technology in pursuit of new insights into the artifacts, focusing on transcultural contacts in the architectural decoration of the twelfth and thirteenth centuries.


  • Dreyer, Malte (Max Planck Digital Library): Workshop Introduction - Technical Implications


  • Rahtz, Sebastian (Oxford University Computing Services): The TEI: private and public concerns

The Text Encoding Initiative was designed from the start as a dynamic model which could provide both a firmly-anchored model for well-understood structural components and analyses of digital texts, and a framework in which scholars could freely record in an open-ended and non-prescriptive way. Underlying this was an assumption that the results would be interoperable, but only relatively recently has this been tested in large-scale practice. Tensions have now started to emerge between those who want the TEI to be entirely prescriptive, or to have more mandatory components, and those who argue that it is a purely descriptive decoration whose appearance of general machine interoperability was never a real possibility.

In this talk we will look at some of the components of the TEI which cause tensions (loose and multi-choice content models, shortcuts, open-ended attribute values, etc.), and at some of the ways the TEI community can consider safely exposing texts to interchange (data extraction to RDF, mapping equivalences to simplify markup, manifesting constraints in ODD, etc.).


  • Warnke, Martin (Leuphana University of Lüneburg, Institute for Culture and Aesthetics of Digital Media): On the Structural Richness of Art Historical Discourse – Observations on Images


  • Schelbert, Georg (Humboldt-University Berlin): The Topography of Knowledge. On Georeferencing of Cultural History Data

In geography and its fields of application, GIS systems have been indispensable for many years. With their increasing use in archaeology and in the domain of monument preservation, such systems have reached the territory of historical studies as well. As documentation theory increasingly tends towards network models (linked data, semantic web), the concept of place, as a language-independent entity and reference, is gaining importance in general, too. However, the GIS systems used so far are often equipped only with a relatively simple database that cannot handle complex metadata models. In addition, information technology expertise and standards are not yet widespread in the cultural history disciplines, so there is still considerable development work to do, as I would like to show with the help of a few practical examples from the domain of Art History and the History of Architecture.

  • Simane, Jan (Kunsthistorisches Institut in Florenz, Max-Planck-Institut): Workshop Introduction - Scientific Implications


  • Simon, Rainer (Austrian Institute of Technology): Collaborative Media Annotation with YUMA

The practice of annotation has traditionally played a crucial role in scholarly research. On the one hand, annotations enable scholars to share and exchange knowledge, and to work collaboratively on the interpretation and analysis of source material. On the other hand, annotations are a valuable addition to traditional metadata, which is essential for organising and cataloguing, as well as for searching and retrieving objects within collections. The YUMA Universal Media Annotator (YUMA) is an end-user annotation toolkit for different types of digital media content. With YUMA, users create 'Post-It'-style free-text annotations, as well as Semantic Tags to add structured context information. YUMA was developed as a prototype within the EU-funded EuropeanaConnect project, and is currently in transition to an Open Source community project, located at http://yuma-js.github.com.


  • Thomas, Christian (Berlin-Brandenburg Academy of Sciences and Humanities): DTAE: Enlarging the Reference Corpus of the Deutsches Textarchiv (DTA) - Production, Conversion and Interchange of XML/TEI Encoded Full Text

In the course of the project (which runs until 2014), the DTA aims to publish around 1,300 volumes on its own. To extend this ›core collection‹ further, the software module DTAE (“E” stands for Enlargement or Extension) was developed. With the help of DTAE, external projects can integrate their historical text collections into the DTA reference corpus. They can present their data in a larger context and benefit from the elaborate linguistic search engine and text processing routines of the DTA. In addition, external contributors can integrate or re-import the processed text and metadata into their own web site via <iframe>. DTAE provides routines for uploading metadata, text and images, as well as semiautomatic conversion tools from different source formats (plain text, MS Word, TUSTEP, HTML, TEI-XML, …) into the XML/TEI conformant ›base format‹ of the DTA. DTAE thus demonstrates how interchange and interoperability among projects can work on a large scale. The presentation illustrates this approach with examples of text interchange and text production partnerships between the DTA and its external partners, i.e. the MPI, the HAB Wolfenbüttel and the Göttingen Academy of Sciences and Humanities. Possibilities and challenges of the exchange of XML/TEI documents will be discussed.


  • Vogeler, Georg (University of Graz): Lessons from Monasterium.net: More Efficient Cooperation between Science and Cultural Heritage Institutions through Online Collaboration

Crowdsourcing and online collaboration are “hype” words in the current public discussion. Monasterium.net is the largest charter database in Europe. From the very beginning, it has tried to implement an environment that supports online cooperation between archives and their users. The talk reports on the experience gained on the way to the current state of the project. It presents the concepts of the core application in this approach, the Monasterium Collaborative Archive (MOM-CA). Finally, it will discuss why important obstacles to effective cooperation between cultural heritage institutions and scholars cannot be solved technically.