Trip Report: TextGrid Summit

Event: TextGrid Summit

Göttingen, 21-22 January 2009

  • Wednesday, 21 January: Symposium with international speakers, TextGrid demo, use cases, and discussion of prospects in the e-Humanities (conference language: English)
  • Thursday, 22 January: Developers' Workshop and Hands-On-Session

Participants MPDL: Heike, Melanie, Kristina

For details, see the agenda for the symposium, where all presentations are available.


As the TextGrid project now comes to its end after 5 years of work, the participants presented their results and talked about the future of TextGrid.
In addition, two keynotes were given:

  1. With and Beyond Text - Current Practices and Future Possibilities in e-Humanities
    Tobias Blanke (Arts and Humanities e-Science Support Centre, Centre for e-Research, King's College London)
  2. The Crisis in Scholarly Publishing, the DHO, and the Idea of an Irish Digital Scholarly Imprint
    Susan Schreibman (Digital Humanities Observatory, Dublin)

TextGrid: Introduction and Overview

  • Presentation of the TextGrid beta version
  • TextGrid is the first German grid computing project for the eHumanities
  • Goal of the project:
    • to develop a virtual research infrastructure (based on distributed resources)
    • to develop generic grid services, which can also be used by other grid projects
  • The basis is a service-oriented architecture (SOA) with user interfaces, using standards wherever possible.
  • Live demonstration of the TextGrid Lab:
    Different tools are available:
  1. Workflow Editor
  2. XML Editor
    • for the creation of transcriptions based on different schemes (e.g. TEI)
    • links the transcription (text) with the corresponding scan (image) and displays both representations next to each other
  3. Text-Image-Link Editor
    • allows a selected part of a scan to be linked with the corresponding XML code
  4. Project and User Management
  5. Collation Tool
    • front end for the end user
    • next to the scans, the transcriptions will be displayed in HTML (with all connections displayed as links)
  • (Newly created) data will be stored in the grid, but can also be kept locally
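
The text-image linking described above can be pictured with a small sketch. It is not TextGrid's actual data model or API: the `ImageRegion` class, the `link_region` helper, the file name and the sample transcription are all hypothetical, and the sketch only assumes that each transcribed element carries an `xml:id` that a region of the scan can point to.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

# Namespace that the predeclared "xml:" prefix expands to.
XML_NS = "http://www.w3.org/XML/1998/namespace"


@dataclass(frozen=True)
class ImageRegion:
    """Rectangular part of a scan, in pixel coordinates (hypothetical model)."""
    image: str
    x: int
    y: int
    width: int
    height: int


def link_region(root, element_id, region):
    """Pair a scan region with the XML element that carries the given xml:id."""
    for element in root.iter():
        if element.get(f"{{{XML_NS}}}id") == element_id:
            return region, element
    raise KeyError(element_id)


# Hypothetical, heavily simplified transcription.
TRANSCRIPTION = """<text><body>
<l xml:id="line1">First line of the page</l>
<l xml:id="line2">Second line of the page</l>
</body></text>"""

root = ET.fromstring(TRANSCRIPTION)
region, element = link_region(
    root, "line2", ImageRegion("page001.tif", 120, 340, 900, 48)
)
print(region.image, element.tag, element.text)
```

A viewer could use such a pair to highlight the region on the scan when the corresponding line is selected in the transcription, and vice versa.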

TextGrid: TextGrid Tools and Services

  • Available services:
    1. Dictionaries
      • several dictionaries are integrated and searchable
    2. Lemmatizer
      • words can be linked directly with the corresponding entry in an integrated dictionary
    3. Search within the grid for metadata and full texts
    4. Tokenizer
    5. Streaming editor
  • TextGrid is a D-Grid project (D-Grid is building a large grid infrastructure for data in Germany)
  • The aim of TextGrid is to build a community that puts its data into TextGrid, works with the provided tools, and develops new tools. The project does not offer an out-of-the-box solution.
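
How a tokenizer, a lemmatizer and a dictionary can work together, as in the services listed above, is sketched below. This is a toy illustration, not the TextGrid services themselves: the word lists, the function names and the lookup-table approach to lemmatization are assumptions made for the example.

```python
import re

# Hypothetical toy data; a real service would draw on full dictionaries.
LEMMAS = {"houses": "house", "went": "go", "mice": "mouse"}
DICTIONARY = {
    "house": "a building for people to live in",
    "go": "to move from one place to another",
    "mouse": "a small rodent",
}


def tokenize(text):
    """Split a text into lower-cased word tokens."""
    return re.findall(r"\w+", text.lower())


def lemmatize(token):
    """Map a word form to its base form, falling back to the token itself."""
    return LEMMAS.get(token, token)


def lookup(text):
    """Return (token, lemma, dictionary entry or None) for every token."""
    result = []
    for token in tokenize(text):
        lemma = lemmatize(token)
        result.append((token, lemma, DICTIONARY.get(lemma)))
    return result


for token, lemma, entry in lookup("The mice went home"):
    print(token, lemma, entry)
```

Tokens without a dictionary entry are reported with `None`, so a front end could link only those words for which an entry exists.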

Use Cases

1. Musicology

Daniel Röwenstrunk (University of Paderborn)

2. Psycholinguistics

Peter Wittenburg (Max Planck Institute for Psycholinguistics)
  • Presentation of two tools used by the institute:
  1. Lexica
    • uses LMF (Lexical Markup Framework, a standard for lexica)
    • local tool
  2. ELAN tool for annotated media (videos)
    • background: scientists shoot videos about rare cultures
    • web-based
  • Both tools are to be integrated into TextGrid


  • Next steps for TextGrid:
    • Test users are needed to polish the software, so a user community has to be built up
    • A business model has to be created (how can the infrastructure be financed in the long term?)
    • TextGrid will be integrated into eSciDoc, CLARIN and DARIAH
  • Applications for new projects are planned:
  1. Wiss-Grid (grid for scientists)
    Aim: to guarantee the sustainability of the grid infrastructure and to develop a business model
  2. TextGrid 2
    Aim: development of new tools, and teaching and qualification of scientists and scholars so that they can contribute their own ideas to the development of TextGrid
Funding decisions for both projects will be made in March 2009.


A short discussion addressed what exactly TextGrid is and should be. Different ideas were mentioned: TextGrid could be the tools, the know-how, or the defined standards. The TextGrid participants understand their project as the creation of an ecosystem (infrastructure) that depends on all of these together.
The long-term preservation of the data within TextGrid was also discussed. Currently, the data will be stored for at least 50 years as bitstream preservation (a service provided by the GWDG). This does not include any data migration.