ViRR

Note: This page is still under construction!

ViRR (Virtueller Raum Reichsrecht) is a collaboration between the MPDL and the Max Planck Institute for European Legal History.


Introduction

The institute already has some experience with digital collections, e.g. through its Digital Library.

Vision

A lot of information about the law of the Holy Roman Empire exists, but it is scattered across several institutions and libraries. To bring these dispersed sources together, a common internet platform is desirable.

The first step is to make the existing digital scans of 15 books (more than 16,000 pages in total) available online with free browsing functionality. Later, the collection should be expanded with further images, publications and digitized works from other institutions and projects such as the DRW, the cooperation between the BSB and Google Books, information from library catalogs, databases of the institute and further documentation such as the Polizeyordnungen.

Furthermore, a comprehensive collection of links relevant to the "Reichsrecht" should be integrated, analogous to the institute's link collection covering the whole of European legal history.

Last but not least, a search across all integrated sources is desirable. This search should include an index and a list of synonyms for the different spellings of the titles.
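
How a synonym list for spelling variants could feed the search is sketched below in Python. This is only a minimal illustration: the variant groups, the sample titles and all function names are assumptions made up for the example and are not project data or an agreed design.

```python
from __future__ import annotations

import unicodedata

# Hypothetical synonym groups: each set lists spelling variants of one term.
SYNONYM_GROUPS = [
    {"policey", "polizey", "polizei"},
    {"ordnung", "ordenung"},
]


def normalize(text: str) -> str:
    """Lowercase the text and strip diacritics so that variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand(term: str) -> set[str]:
    """Return the normalized term together with all of its known variants."""
    key = normalize(term)
    variants = {key}
    for group in SYNONYM_GROUPS:
        if key in group:
            variants |= group
    return variants


def search_titles(query: str, titles: list[str]) -> list[str]:
    """Return the titles containing the query term or one of its variants."""
    variants = expand(query)
    hits = []
    for title in titles:
        words = set(normalize(title).replace(",", " ").split())
        if words & variants:
            hits.append(title)
    return hits
```

With such a list, a query for "Polizei" would also find a title spelled "Policey". A real index would additionally need tokenization rules for punctuation and compound words, which this sketch leaves out.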


Aim

The "Virtueller Raum Reichsrecht" should be a tool to work with. Possible working scenarios are:

  • Assignment of keywords and synonyms to the digitized work
  • Transcription and markup of the digitized work
  • Submission of documents
  • Annotations (with different visibilities) of the digitized work

1. Basic requirements for building digital collections / libraries:

1.1 Raw data production

  • XML editor for capturing structural data (physical and content description)
  • OCR for producing full texts (including Fraktur typefaces)
  • Conversion of the archive image files (TIFF) into formats suitable for the web (GIF, JPEG, PDF), including automated image processing

1.2 Web server

  • Images
  • Metadata: structural data (XML), bibliographic data from arbitrary library systems (MAB, MARC, system-specific formats)
  • Full text
  • Search engine: bibliographic data, structural data, full text
  • Display of the metadata in flexibly definable formats
  • Navigation tools (jumping to a specific page, paging, etc.)
  • Download function (static and dynamic)

1.3 Long-term archiving

  • Tools for producing data packages for long-term archiving (image files and metadata)

1.4 Expansion of digital collections / libraries into information platforms

  • Metadata: links between sources, secondary literature, bibliographies, image collections, etc.
  • Annotations
  • Integration of library catalogs
  • Integration of authority files and synonym files
  • Dynamic extensibility

For many of these points, the institute has found in-house solutions in past projects, but these cannot be generalized. It would like the MPDL to provide better tools that can be adapted to new projects of different MPIs with little effort.

Current Status

Currently, only the digitized works ("Digitalisate") of the relevant books are available here (alongside other digitized works from the institute's cooperation with the Heidelberger Akademie). All images are stored in four different sizes: fil < film < indiv < max. The original TIFFs are hosted by the GWDG. The names of the folders and files are derived from the institute's shelf marks ("Signaturen"), analogous to the list of digitized works.


Requirements

Functional Requirements

Visibility

Two different views of the data are required: one for registered users and one for unregistered users.

The digitized works are all publicly visible.

The privileges of registered users are still unclear.

Indexing

If available, the following information should be indexed:

  • Transcriptions
  • Tables of contents and indexes (registers)
  • Titles of the laws
  • Metadata (details need to be specified)

Fields for an advanced search:

  • Search for keywords (based on a list of synonyms and thesauri)
  • Search for bibliographical data such as
    • Form of the law (e.g. order, mandate, edict) - non-standardized terms
    • Place where the law was created - standardized terms
    • Date (period) when the law was ratified (problem: until 1692, two calendars were in use)
    • Lawmaker
    • Legislative body ("Körperschaft", e.g. emperor, parliament) - standardized terms
    • Genre (e.g. digitized work, secondary literature, collections of images such as logos)
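
One way to handle the two-calendar problem noted for the date field is to record which calendar a date is given in and to normalize it for searching. The sketch below converts a Julian (old style) date to the Gregorian calendar via the Julian Day Number, a standard integer-arithmetic method. The function names are chosen for this illustration only, and which calendar a given law actually used would have to come from the metadata.

```python
from datetime import date


def julian_to_jdn(year: int, month: int, day: int) -> int:
    """Return the Julian Day Number of a date given in the Julian calendar."""
    a = (14 - month) // 12          # 1 for January and February, else 0
    y = year + 4800 - a
    m = month + 12 * a - 3          # March = 0, ..., February = 11
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def julian_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert a Julian calendar date to the equivalent Gregorian date."""
    # date.fromordinal counts days from 0001-01-01 (proleptic Gregorian),
    # whose Julian Day Number is 1721426.
    return date.fromordinal(julian_to_jdn(year, month, day) - 1721425)
```

For example, julian_to_gregorian(1582, 10, 5) returns date(1582, 10, 15), the first day of the Gregorian calendar. A period search could then compare normalized dates regardless of the calendar used in the source.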

Download

Report generation for downloading (and printing) several digitized works in one document (conversion of several TIFFs into one PDF)

Relations

  • Translations are not the norm (will only occur in special cases)
  • One item is part of another item
  • Relations between different versions of one law
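
The relations listed above could be modelled as typed links between items. The following is a minimal sketch under stated assumptions: the relation names (isPartOf, isVersionOf, isTranslationOf), the item identifiers and the helper functions are hypothetical and are not taken from the project or from any existing data model.

```python
from dataclasses import dataclass
from enum import Enum


class RelationType(Enum):
    # Hypothetical relation names chosen for this illustration.
    IS_PART_OF = "isPartOf"
    IS_VERSION_OF = "isVersionOf"
    IS_TRANSLATION_OF = "isTranslationOf"


@dataclass(frozen=True)
class Relation:
    source: str              # identifier of the item the relation starts from
    relation: RelationType
    target: str              # identifier of the item the relation points to


def parts_of(container_id, relations):
    """Return the ids of all items that are part of the given item."""
    return [
        rel.source
        for rel in relations
        if rel.relation is RelationType.IS_PART_OF and rel.target == container_id
    ]


def versions_of(item_id, relations):
    """Return the ids of all other versions, following links in both directions."""
    result = []
    for rel in relations:
        if rel.relation is not RelationType.IS_VERSION_OF:
            continue
        if rel.source == item_id:
            result.append(rel.target)
        elif rel.target == item_id:
            result.append(rel.source)
    return result
```

Part-of links are read in one direction only, while version links are treated as symmetric, since two versions of one law are versions of each other.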

Persistent Identifier (PID)

PIDs are needed not only for the books, but also for specific parts of the books (details have to be specified).


Technical Requirements

Data Formats

The images are available as TIFF, PDF, JPEG (for color images) and GIF (for black-and-white images).

During digitization, uncompressed high-resolution images with a 24-bit color depth were created. These images serve as the archive format. The other formats for the web presentation and the download are derived from them.


Corporate Design Requirements

  • The institute does not have its own logo
  • The name of the institute should be visible on every page


Metadata

Technical Metadata

JHOVE, a component used within the eSciDoc framework, is able to extract all technical metadata currently stored in the images.


Structural Metadata

The bibliographical data of the digitized works can be delivered by the institute in SISIS MAB ("Maschinelles Austauschformat für Bibliotheken"). MAB is a data format used by libraries for the exchange of metadata.


Markup

The aim is to display the hierarchical structure of the texts (chapter, subchapter). The level of detail of this structure still has to be specified.


Metadata Standards

Several metadata standards for representing the structural metadata are under discussion. TEI (Text Encoding Initiative) has already been ruled out, because creating the markup requires too much effort.


eBind

Background: eBind (1996) was never finalized as a standard. However, because a small tool (eBind2HTML) existed that could generate HTML pages (with navigation etc.) directly from eBind files, eBind was nevertheless widely used. In 1998, "MOA2" (Making of America) was developed as an improved complement. In the course of the standardization efforts, MOA2 then became METS (2001/2002).

The eBind2HTML tool provides a good illustration.

Pro:

  • eBind allows the texts to be structured into paragraphs.

Con:

  • eBind cannot be used to mark up journals, because an author can only be defined per book, not per article.

METS

Pro:

  • METS allows front matter, cover, etc. to be marked up.
  • METS is an international standard. It represents the hierarchical structure, the names and storage locations of the files, and the metadata of objects. In this sense, METS acts like a container.

Working with METS requires a METS editor!
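
To illustrate how METS can carry the hierarchical structure (chapter, subchapter) discussed under Markup, the following Python sketch builds a small logical structMap with the standard library. The METS namespace and the structMap and div elements with their TYPE, LABEL and ORDER attributes come from the METS schema; the TYPE values, the labels and the helper functions are assumptions made for this example, and the output is a fragment for illustration, not a complete or validated METS document for ViRR.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Return the namespace-qualified name of a METS element."""
    return f"{{{METS_NS}}}{tag}"


def add_div(parent, div_type, label, order=None):
    """Append a mets:div describing one structural unit to the parent."""
    attrs = {"TYPE": div_type, "LABEL": label}
    if order is not None:
        attrs["ORDER"] = str(order)
    return ET.SubElement(parent, q("div"), attrs)


def build_struct_map(book_label, chapters):
    """Build a METS tree with one logical structMap.

    chapters is a list of (chapter_label, [subchapter_label, ...]) tuples.
    """
    mets = ET.Element(q("mets"))
    struct_map = ET.SubElement(mets, q("structMap"), {"TYPE": "logical"})
    book = add_div(struct_map, "book", book_label)
    for i, (chapter_label, subchapters) in enumerate(chapters, start=1):
        chapter = add_div(book, "chapter", chapter_label, i)
        for j, sub_label in enumerate(subchapters, start=1):
            add_div(chapter, "subchapter", sub_label, j)
    return mets
```

Calling ET.tostring(build_struct_map("Sample volume", [("Chapter 1", ["Section 1.1"])]), encoding="unicode") yields the nested mets:div elements as XML. In a full METS file, each div would additionally point to the page images through mets:fptr elements that reference entries in the file section.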


Further information

Copyright

The sources of the digitized works are in the public domain. The digitized works themselves are not subject to copyright and are free for further use.

Only integrated secondary literature has to be checked for copyright licenses.


Digitization

Later on, further books should be digitized. To this end, a collaboration with the DTA (Deutsches Textarchiv) will be discussed. The aim of the DTA is to create a digital collection of several hundred million tokens (words) from German documents. This collection should contain the scans and transcriptions of the documents and give a representative picture of linguistic and cultural development from the middle of the 17th century until today.


Project Management

Meetings

18.06.07 (Frankfurt): Kick off Meeting

22.10.07 (Frankfurt): Follow up Meeting


Work in Progress

ViRR - Schedule


Project Documentation