Difference between revisions of "Talk:MPDL Project XML Workflow"

Revision as of 11:27, 5 November 2008

Agenda Meeting 16.10.2008

Project and current status

The XML Workflow project started in September 2008. The project has two main parts:

Defining a working process (workflow) for production of XML texts (and documenting this process so that it can be reused)
- Digitization of e.g. manuscripts
- Transcription of the text presented on the manuscripts
- Markup of parts of the XML texts
- Enrichment of the XML texts

Developing tools and infrastructure to
- enable access to documents, and linking between documents and to internal parts of documents
- build functions for searching, indexing and retrieval of relevant results

The main motivation is to standardize the working processes and to develop a center of competence that provides guidelines for the transcription of texts.

Currently the project is in the initial design phase.
One of the main goals is to use repository functionality for all artefacts and to make reuse easier, for example by importing the texts to be worked on into a tool installed on a local system and by easily submitting the modified texts back to the repository.

Introduction & status overview of eSciDoc project

High-Level Requirements (functional, technical) for XML Workflow project

Repository - the functions that need to be provided are basically:
- persistent storage for resources that come from various projects
- versioning of resources
- persistent identification of resources
- possibility to access arbitrary functions of XML Documents
- transformation of XML documents to XHTML for presentation purposes
- enrichment of data such as:
  - links to language-specific functionality
  - links to sources (available on the web)
Searching functionality
- the project team is considering the following as the core system for searching within the XML documents: eXist database, Lucene or Oracle 11g
- two types of queries need to be supported:
  - structural queries of XML documents (in particular trees, subset of trees)
  - Full-text searching (integrated language technology)
  - support for different languages/scripts such as: Latin, Greek, Chinese, European languages, Sanskrit
Digilib - to be enabled as a service for viewing in-line images such as figures, diagrams
- mature, robust tools for working with images need to be available
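
To illustrate the two query types listed under the searching functionality, here is a minimal sketch using only the Python standard library. It is not one of the candidate systems (eXist, Lucene, Oracle 11g); the sample document, its element names and the helper functions `structural_query` and `full_text_search` are assumptions made for illustration. The full-text part is a plain case-insensitive substring match and does not include any language technology.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample transcription; element and attribute names are illustrative only.
SAMPLE = """<text>
  <chapter n="1">
    <p>De motu <term>gravium</term> naturaliter descendentium.</p>
    <figure src="fig1.jpg"/>
  </chapter>
  <chapter n="2">
    <p>Quod motus sit <term>uniformis</term>.</p>
  </chapter>
</text>"""


def structural_query(root, path):
    """Return the elements matching an ElementTree path (a limited XPath subset)."""
    return root.findall(path)


def full_text_search(root, word):
    """Return (chapter number, paragraph text) for paragraphs containing the word."""
    hits = []
    for chapter in root.iter("chapter"):
        for p in chapter.iter("p"):
            # itertext() joins the paragraph text across nested markup such as <term>.
            text = "".join(p.itertext())
            if word.lower() in text.lower():
                hits.append((chapter.get("n"), text))
    return hits


root = ET.fromstring(SAMPLE)

# Structural query: all <term> elements inside the subtree of chapter 2.
terms = [t.text for t in structural_query(root, "./chapter[@n='2']//term")]

# Full-text query: paragraphs mentioning "gravium", regardless of markup.
hits = full_text_search(root, "gravium")

print(terms)
print(hits)
```

The structural query selects nodes by their position in the tree, while the full-text query ignores the markup and matches on the concatenated text. A production system would additionally need indexing and language-aware matching for the scripts listed above.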

@@ Line 3: / Line 3: @@
 *https://itgroup.mpiwg-berlin.mpg.de:8080/tracs/mpdl-project-content
 *https://itgroup.mpiwg-berlin.mpg.de:8080/tracs/mpdl-project-software
+The [[MPDL_Project_XML Workflow | XML Workflow project]] started September 2008.
+There are two main parts of the project:
+*Defining a working process (workflow) for production of XML texts (and documenting this process so that it can be reused)
+**Digitization of e.g. manuscripts
+**Transcription of the text presented on the manuscripts
+**Markup of parts of the XML texts
+**Enrichment of the XML texts
+*Development aspect by enabling tools and infrastructure to
+**enable access to documents, linking between documents and internal parts of the documents
+**building functions for searching, indexing and retrieval of relevant results
+The main motivation is to standardize the working processes and develop a Center of competence that provides guidelines for transcription of texts.
+*Currently the project is in the initial design phase
+*One of the main goals is to use a repository functionality for all artefacts and enable easier reuse such as: importing the texts on which a work needs to be done into tool installed on a local system, as well as easily submitting the modified texts back to the repository
 ===Introduction&status overview of eSciDoc project===
-===Requirements (functional, technical) for XMl Workflow project===
+===High-Level Requirements (functional, technical) for XMl Workflow project===
+*Repository - the functions that need to be provided are basically:
+**persistent storage for resources that come from various projects
+**versioning of resources
+**persistent identification of resources
+**possibility to access arbitrary functions of XML Documents
+**transformation of XML documents to XHTML for presentation purposes
+**enrichment of dsata such as:
+***links to language specific functionality
+***links to sources (available on the web)
+*searching functionality
+**project team considers as a core system to search within the XML documents the following: eXist database, Lucene or Oracle 11g
+**two types of queries need to be supported:
+***structural queries of XML documents (in particular trees, subset of trees)
+***Full-text searching (integrated language technology)
+***support for different languages/scripts such as: Latin, Greek, Chinese, European languages, Sanscrit
+*Digilib - to be enabled as a service for viewing in-line images such as figures, diagrams
+**need to have the possibility to use quite mature/robust tools for working with images
 ===Relation between the two projects and possibility for reuse===
