ESciDoc Developer Workshop 2008-03-11


Date: 11.03.2008

Location: Karlsruhe, München (Video conference)

Participants MPDL: Wilhelm Frank, Michael Franke

Participants FIZ:

Start time: 15:00, 11.03.2008

Component metadata and properties

 * Input
 * PubMan_File_Properties

-  <escidocComponents:components xmlns:escidocMetadataRecords="http://www.escidoc.de/schemas/metadatarecords/0.4" xmlns:escidocComponents="http://www.escidoc.de/schemas/components/0.4" xmlns:version="http://escidoc.de/core/01/properties/version/" xmlns:release="http://escidoc.de/core/01/properties/release/" xmlns:escidocItem="http://www.escidoc.de/schemas/item/0.4" xmlns:prop="http://escidoc.de/core/01/properties/" xmlns:srel="http://escidoc.de/core/01/structural-relations/" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xml="http://www.w3.org/XML/1998/namespace">   2008-02-28T07:52:47.531Z
 * Proposal
 * missing "size" as property?
 * May be in tech-md. Frank 11:47, 11 March 2008 (CET)
 * missing "PID" as property?
 * Right, added pid to proposal below. Frank 11:47, 11 March 2008 (CET)
 * question: element "content" allows only for inline content?
 * No, when storage is set to "internal-managed" inline content and referring content by URL is allowed. If storage is set to "external-url" only referring content by URL is permitted. Frank 12:39, 11 March 2008 (CET)
 * mime-type. IANA media type: http://www.iana.org/assignments/media-types/



The eSciDoc banner JPEG used for header in webpages etc.

valid

public

hdl:12345/6789

<prop:content-category>image</prop:content-category>

<prop:file-name>escidoc-banner.jpg</prop:file-name>

<prop:mime-type>image/jpeg</prop:mime-type>

</escidocComponents:properties>

<escidocComponents:content storage="internal-managed" xlink:type="simple" xlink:title="escidoc-banner.jpg" xlink:href="/ir/item/escidoc:ex5/components/component/escidoc:ex6/content"/>

<escidocMetadataRecords:md-records > <escidocMetadataRecords:md-record name="escidoc"> <some-md-root>

<some-title/> <some-description/> </some-md-root> </escidocMetadataRecords:md-record> </escidocMetadataRecords:md-records>

</escidocComponents:component> </escidocComponents:components>
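
The storage rule from Frank's answer above (inline content or a URL reference for "internal-managed", URL reference only for "external-url") can be sketched as a small check. This is only an illustration: the function name and its arguments are hypothetical, and treating "neither" or "both" ways of supplying content as invalid is an assumption that the minutes do not state.

```python
# Sketch of the storage rule discussed above. The two storage values come
# from the minutes; everything else (names, arguments) is hypothetical.
def content_allowed(storage, has_inline, has_href):
    """Return True if the way the content is supplied fits the storage mode."""
    if has_inline == has_href:
        # Assumption: exactly one of inline content or a URL must be given.
        return False
    if storage == "internal-managed":
        return True       # inline content or a URL reference is allowed
    if storage == "external-url":
        return has_href   # only a URL reference is permitted
    return False          # unknown storage value


print(content_allowed("internal-managed", True, False))
print(content_allowed("external-url", True, False))
```

The first call describes inline content stored internally, which the rule permits; the second describes inline content with "external-url", which it rejects.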

Outcome 2008-03-11

There are two elements which will remain as properties:
 * 1) mime-type (required)
    * a list of allowed values will be used
    * as a starting point for this list, the IANA media types (http://www.iana.org/assignments/media-types/) will be used
    * new values may be added to this eSciDoc list over time
 * 2) content-category (required)
    * at the moment there are no restrictions on the values (any string)
    * later they may be defined with a content model.
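
As a rough illustration of this outcome, the following sketch checks a properties fragment for the two required elements. The `prop` namespace URI is taken from the example above; the wrapper element, the function name and the short list of allowed media types are assumptions made for this example, not the actual eSciDoc list.

```python
import xml.etree.ElementTree as ET

PROP_NS = "http://escidoc.de/core/01/properties/"

# Assumption: a tiny stand-in for the eSciDoc list of allowed values,
# which would start from the IANA media type registry.
ALLOWED_MIME_TYPES = {"image/jpeg", "image/png", "application/pdf", "text/xml"}


def check_properties(xml_text):
    """Return a list of problems found in a component properties fragment."""
    root = ET.fromstring(xml_text)
    problems = []

    mime = root.findtext("{%s}mime-type" % PROP_NS)
    if mime is None or not mime.strip():
        problems.append("mime-type is required")
    elif mime.strip() not in ALLOWED_MIME_TYPES:
        problems.append("mime-type '%s' is not in the allowed list" % mime.strip())

    # content-category is required, but any non-empty string is accepted.
    category = root.findtext("{%s}content-category" % PROP_NS)
    if category is None or not category.strip():
        problems.append("content-category is required")

    return problems


fragment = (
    '<properties xmlns:prop="http://escidoc.de/core/01/properties/">'
    "<prop:content-category>image</prop:content-category>"
    "<prop:mime-type>image/jpeg</prop:mime-type>"
    "</properties>"
)
print(check_properties(fragment))
```

If content categories are later defined by a content model, the second check would be the place to restrict the allowed values.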

ToC XML schema
See also
 * ESciDoc_Developer_Telco_2008-02-26. Option b) of the alternatives described there was agreed.
 * METS struct map

The discussion should agree on the format of the TOC. Proposal sent, see below:

<?xml version="1.0" encoding="UTF-8"?>
<toc:toc ID="meins" TYPE="LOGICAL" LABEL="Table of Content"
    xml:base="http://localhost:8080"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://escidoc.de/toc TOC-v1.xsd"
    xmlns:toc="http://escidoc.de/toc"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <toc:div>
    <toc:ptr ID="cont" xlink:href="/ir/container/escidoc:10"/>
    <toc:div ID="chap1" ORDER="1" ORDERLABEL="1." LABEL="Erstes Kapitel" TYPE="chapter">
      <toc:ptr ID="chap1link" LOCTYPE="URL" xlink:href="/ir/container/escidoc:11" xlink:type="simple" xlink:title="Erstes Kapitel"/>
      <toc:div ID="chap1-1" ORDER="1" ORDERLABEL="1.1" LABEL="Ein Kapitel zweiter Ebene" TYPE="chapter">
        <toc:ptr ID="a"/>
        <ns:any xmlns:ns="http://some.ns"/>
      </toc:div>
    </toc:div>
    <toc:div ID="chap2" ORDER="2" ORDERLABEL="2." LABEL="Zweites Kapitel" TYPE="chapter">
      <toc:ptr ID="b"/>
      <ns:any xmlns:ns="http://some.ns"/>
      <toc:div>
        <toc:ptr ID="c"/>
        <ns:any xmlns:ns="http://some.ns"/>
        <toc:div>
          <toc:ptr ID="d"/>
          <ns:any xmlns:ns="http://some.ns"/>
        </toc:div>
      </toc:div>
    </toc:div>
  </toc:div>
</toc:toc>

Explanation: This TOC proposal is based on METS. Every TOC entry consists of two elements: a div element (labels, order etc.) and a ptr element (the reference to the object). Every div has a ptr as its first child, and every ptr is the first child of a div. (Usually one would use a single element carrying both the labels and the href.) If a div (in combination with its ptr) refers to a container, the div may have div elements as direct children. If a div (in combination with its ptr) refers to an item, no child div elements are allowed.
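
The structural rules in this explanation can be checked mechanically. The following is a minimal sketch, not part of the proposal: the function name is hypothetical, and deciding whether a ptr refers to an item by looking for "/ir/item/" in its xlink:href is an assumption based on the hrefs used in the examples in these minutes.

```python
import xml.etree.ElementTree as ET

TOC = "{http://escidoc.de/toc}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def check_toc(xml_text):
    """Return a list of rule violations found in a TOC document."""
    errors = []

    def visit(div, path):
        children = list(div)
        has_ptr = bool(children) and children[0].tag == TOC + "ptr"
        if not has_ptr:
            errors.append(path + ": first child is not a ptr")
        # A ptr in any later position breaks the div/ptr pairing.
        for child in children[1:]:
            if child.tag == TOC + "ptr":
                errors.append(path + ": ptr is not the first child")
        href = children[0].get(XLINK_HREF, "") if has_ptr else ""
        sub_divs = [c for c in children if c.tag == TOC + "div"]
        # Assumption: item references are recognised by their href path.
        if "/ir/item/" in href and sub_divs:
            errors.append(path + ": entry refers to an item but has child divs")
        for index, sub in enumerate(sub_divs, 1):
            visit(sub, "%s/div[%d]" % (path, index))

    root = ET.fromstring(xml_text)
    for child in root:
        if child.tag == TOC + "ptr":
            errors.append("toc: ptr outside of a div")
    for index, div in enumerate(root.findall(TOC + "div"), 1):
        visit(div, "toc/div[%d]" % index)
    return errors


sample = (
    '<toc:toc xmlns:toc="http://escidoc.de/toc" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<toc:div><toc:ptr xlink:href="/ir/container/escidoc:10"/>'
    '<toc:div><toc:ptr xlink:href="/ir/item/escidoc:1"/></toc:div>'
    "</toc:div></toc:toc>"
)
print(check_toc(sample))
```

Elements from other namespaces, such as the `ns:any` placeholders in the proposal, are ignored by this check.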

Proposal extension: A div element should have an additional attribute that indicates the visibility of that TOC entry. For example, the infrastructure automatically inserts a new member into the TOC but sets it to invisible. An author may then set it to visible, change the order, or remove the automatically added TOC entry. Alternatively, an author may want to add TOC entries now that should become visible only later.
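
To make the proposed extension more concrete, here is a small sketch of how such a flag might behave. The attribute name `VISIBLE`, its values "true"/"false", both function names and the object id are hypothetical; the minutes do not fix any of them.

```python
import xml.etree.ElementTree as ET

TOC_NS = "http://escidoc.de/toc"
XLINK_NS = "http://www.w3.org/1999/xlink"


def add_hidden_entry(parent_div, href):
    """Append a TOC entry for a new member, hidden until an author shows it."""
    # Assumption: visibility is carried by a VISIBLE attribute on the div.
    div = ET.SubElement(parent_div, "{%s}div" % TOC_NS, {"VISIBLE": "false"})
    ET.SubElement(div, "{%s}ptr" % TOC_NS, {"{%s}href" % XLINK_NS: href})
    return div


def visible_entries(parent_div):
    """Return the direct child entries that are not marked invisible."""
    return [
        div
        for div in parent_div.findall("{%s}div" % TOC_NS)
        if div.get("VISIBLE", "true") != "false"
    ]


root_div = ET.Element("{%s}div" % TOC_NS)
entry = add_hidden_entry(root_div, "/ir/item/escidoc:12")
print(len(visible_entries(root_div)))  # entry is still hidden
entry.set("VISIBLE", "true")           # the author makes it visible
print(len(visible_entries(root_div)))
```

Entries without the attribute are treated as visible here, so existing TOCs would not need to change; that default is also an assumption.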

Outcome 2008-03-11:

There might be two kinds of "Table of Contents":
 * 1) TOC: created by persons who have edit access to the container(s) the TOC references
    * there might be more than one such TOC
    * these TOCs are normally created together with the containers and their content and structure
    * these TOCs are treated as "special" members of the container to which they are related, even though they are TOC objects (neither item nor container)
 * 2) Reference of Contents: created by persons who have NO edit access to the container
    * these objects are not part of the referenced container

--Natasa 16:37, 17 March 2008 (CET)
 * It was agreed that only TOCs of category 1 will be implemented, since use cases exist for such TOCs.
 * Type 2 will be postponed until use cases and a detailed functional concept exist.
 * A simple XML example will be created (container with 2 or 3 books, each book has chapters).
 * There might be levels within a TOC which are not related to an object, so no pointer can be set.
 * A description will be prepared (by Frank) and sent to Natasa for final agreement.
 * To be precise, the description also includes clarification of versioning, PID and releasing when:
    * adding a new member (not a TOC) to a container
    * removing a member (not a TOC) from a container
    * updating a member (not a TOC) in a container

Item List Format

 * PubMan_Display
 * ESciDoc_Item_List
 * Assumptions & considerations:

1.) Item lists are loaded in short title display by default.

Result:
 * All bibliographic information required for this format can be mapped successfully to DC simple
 * Information on the number of files attached could be specified by repeating the dc:format element

Open:
 * Sorting requirements cannot be fulfilled with elements in DC simple/DC qualified

Questions:
 * Does sorting depend on the item list (or can it be specified with search/filter)?
 * Can sorting requirements for the result list be restricted to DC simple elements?

2.) The user can select to see items in medium title display.

Result:
 * Bibliographic information required for this format CANNOT be mapped successfully to DC simple (and even DC qualified is not sufficient, e.g. regarding event information)

Question:
 * Is it possible to request complete metadata if the user requests medium display? Or display the short list first and start caching complete records for the corresponding items in the background?

Outcome 2008-03-11

There was a long discussion which raised a lot of new issues. In the end no decision could be taken.

One result could be defined as: eSciDoc-Core will have to deliver
 * a list of items with all properties and metadata (even from components)
 * very fast (10,000 items per minute)
 * with paging and sorting
 * with rights checking

FIZ will re-think the complete issue.

Prioritized Issue List
Discussion about the status, issues.


 * (1)	Performance
 * (1)	Item Lists Format
 * (1)	AA Institutional Visibility
 * (1)	AA Service Authorization
 * (1)	AA Identity Provider
 * (1)	AA Fast Lists
 * (1)	Ingestion
 * (1)	Migration
 * (1)	Scalability
 * (1)	Productive Environment
 * (1)	Relations/Handler
 * (1)	Relations /Revisions
 * (1)	Administrative Searches
 * (1)	Batch Updates (Status/props/MD)
 * (2)	Workflow Services
 * (2)	Event Logging
 * (2)	JHOVE
 * (2)	Digilib
 * (2)	Stable interfaces
 * (3)	Virus Checks

Next meeting agenda proposal
Outcome 2008-03-11

The agenda will be set up via email.

The next video call will be on March 25, 14:00.