Talk:ESciDoc Services FledgedDataService

MPDL

FDS



 * What are the use cases for this service?
 * Imeji: No special use case
 * PubMan
 * DLC: Not yet a requirement
 * Protocols/Formats we want to implement:
 * oai-pmh (http)
 * unapi (TODO: move/redirect escidoc part from data acquisition service) (http)
 * sitemap (http://www.sitemaps.org/)
 * Do we want to use the eSciDoc oai-provider (http://colab.mpdl.mpg.de/mediawiki/PubMan_Func_Spec_Export/OAI_Data_Provider, https://www.escidoc.org/JSPWiki/en/OaiPmh)?
 * Possibility to define own sets
 * predefined sets for context and ou (single item?)
 * To be tested if it is working properly
 * Are all metadata formats we provide stored as their own md-record, or do we transform on the fly?
 * Only escidoc_md and dc_md are stored in the metadata record (except imeji); all other formats are transformed on the fly
 * Does the syndication service fit into FDS?
 * What needs to be configurable?
 * md formats to expose plus related xslts (isn't this part of the transformation service???)
 * set definition will remain part of escidoc oai-provider, in Imeji we will use implicit set definitions
 * interval to update oai db will remain part of the escidoc oai-provider
 * Sitemap creation interval
 * location of the oai-provider
 * db/triplestore connection plus xslt (from jena format or rdf) (is it necessary to make this configurable? probably only an interim solution until imeji md is in escidoc)
 * solution uses service or service uses solution (does the service have to know about the solution or does it only talk to the coreservice?)
 * Integrate registration of dataprovider (http://www.openarchives.org/data/registerasprovider.html), register oai identifier. On startup?
 * Interfaces:
 * 1) oai (all metadata in xml)
 * 2) unapi (all metadata, all objects, sitemaps)


 * There will be two ways to access oai md of a solution:
 * 1) via escidoc oai-provider
 * 2) via FDS service
 * Should we overwrite/redirect the link of the oai-provider?
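
To make the two access paths concrete, here is a minimal sketch of the same OAI-PMH `ListRecords` request sent to either endpoint. Both base URLs and the set name `context_123` are hypothetical placeholders, not actual deployment values; only the `verb`, `metadataPrefix`, `set` and `from` arguments come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode

# Hypothetical endpoints: the real locations are configuration items.
ESCIDOC_OAI_URL = "http://example.org/escidoc-oai-provider/"
FDS_OAI_URL = "http://example.org/fds/oai"


def list_records_url(base_url, metadata_prefix, set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL for the given endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec      # optional selective harvesting by set
    if from_date is not None:
        params["from"] = from_date    # optional lower datestamp bound
    return base_url + "?" + urlencode(params)


# The same request, once per access path.
for base in (ESCIDOC_OAI_URL, FDS_OAI_URL):
    print(list_records_url(base, "oai_dc", set_spec="context_123"))
```

If the FDS simply answers the same requests as the eSciDoc oai-provider, a redirect of the old link would be transparent to harvesters, since only the base URL differs.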


 * reuse some code from:
 * http://code.google.com/p/oaicat/ (oai-pmh implementation from oclc)
 * http://code.google.com/p/joailib/ (not an active community, so I would just reuse the code, not the library)

FDS PubMan (publication items)

 * Which metadata do we want to expose via oai?
 * Which metadata do we want to expose via unapi?
 * See dataacquisition service
 * VuLib integration via 'old' interface?

FDS DLC (book/ structured elements items)

 * What do we want to expose via oai?
 * native (what is native? mods, tei, escidoc_xml?)
 * dc
 * edm (for Europeana harvesting)
 * zvdd_mets (zvdd)


 * PROBLEM: the toc is stored as a container, which cannot be provided by the escidoc oai-provider


 * escidoc oai provider can provide mods (bibl. metadata) and mets (structmap), these two md records are stored in the item


 * We want to provide data for zvdd and Europeana
 * http://www.europeanaconnect.eu/documents/01_Europeana_OAI_PMH_Infrastructure.pdf
 * http://pro.europeana.eu/web/guest/provide-data
 * http://europeanalabs.eu/wiki/EDMObjectTemplatesProviders
 * http://www.zvdd.de/dms/oai-repositories
 * http://www.zvdd.de/fileadmin/AGSDD-Redaktion/METS_Anwendungsprofil_2.0.pdf
 * http://www.zvdd.de/fileadmin/AGSDD-Redaktion/zvdd_MODS_Application_Profile_2008-11-13.pdf
 * http://www.zvdd.de/dokumentation/datenformate-im-ueberblick/strukturdaten/#c133
 * http://colab.mpdl.mpg.de/mediawiki/ViRR_and_METS#Requirements_for_ZVDD


 * How do we integrate the transformation service? (Important, only consider transformations where target format has mimetype application/xml)-FUTURE DEVELOPMENT


 * Decided with Willy:
 * 1) a link to the tei_sd will be integrated in the mods xml as 'related item'
 * 2) the escidoc oai provider will be installed for dlc
 * 3) the FDS will communicate with the oai provider (and the transformation service (here all transformations are stored))-FUTURE DEVELOPMENT
 * 4) The transformation service will not be used at the start, as it relies on outdated common_logic.

=> Therefore the FDS will be the communication endpoint for oai requests for dlc
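
Decision 1) could look roughly like the sketch below, which appends a `relatedItem` carrying an `xlink:href` to a MODS record. The MODS and XLink namespaces are the standard ones; the function name, the sample record and the tei_sd URL are made up for illustration, and whether the `relatedItem` should also carry a `type` attribute is left open here.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mods", MODS_NS)
ET.register_namespace("xlink", XLINK_NS)


def add_tei_link(mods_xml, tei_url):
    """Return the MODS record with a relatedItem pointing at the tei_sd."""
    root = ET.fromstring(mods_xml)
    related = ET.SubElement(root, "{%s}relatedItem" % MODS_NS)
    related.set("{%s}href" % XLINK_NS, tei_url)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample record and tei_sd location.
sample = (
    '<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">'
    "<mods:titleInfo><mods:title>Example volume</mods:title></mods:titleInfo>"
    "</mods:mods>"
)
print(add_tei_link(sample, "http://example.org/dlc/tei_sd/123"))
```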

OAI eSciDoc

 * Do we need an explicit set definition? Implicit would be:
 * collection id => all items of a collection
 * org id => all items of an org etc.
 * oaiprovider exposes only data which is already stored in md record
 * oaiprovider caches the objects
 * Use escidoc oai as one input source for fledged data service? (static or dynamic?)
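
A minimal sketch of what implicit set definitions could mean in practice: the setSpec values are derived from the ids an item already has, with no separate set configuration. The field names `collection_id` and `org_ids`, the `collection:` / `org:` prefixes and the sample ids are assumptions for illustration only. Since OAI-PMH uses ":" as the hierarchy separator inside a setSpec, colons in the ids are replaced here.

```python
def to_set_token(identifier):
    """Make an id usable as one setSpec token (":" separates hierarchy levels)."""
    return identifier.replace(":", "_")


def implicit_set_specs(item):
    """Derive setSpec values from the collection and org ids of an item."""
    specs = []
    collection_id = item.get("collection_id")
    if collection_id:
        # collection id => all items of a collection
        specs.append("collection:" + to_set_token(collection_id))
    for org_id in item.get("org_ids", []):
        # org id => all items of an org
        specs.append("org:" + to_set_token(org_id))
    return specs


# Hypothetical item with one collection and two organizations.
item = {"collection_id": "escidoc:100", "org_ids": ["escidoc:200", "escidoc:201"]}
print(implicit_set_specs(item))
```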

Sitemap

 * Create sitemap with oai-pmh call (would make sense when we use own db for oai, then it should be much faster)?
 * Implemented
 * Check which search engine specific info we want to integrate (e.g. http://www.google.com/support/webmasters/bin/topic.py?hl=en&topic=20986).
 * Image sitemaps of google (http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=178636)
 * How to register sitemap?
 * Will be part of repository, not service
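
For reference, a small sketch of generating a sitemap in the sitemaps.org format from a list of item URLs and modification dates, for example as collected from an OAI-PMH harvest. This is not the implemented FDS code; the function name and the sample URLs are hypothetical, and only the `urlset` / `url` / `loc` / `lastmod` elements and the namespace come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Serialize the sitemap namespace as the default namespace (no prefix).
ET.register_namespace("", SITEMAP_NS)


def build_sitemap(entries):
    """Build a sitemap document from (url, lastmod or None) pairs."""
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        ET.SubElement(url, "{%s}loc" % SITEMAP_NS).text = loc
        if lastmod:
            ET.SubElement(url, "{%s}lastmod" % SITEMAP_NS).text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical item pages.
entries = [
    ("http://example.org/item/escidoc:1", "2012-03-01"),
    ("http://example.org/item/escidoc:2", None),
]
print(build_sitemap(entries))
```

The sitemap protocol limits a single file to 50,000 URLs, so a larger repository would need several files plus a sitemap index.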