ESciDoc Developer Workshop 2007-12-19
Date: December 19-20, 2007
Location: FIZ, Karlsruhe
Participants(MPDL): Michael, Robert, Wilhelm, Natasa
Start time: 12.00 19.12.2007
End time: 17.00 20.12.2007
Topics
Update of last workshop results
- Update of last workshop, see also ESciDoc Workshop 2007-09-24, ESciDoc Developer Workshop 2007-08-07
- eSciDoc Brainstorming results
- TOC in Container: format, restrictions, occurrence (see also ESciDoc_Container_Toc)
- Authority Files, see also (ControlledVocab, Talk:ControlledVocab#Prototyping)
- Workflow Manager, see also ESciDoc_Workflows
- Item lists - prototype, results, see also ESciDoc_Item_List
- Item lists prototype: the mandatory metadata and item properties that must be in the minimal item list for publication items are defined at ESciDoc_Item_List#Requirements_.28revised.29
- Item lists - sorting order, see requirements for Search Topic below.
- OpenSearch - check whether OpenSearch can be enabled as described in ESciDoc_Item_List#OpenSearch
XML Schemas
- create a namespace and schema for the common properties element of all resources --Natasa 13:07, 13 December 2007 (CET)
Consolidation of schemas for framework objects: the common elements in properties of framework resources should really be common, i.e. in the same namespace across resources; no more item:created-by, ou:created-by, ... but only escidoc:created-by. I also have a feeling that if this is not happening soon, it never will. -Robert 11:41, 14 November 2007 (CET)
outcome
- MPDL agrees with this change.
- will be done with V1.0 eSciDoc Core
- MPDL wants a schedule; the change should be done in January
- communicated via the mailing list ("Verteilerliste")
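The consolidation discussed above (one common namespace instead of item:created-by, ou:created-by, ...) could look roughly like the following sketch. The namespace URIs and the set of shared property names are assumptions made for illustration only; they are not the actual eSciDoc schema values.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, used only for illustration.
ITEM_NS = "http://example.org/escidoc/item"
COMMON_NS = "http://example.org/escidoc/common-properties"

# Property names assumed to be shared by all resources.
COMMON_PROPERTIES = {"created-by"}


def consolidate(root, resource_ns):
    """Move shared property elements from a resource namespace to COMMON_NS."""
    prefix = "{" + resource_ns + "}"
    for elem in root.iter():
        if elem.tag.startswith(prefix):
            local = elem.tag[len(prefix):]
            if local in COMMON_PROPERTIES:
                elem.tag = "{" + COMMON_NS + "}" + local
    return root


item_xml = (
    '<item:properties xmlns:item="' + ITEM_NS + '">'
    "<item:created-by>user-1</item:created-by>"
    "<item:content-model>publication</item:content-model>"
    "</item:properties>"
)
root = consolidate(ET.fromstring(item_xml), ITEM_NS)
tags = [child.tag for child in root]
```

Only the element listed in COMMON_PROPERTIES changes namespace; resource-specific elements such as the (hypothetical) content-model stay where they are.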
Release procedures and Data migration
- Release procedures (releases, release tests)
- decide on strategy for data migration (when XSDs are changed)
- migrate each object on retrieve
- migrate all objects of the repository at the time the new release is installed via XSLT
- create a transformation service for the migration of objects (on demand or for the complete repository)
outcome
- needed
- MPDL Pilot repository some 1000
- no version change
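The first two migration strategies listed above (migrate on retrieve, migrate everything at install time) can be contrasted in a minimal sketch. The Repository class, the schema_version field and the renamed property are all hypothetical; the migrate function merely stands in for an XSLT transformation.

```python
CURRENT_VERSION = 2


def migrate(obj):
    """Upgrade one stored object to CURRENT_VERSION (stand-in for an XSLT step)."""
    if obj["schema_version"] == 1:
        # Hypothetical schema change: a property moves to a common name.
        obj = {"schema_version": 2, "created-by": obj["item:created-by"]}
    return obj


class Repository:
    """Hypothetical in-memory object store keyed by object id."""

    def __init__(self, objects):
        self.objects = dict(objects)

    def retrieve(self, object_id):
        """Strategy 1: migrate lazily, whenever an object is retrieved."""
        obj = migrate(self.objects[object_id])
        self.objects[object_id] = obj  # store the result so the work is done once
        return obj

    def migrate_all(self):
        """Strategy 2: migrate the complete repository when a release is installed."""
        for object_id in list(self.objects):
            self.objects[object_id] = migrate(self.objects[object_id])


repo = Repository({
    "a": {"schema_version": 1, "item:created-by": "user-1"},
    "b": {"schema_version": 1, "item:created-by": "user-2"},
})
first = repo.retrieve("a")                            # only "a" is migrated here
b_before = repo.objects["b"]["schema_version"]        # "b" is still at version 1
repo.migrate_all()                                    # now everything is migrated
```

A transformation service (the third option) would expose the same migrate step either per object or for the whole repository, so the sketch covers its two modes as well.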
Authorization
- Authorization, see also ESciDoc_Authorization_Authentication and related pages
- concept how to map solution actions to core services actions
- how to enable trust if the solution already authorized the action of a user
- how to easily create new policies/modify existing policies
OrganizationalUnits handler
Note: clarifications moved to Talk:PubMan_Func_Spec_Organizational_Unit_Management to avoid overloading the current agenda page, as proposed by Harald. --Natasa 16:15, 10 December 2007 (CET)
See also: PubMan_Organizational_Unit_Management, PubMan_Func_Spec_Organizational_Unit_Management, Talk:PubMan_Collections_and_Organizational_Units#History_of_organisational_units, Requirements for R3
Context handler
- additional methods, improvement of existing methods (admin descriptors, possibility to add new context types e.g. CitationStyles, Validation) --Natasa 17:08, 16 November 2007 (CET) Clarified partly: context types are not limited.
- revisit Admin descriptor (what is relevant and what not in the current admin descriptor structure)
see description on page Context handler talks
- member lists, see also Talk:ESciDoc_Services_ContextHandler#Member_lists
PID
Proposal to move the topic to the next eSciDoc workshop Jan/Feb 2008--Natasa 10:55, 7 December 2007 (CET)
- In my opinion PID handling is very important, especially because we have an (inconsistent) implementation. Consequently, I would propose removing the PID assignment methods until we have an implementation of an agreed concept. Frank 12:09, 7 December 2007 (CET)
- finalize concept for PID implementation in eSciDoc
- There is a talk page, Talk:PubMan PID, related to that topic but no concept.
- A concept for PIDs already exists and was agreed between us previously; the concept paper has been discussed before. The Talk page also contains a proposal for how to deal with PIDs, so please take a look at
http://colab.mpdl.mpg.de/mediawiki/Talk:PubMan_PID#Issue_316:_feedback and http://colab.mpdl.mpg.de/mediawiki/Talk:PubMan_PID#ObjectPIDs_and_VersionPIDs
- a decision is needed
Statistics
- Status of Statistics
- possible to gather the current statistics also with additional info on logged-in (anonymous) users?
- no need for concept. http://www.escidoc-project.de/issueManagement/show_bug.cgi?id=347 in Bugzilla is fixed and will not be reopened
- the requirement is to "rework" the definitions and the reports so that they provide statistics both for all users and for registered (i.e. logged-in) users only
Searching service
- mixing language-specific metadata indexes into one search database
- this requirement needs to be discussed together. The main issue is that we have search cases such as the following:
- search all items where the title in German is "wissenschaft" and the abstract in English contains "science information". The results should be found with proper stemming options, which means that an exact "phrase" search through the current escidoc_all database will not return a correct set of results.
- sorting order (p and P are sorted within each other -> case insensitive; German umlauts seem to be handled as if they were ue, oe, ae; characters like [ are listed before A/a).
- this requirement simply states that the sorting order is case insensitive and that German umlauts have to be treated in the following manner when indexing for sorting: ä/Ä should be treated as ae, ö/Ö should be treated as oe, etc.
- search results - how to get items and containers within the same search result (after checking the schemas more closely, this is possible because the complete item.xsd or container.xsd structures are in the search result --Natasa 15:12, 17 December 2007 (CET))
- search indexes development
- administrative searches (should index properties of an item, should index also non-released items)
- end user searches (should not index properties of an item, should index released items only)
- special requirements for identifiers (see Indexing requirements for "search-by-identifier" search, Indexing requirements for identifiers in "any-field" search)
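The sorting requirement above (case insensitive, umlauts treated as ae/oe/ue) can be sketched as a sort key. This is only an illustration of the rule, not the indexing code of the search service; the function name and the sample titles are made up.

```python
# Replacement table taken from the requirement: ä -> ae, ö -> oe, ü -> ue.
UMLAUT_MAP = {"ä": "ae", "ö": "oe", "ü": "ue"}


def sort_key(value):
    """Build a case-insensitive key in which umlauts sort as ae/oe/ue."""
    lowered = value.lower()  # "Ä" becomes "ä", "P" becomes "p"
    return "".join(UMLAUT_MAP.get(ch, ch) for ch in lowered)


titles = ["Zebra", "Äpfel", "apple", "Öl", "Paper", "paper", "Ofen"]
sorted_titles = sorted(titles, key=sort_key)
# → ["Äpfel", "apple", "Öl", "Ofen", "Paper", "paper", "Zebra"]
```

Because Python's sort is stable, "Paper" and "paper" keep their input order; they compare as equal under this key.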
Content Model
- Content Models, in progress (a probably not so successful attempt to make it clearer; diagrams for better readability and an extra revision of the terminology are still lacking --Natasa 18:33, 13 December 2007 (CET))
Item handler
- extend filters to "last-modified-since", "context", "related-to" (not only with id of the item, but also with a set of relation types)
- separation Filter/Search - Frank
- to clarify this topic better: check whether the current filter methods can be made available via the standard search interface, i.e. whether separate indexes for all items can enable searching by item properties (or item lists) with limit/offset and an order-by clause --Natasa 18:44, 13 December 2007 (CET)
There should be a description for each new/changed method, its filters and their meanings; see page Item Handler
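To make the proposed filters concrete, here is a minimal in-memory sketch of "last-modified-since", "context" and "related-to" (with a set of relation types), combined with order-by and limit/offset. The item structure, field names and relation type names are assumptions for illustration; they are not the Item handler's actual interface.

```python
from datetime import date


def related(item, related_to, relation_types):
    """True if the item points at related_to via one of the given relation types."""
    if related_to is None:
        return True
    return any(
        target == related_to
        and (relation_types is None or rel_type in relation_types)
        for rel_type, target in item["relations"]
    )


def filter_items(items, last_modified_since=None, context=None,
                 related_to=None, relation_types=None,
                 order_by="last_modified", offset=0, limit=None):
    """Apply the filters, then order-by, then offset/limit."""
    selected = [
        item for item in items
        # Assumption: "since" is inclusive.
        if (last_modified_since is None
            or item["last_modified"] >= last_modified_since)
        and (context is None or item["context"] == context)
        and related(item, related_to, relation_types)
    ]
    selected.sort(key=lambda item: item[order_by])
    end = None if limit is None else offset + limit
    return selected[offset:end]


# Hypothetical sample data.
items = [
    {"id": "item-1", "context": "ctx-a", "last_modified": date(2007, 11, 1),
     "relations": [("isRevisionOf", "item-9")]},
    {"id": "item-2", "context": "ctx-a", "last_modified": date(2007, 12, 10),
     "relations": []},
    {"id": "item-3", "context": "ctx-b", "last_modified": date(2007, 12, 15),
     "relations": [("isPartOf", "item-9")]},
]
recent = filter_items(items, last_modified_since=date(2007, 12, 1))
revisions = filter_items(items, related_to="item-9",
                         relation_types={"isRevisionOf"})
page = filter_items(items, offset=1, limit=1)
```

If the filters were served from a search index instead, the same parameters would map onto a query plus sort and paging options, which is the separation of filter and search raised above.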
Container handler
- define usage of Admin descriptor in Container, see also Admin descriptor of Context
- container member list enrichment, see also Talk:ESciDoc_Services_ContextHandler#Member_lists
- List of members?
New service Ingestion
- posting of named graphs of objects, pushing/pulling functionality?
- see initial requirements and use case specification at Upload file in structured format
- for more advanced ingestion requirements that will come up in future see Named graph posting example
Relations
- Provide content relations together with the item, or (better) standalone via the addContentRelations and removeContentRelations methods.
- create, retrieve, modify, update relation objects, register new relation types
- see concept on page Content relations
Service repository
- Service repository (for all, not only for core services)
eSciDoc Infrastructure Road map
- Release 1.0
- Release 1.1