Talk:PubMan Func Spec History of affiliations

From MPDLMediaWiki
Revision as of 17:27, 9 February 2010 by Natasab (talk | contribs)

from the definition it seems as if the type of change can readily be inferred from the actual untyped successor/predecessor relations - so it seems redundant.--Robert 10:06, 27 March 2009 (UTC)

Do you really think that can be inferred? That would be great! There seem to be the following types of predecessor: replacement, fusion, spin-off, splitting. Frank 09:17, 31 March 2009 (UTC)

it sure sounds like it:
   * An OU(1) has one direct successor OU(2) and no longer exists in reality (i.e. is closed). The OU(1) is set to status closed (replacement).
   * An OU is founded as a fusion of two or more other OUs (fusion). The new OU has multiple predecessors.
   * A part of an OU(1) is created as a spin-off. The OU(1) continues to exist (i.e. is not closed) alongside the new OU(2) (spin-off).
   * An OU(1) is split up into multiple new OUs (OU(3), OU(4), …) and therefore has multiple successors. OU(1) no longer exists in reality and is therefore set to status ‘closed’ (splitting).

of course the state to infer the type of change from would have to be timestamped, and e.g. the closing for a replacement would have to happen at the same time as, or before, creating the replacement. but i still think inference would be better than risking data getting out of sync because the user inserts a type of change that does not match the actual state.--Robert 09:23, 31 March 2009 (UTC)
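The inference described above can be sketched as follows. This is only an illustration, not existing infrastructure code: the `OU` class, `link` and `infer_change_type` are hypothetical names, and the order in which the four cases are checked is an assumption (the definitions do not say which case wins when several apply).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality; the relation graph is cyclic
class OU:
    """Hypothetical in-memory stand-in for an organizational unit."""
    name: str
    closed: bool = False
    predecessors: list[OU] = field(default_factory=list)
    successors: list[OU] = field(default_factory=list)


def link(predecessor: OU, successor: OU) -> None:
    """Record an untyped predecessor/successor relation in both directions."""
    predecessor.successors.append(successor)
    successor.predecessors.append(predecessor)


def infer_change_type(predecessor: OU, successor: OU) -> str:
    """Derive the type of change from the relation counts and the status."""
    if successor not in predecessor.successors:
        raise ValueError("the two OUs are not related")
    if len(successor.predecessors) > 1:
        return "fusion"        # new OU has multiple predecessors
    if not predecessor.closed:
        return "spin-off"      # predecessor still exists beside the new OU
    if len(predecessor.successors) > 1:
        return "splitting"     # closed predecessor with multiple successors
    return "replacement"       # closed predecessor with one direct successor


ou1, ou2 = OU("OU(1)", closed=True), OU("OU(2)")
link(ou1, ou2)
print(infer_change_type(ou1, ou2))  # replacement
```

The sketch reads the current status only; it does not model the timestamps mentioned above.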

i would even say that if the type of change cannot be inferred automatically, it should simply be a freetext note or something like that to avoid home-made inconsistency problems.--Robert 09:27, 31 March 2009 (UTC)

--Steffen 11:27, 31 March 2009 (UTC) Let me try to finish the defined rule set:
   1. Replacement: Only one predecessor has to be related. The predecessor has to be in status closed.
   2. Fusion: Two or more predecessors have to be related. Each predecessor has to be in status closed.
   3. Spin-off: One or more predecessors have to be related. Each predecessor has to be in status open.
   4. Splitting: Only one predecessor has to be related. The predecessor has to be in status closed.
   5. Affiliated: One predecessor has to be related. The predecessor has to be in status closed.
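A minimal sketch of how this rule set could be checked when a user declares a typed relation. The table layout and the function name `check_declared_type` are hypothetical, and reading "one predecessor" in rule 5 as "exactly one" is an assumption.

```python
# Rule table: declared type -> (min predecessors, max predecessors, required status).
# A max of None means "no upper bound". Values follow the five rules listed above.
RULES = {
    "replacement": (1, 1, "closed"),
    "fusion": (2, None, "closed"),
    "spin-off": (1, None, "open"),
    "splitting": (1, 1, "closed"),
    "affiliated": (1, 1, "closed"),  # assumption: exactly one predecessor
}


def check_declared_type(declared: str, predecessor_statuses: list[str]) -> list[str]:
    """Return a list of rule violations; an empty list means the type is consistent."""
    min_n, max_n, required = RULES[declared]
    errors = []
    n = len(predecessor_statuses)
    if n < min_n or (max_n is not None and n > max_n):
        errors.append(f"{declared}: unexpected number of predecessors ({n})")
    for index, status in enumerate(predecessor_statuses):
        if status != required:
            errors.append(
                f"{declared}: predecessor {index} is '{status}', expected '{required}'"
            )
    return errors


print(check_declared_type("fusion", ["closed", "open"]))
```

Note that rules 1, 4 and 5 impose the same condition on the predecessor side, so this check alone cannot tell them apart.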

If we include the reverse direction in the inference method, then it is possible to derive the type of predecessor from the relation. But this requires a fixed workflow in which the predecessor OU is in the right status before the successor is set. If a relation can be set “at any time”, then it is possible to create predecessor relations that can be misinterpreted. For example, OU(2) is set as successor of OU(1), but OU(1) is not closed yet. Until OU(1) is closed, OU(2) is a spin-off, but once OU(1) is closed, OU(2) becomes a replacement (for the requirement definition see http://colab.mpdl.mpg.de/mediawiki/PubMan_Func_Spec_Organizational_Unit_Management#Future_Development). And it seems impossible to infer situations where OUs are created through a combination of several of these defined scenarios.
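The ordering problem in the example can be made concrete with a small replay of events. This is only a sketch: the event tuples and the `replay` function are hypothetical, and it covers only the simple case of one predecessor with one successor.

```python
def replay(events):
    """Apply events in order and record the inferred type after each one.

    Events are ("relate", predecessor, successor) or ("close", ou).
    Only single predecessor/successor pairs are handled here.
    """
    closed = set()
    relations = []
    history = []
    for event in events:
        if event[0] == "close":
            closed.add(event[1])
        elif event[0] == "relate":
            relations.append((event[1], event[2]))
        # Inferred type of every relation as seen right after this event.
        history.append({
            (pred, succ): "replacement" if pred in closed else "spin-off"
            for pred, succ in relations
        })
    return history


for snapshot in replay([("relate", "OU(1)", "OU(2)"), ("close", "OU(1)")]):
    print(snapshot)
```

With the relation set first and the closing second, the same relation is reported as a spin-off and then as a replacement, which is the misinterpretation described above.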

If the inconsistency is introduced through the user/solution, then free text wouldn't help to prevent inconsistencies either.

In my understanding we have agreed to have in next release:
   a) typed relations

to avoid inconsistencies, and in future development, i.e. when relation objects are created

   b) dates and some comments that would better explain the change

We cannot expect that users would actually be able to set the right status at the right time; therefore it was agreed to have the possibility to set these relations at any time after the opening of an OU. --Natasa 08:00, 1 April 2009 (UTC)

Enriching the predecessor/successor relation with additional information (like a date or comment) could help to describe the relation. We discussed it in the last video conference and came to the conclusion that these are values which are metadata as well as structural data.

The Organizational Units are currently non-versioned. The date when an organization was founded is not reflected within the structural data of the Organizational Units. It is part of the metadata section. "dates and comments [..] would better explain the change" sounds like values which have to describe the historical process and not only the current status. These values within the properties section would break the abstraction between the administration of OUs in escidoc and the real organization data. Consequently, these values should be stored within a data field that reflects the historical process. The Version History tracks the changes to the object, not the history of the organization itself. That's why the Version History would be the wrong place for this information. That means that the non-versioned concept of the OU would be obsolete.

sure, this is understandable. However, the OU version history had another purpose. It would actually keep track of when an OU has been modified (including metadata, adding a predecessor etc., similar to how the version history is kept for content resources). --Natasa 17:27, 9 February 2010 (UTC)

The question from the last video conference was: What are these additional data used for?

these additional data are part of the OU as a resource in the infrastructure. If we already store the history of affiliations, then it makes sense to produce the complete history, not only the links. In one meeting we discussed that OU successor/predecessor relations are structural relations, but they should actually allow for metadata. To me this seems like the core service should allow these metadata to be set/retrieved in the OU XML but, in the background, create special "internal-OU" relation objects (that could only be retrieved via the OU Handler) that actually contain these metadata. Another possibility would be to simply put an additional (non-external-user-metadata) datastream in Fedora, but serve its content properly within the OU XML (with the predecessors, successors). --Natasa 17:27, 9 February 2010 (UTC)
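To illustrate the first option, here is a rough sketch of a relation object that carries metadata internally and is rendered into the OU XML on retrieval. Everything in it is hypothetical: the class `PredecessorRelation`, the function `render_ou`, and in particular the element and attribute names, which are made up for illustration and are not the eSciDoc schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PredecessorRelation:
    """Hypothetical "internal-OU" relation object holding the extra metadata."""
    predecessor_id: str
    form: str                          # e.g. "fusion", "spin-off"
    effective: Optional[date] = None   # date of the change, if known
    comment: Optional[str] = None      # free-text explanation of the change


def render_ou(ou_id: str, relations: list) -> str:
    """Serve the relation metadata inside the OU representation."""
    root = ET.Element("organizational-unit", {"id": ou_id})
    predecessors = ET.SubElement(root, "predecessors")
    for rel in relations:
        attrs = {"ref": rel.predecessor_id, "form": rel.form}
        if rel.effective is not None:
            attrs["date"] = rel.effective.isoformat()
        element = ET.SubElement(predecessors, "predecessor", attrs)
        if rel.comment:
            element.text = rel.comment
    return ET.tostring(root, encoding="unicode")


# Illustrative values only.
relation = PredecessorRelation("ou:1", "replacement", date(2009, 3, 31), "renamed")
print(render_ou("ou:2", [relation]))
```

The point of the sketch is only the separation: clients see one OU document, while the metadata lives in a separate object behind the handler.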

Which values could have an influence on the processing of the infrastructure?

could easily imagine a query such as: give me all publications for an OU with publication date before date of the merge into another OU (of course in this case application should not expect to have indexed dates of OU merge together with the PubItems, but should be able to find this date and include it into the query for the publication items)--Natasa 17:27, 9 February 2010 (UTC)
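Such a query could look roughly like this on the application side. It is a sketch over plain dictionaries: the field names (`ou`, `published`, `predecessor`, `successor`, `date`) and the function `publications_before_merge` are hypothetical, and the merge date is looked up from the relations first, as described, rather than being indexed with the items.

```python
from datetime import date


def publications_before_merge(publications, relations, ou_id):
    """Return publications of ou_id published before it merged into another OU."""
    # Step 1: find the date of the merge from the predecessor/successor relations.
    merge_dates = [r["date"] for r in relations if r["predecessor"] == ou_id]
    own = [p for p in publications if p["ou"] == ou_id]
    if not merge_dates:
        return own  # the OU was never merged, so there is no cutoff
    cutoff = min(merge_dates)
    # Step 2: include the date in the query for the publication items.
    return [p for p in own if p["published"] < cutoff]


# Illustrative data only.
relations = [{"predecessor": "ou:1", "successor": "ou:9", "date": date(2009, 1, 1)}]
publications = [
    {"id": "item:1", "ou": "ou:1", "published": date(2008, 6, 1)},
    {"id": "item:2", "ou": "ou:1", "published": date(2009, 6, 1)},
]
print([p["id"] for p in publications_before_merge(publications, relations, "ou:1")])
```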

Maybe the additional values should influence search results or release behavior?

there was previously an issue with the search by affiliations: similar to the OU_Path we have in items/containers, the OU_Path of the predecessors could be added to the search index to allow searching for all resources of OU X (including or excluding the predecessors). --Natasa 17:27, 9 February 2010 (UTC)
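A small sketch of how the set of OUs to search could be expanded with predecessors. The mapping `predecessors_of` and the function `ous_to_search` are hypothetical; the real index would work on OU_Path values rather than bare identifiers.

```python
def ous_to_search(ou_id, predecessors_of, include_predecessors=True):
    """Return ou_id plus, optionally, all of its direct and indirect predecessors."""
    if not include_predecessors:
        return [ou_id]
    found = []
    stack = [ou_id]
    while stack:
        current = stack.pop()
        if current in found:
            continue  # already visited; also guards against cyclic data
        found.append(current)
        stack.extend(predecessors_of.get(current, []))
    return found


# Illustrative data: X was formed from A and B, and A itself replaced C.
predecessors_of = {"X": ["A", "B"], "A": ["C"]}
print(sorted(ous_to_search("X", predecessors_of)))  # ['A', 'B', 'C', 'X']
```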

And there is one more point to clarify. If the predecessor/successor relation could be enriched with additional data, wouldn't this be a requirement for the parent/children relations too?

it makes sense, sure; the implementation concept is the same for both --Natasa 17:27, 9 February 2010 (UTC)

This brings me back to the question of whether the current Organizational Unit data structure is powerful and extensible enough to express at least a useful subset of organizational structures. --Steffen 11:15, 9. February 2010

maybe with some small additions, yes :) .. at the moment we are not too deeply aware of where else the OU structure history will be used. Another point here is to allow several repositories to work with a single OU structure (i.e. can an OU Handler be "proxied" to an OU Handler of another repository?) --Natasa 17:27, 9 February 2010 (UTC)