ESciDoc Institutional visibility

Requirement
Access to content of a component should be restricted to users that may
 * retrieve the item and
 * belong to an organizational unit or child-org-unit of a list of Org units that is defined for the component.


 * Question: So System-Administrator will not be able to retrieve content if he is not in the proper orgUnit? -- Mih 06:42, 3 September 2008 (UTC)
 * Is there an answer to that question? If there really is an AND between "retrieve the item" and "belong to an OU ...", it has a huge impact on policy evaluation. The problem is that the "visibility policy" is an additional one that is evaluated independently of the policies bound to a role. As I understand it, until now access is granted if no "role policy" denies access after a previous policy has allowed the action (the case where there is no policy at all should not affect this argument!?). So, if the only policy bound to the role system-admin is "allow all", then the system admin is allowed to access everything (including binary content). With this "visibility role" we introduce an additional policy that is now evaluated for the system admin too and denies access to binary content if the system admin is not a member of a named OU. Frank 15:36, 3 September 2008 (UTC)
 * I would rather not dig too deeply into the system administrator and local administrator roles right now. What matters now is that we have one user, e.g. roland, who is allowed to do any action in the repository. Local administrators are usually people in the institute who take care of creating users in the institute, maybe creating org units and possibly contexts. They are not necessarily people who have a certain visibility policy on the content in those contexts. Therefore we should probably approach their privileges from a completely different set of actions/resources on which they have privileges. (In other words: it would be the policy of the institute to say e.g. "my local admin will be moderator for all PubMan contexts"; then, when the local administrator creates a context, he or she should assign him/herself the moderator policy for that context.) --Natasa 10:53, 8 September 2008 (UTC)
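
To make the AND in the requirement concrete, here is a minimal sketch of the check. It is not the framework's implementation; the org-unit tree `PARENTS`, the ids in it and both function names are assumptions made up for illustration.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical org-unit tree: child id -> parent id (None for a root).
PARENTS: Dict[str, Optional[str]] = {
    "mpg": None,
    "mpi-a": "mpg",
    "dept-a1": "mpi-a",
    "mpi-b": "mpg",
}


def ancestors_and_self(ou_id: str, parents: Dict[str, Optional[str]]) -> Set[str]:
    """Return the org unit itself plus all of its ancestors."""
    seen: Set[str] = set()
    current: Optional[str] = ou_id
    while current is not None and current not in seen:  # guard against cycles
        seen.add(current)
        current = parents.get(current)
    return seen


def may_retrieve_content(may_retrieve_item: bool,
                         user_ous: Iterable[str],
                         allowed_ous: Set[str],
                         parents: Dict[str, Optional[str]] = PARENTS) -> bool:
    """Both conditions must hold: item access AND membership in an
    allowed org unit or in one of its child org units."""
    if not may_retrieve_item:
        return False
    return any(ancestors_and_self(ou, parents) & allowed_ous for ou in user_ous)
```

Under a strict AND, a user without any org unit (for example a system administrator who is not assigned to one) is denied even if he or she may retrieve the item, which is exactly the case raised in the question above.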

The requirement should be extensible so that access can also be restricted to certain user groups or certain IP ranges.


 * Question: How can we evaluate the user group or the ip-address of the user if the framework does not know about it? -- Mih 14:48, 2 September 2008 (UTC)
 * One can not be certain at the moment. The fact is probably that somehow we should support the concept of a user group (or incoming IP address) i.e. core services (either via Shib attributes or via new group concept) should know that a certain user is member of a certain user group. AA should be also aware of a "group" handle in this case? (note: user as member of a group has certain group privileges i.e. to view the content, but the very same user has also privileges on e.g. his pending items at the same time). Not certain yet - todo further research --Natasa 10:53, 8 September 2008 (UTC)


 * Shibboleth 2 supports several authentication mechanisms that can be used in parallel. See also https://spaces.internet2.edu/display/SHIB2/IdPUserAuthn and https://spaces.internet2.edu/display/SHIB2/IdPAuthIP . Is that something of use?
 * Shibboleth 1.x provides the following how-to for IP-based access: https://spaces.internet2.edu/display/SHIB/IPIdPAuthN . In this case, IPs are mapped to e.g. a Guest user who has his own privileges. To be checked whether we can have different guest users for different IP ranges. --Natasa 11:16, 8 September 2008 (UTC)


 * New requirement coming up from Digital collections we have to ingest (not really detailed for now, more info will be provided):
 * Fulltext should also be accessible to non-registered users if they use a special "access-key" (that is to be distributed by the person who has the right to issue such a key, i.e. is somehow owner/editor of the fulltext) --Natasa 10:58, 10 September 2008 (UTC)

Proposal
Mih 12:26, 4 September 2008 (UTC) If the file/locator points to an external URL and not to our internal content, then it will certainly not be possible to apply policies when requesting the content, because the content is located externally!
 * Introduce the possibility to attach XACML-Policies to Objects and to attach Attributes to these ObjectPolicies (e.g. a list of OrgUnit-Ids).
 * Don't store the XACML-Policies + Attributes within the Object but outside in a database.
 * One Database-Table that stores all possible ObjectPolicies (eg OrgUnitContentRestrictionPolicy)
 * One Database-Table that brings together the object and the policy.
 * Fields:
 * objectId
 * policyId (reference to Policies-DB-Table)
 * list of Attributes
 * Mark certain Methods (eg retrieveContent) as Method where ObjectPolicies have to get evaluated.
 * for files / locators: object policies should be certainly evaluated when retrieving the content
 * To my understanding, eSciDoc supports the following types of storage: internal-managed, external-url and external-managed. With this in mind, object policies are applicable to internal-managed and external-managed storage, but not to external-url. That is also how we understood it, as all our locators have "public" visibility at present. We cannot control access to external-url content (our locators). --Natasa 08:33, 5 September 2008 (UTC)
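
The two proposed tables could look roughly like the following sketch. It uses an in-memory SQLite database only so that it is runnable; the table names, column names, the `<Policy/>` placeholder, the object id and the helper `policies_for` are assumptions, not the framework's schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- All possible ObjectPolicies, e.g. OrgUnitContentRestrictionPolicy.
CREATE TABLE object_policy (
    policy_id TEXT PRIMARY KEY,
    xacml     TEXT NOT NULL          -- the XACML policy document
);
-- Brings together the object and the policy.
CREATE TABLE object_policy_binding (
    object_id  TEXT NOT NULL,
    policy_id  TEXT NOT NULL REFERENCES object_policy(policy_id),
    attributes TEXT NOT NULL,        -- JSON list, e.g. org-unit ids
    PRIMARY KEY (object_id, policy_id)
);
""")

# Placeholder policy text and a made-up component id, for illustration only.
conn.execute("INSERT INTO object_policy VALUES (?, ?)",
             ("OrgUnitContentRestrictionPolicy", "<Policy/>"))
conn.execute("INSERT INTO object_policy_binding VALUES (?, ?, ?)",
             ("escidoc:component-1", "OrgUnitContentRestrictionPolicy",
              json.dumps(["ou-1", "ou-2"])))


def policies_for(object_id):
    """Return (policy_id, attributes) pairs attached to one object."""
    rows = conn.execute(
        "SELECT policy_id, attributes FROM object_policy_binding "
        "WHERE object_id = ? ORDER BY policy_id",
        (object_id,),
    ).fetchall()
    return [(policy_id, json.loads(attrs)) for policy_id, attrs in rows]
```

A lookup like `policies_for(...)` is what a method marked for object-policy evaluation (e.g. retrieveContent) would need before evaluating anything; an object without rows simply has no object policy.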

We should have information on component-level visibility provided with the component-level properties/metadata. If the visibility is a "policy", then in addition the id of the policy that needs to be evaluated (i.e. this information can be provided in the retrieveItem and retrieveComponent methods) --Natasa 10:50, 29 August 2008 (UTC) Mih 13:19, 4 September 2008 (UTC)
 * We do not really have a visibility for the content that we could set to any value. We will have policies for 'private' access, 'public' access and 'ou-access'. Later on there will be more policies, e.g. 'user group'. These policies can be attached to the object and will be evaluated when certain methods (retrieve-content) are called. The names of the methods for which the policy has to be evaluated are defined within the policy. This means that we can attach any policy to any object. We should not have a reference to the policies in the object-xml. Maybe later on we want to have an object-policy that gets evaluated when calling retrieveItem or retrieveProperties. If someone wants to know the policies that are attached to a specific object, there will be a new handler-method in the AA-Component that can deliver all object-policies for one object. Mih 11:36, 1 September 2008 (UTC)
 * Add new Handler-Methods to the AA-Component that enable creating, updating, deleting and retrieving ObjectPolicies + Attributes for one Object.
 * in addition evaluating the policy? --Natasa 10:50, 29 August 2008 (UTC)
 * Add new Handler-Methods to the AA-Component that enable defining one or more OUs for one user. The OUs the user belongs to are stored in the database and are used during evaluation of the orgUnit-Object-Policy.


 * Evaluate these Policies in addition to the RolePolicies the user has. If the RolePolicies return a Permit and the ObjectPolicies return a Permit, then the user is allowed to access the Method (e.g. retrieveContent). Vice versa: if one of the Policies returns a Deny, then the user is not allowed to access the Method.


 * If no object-policy is attached to the object, only the role-policies are evaluated.
 * to clarify this better: a role policy for context is evaluated on item level, in addition, each item may have object-level policy for each file. So users do not have to have different object-level policies for items+object-level policies for files. The case is usually, role-level policy on ctx+object-level policy on file (optional, in case visibility is "policy")--Natasa 10:50, 29 August 2008 (UTC)
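
The combination rule described above (Permit only if the role policies and all attached object policies permit; role policies alone if no object policy is attached) can be sketched as follows. The `Decision` enum and the function `combine` are illustrative names, not part of the AA component.

```python
from enum import Enum
from typing import Iterable


class Decision(Enum):
    PERMIT = "Permit"
    DENY = "Deny"


def combine(role_decision: Decision,
            object_decisions: Iterable[Decision]) -> Decision:
    """Deny as soon as any policy denies; otherwise permit.

    An empty object_decisions means no object policy is attached,
    so only the role policies decide.
    """
    if role_decision is Decision.DENY:
        return Decision.DENY
    if any(d is Decision.DENY for d in object_decisions):
        return Decision.DENY
    return Decision.PERMIT
```

In XACML terms this behaves like a deny-overrides combination of the two policy sets; whether the framework would actually express it that way is an open assumption.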

Mih 12:33, 4 September 2008 (UTC) We decided to additionally have the visibility-element within the component-properties as before. Values of the element can be: 'public', 'private' or the name of the policy used (eg 'orgUnit'). It will not be possible to keep the policy-information attached to each item-version. If the policy changes then the new policy will also get evaluated when requesting older versions of the component!
 * Element visibility in component-properties is not used anymore
 * element visibility may be used as a short-cut to tell: it's public, private or a policy should be evaluated (so that system does not go always to the XACML policy store)--Natasa 10:50, 29 August 2008 (UTC)
 * public and private also will be policies. Instead of parsing the item-xml to retrieve the content of the element visibility, we now have to go to the database and retrieve the policies for the object and then evaluate them. Database-Access hopefully will not be less performant than parsing the xml.--Mih 11:36, 1 September 2008 (UTC)

Examples of fulltext visibility at present on eDoc

 * Metadata public + full text public ("Public")
 * Metadata public + full text for the MPG ("MPG wide access")
 * Metadata public + full text for the MPI ("Institute level access": this restricts access to the entire MPI. Any user who is either registered for the MPI (recognized via login) or logs in from a computer of the institute (recognized via IP address) has access)
 * Metadata public + full text for an internal group of users ("internal access": this restricts access to an internal group of users: local eDoc Manager, Collection Authority, Collection Moderator and Depositor/Owner of the document)
 * Metadata public + full text only for privileged users ("Privileged users access": this restricts access to a privileged user group: local eDoc Manager, Collection Authority, Collection Moderator, users with Privileged View for this collection and Depositor/Owner of the document)

Categorization of access rules for eSciDoc items and its components

 * Work in progress

Note: I struggle with the heterogeneous terminology used in this section, e.g.
 * file versus content versus component - I like "component"
 * (file-)access rules versus file/component visibility versus content access level - I like "component visibility"
 * content is private versus component visibility is set to private or file-access rule is private - I like "component visibility is set to xxx"
 * Any chance to harmonize it? --Inga 12:59, 12 September 2008 (UTC)
 * We do have a harmonization problem here. In our functional specifications on COlab, we use the term "file". In the core services terminology, this term actually covers two concepts: the component (the object that depicts the "file", i.e. id+properties+metadata) and the content (the actual content of the file). I.e. in our functional specifications and mappings to the core services it is file=component+content. (The exception is locators, where locator=component, as content does not exist.)

In any case, I would be happy to try to synchronize the terms. Here is the state at present:


 * Access rules for component=access rules for content.
 * Item.Handler has two methods (retrieveComponent, retrieveContent)
 * In the internal functional specification, we use "file" (which is closest to "component")

So, based on Inga's proposal above, I would use 2 terms: component and content. --Natasa 07:54, 16 September 2008 (UTC)

Assumptions

 * Access rules for item and its components depend on the status of the corresponding item version in the repository workflow:
 * Pending/Submitted item versions: metadata and components are only visible to those users involved in the workflow
 * Released/Withdrawn item versions:
 * metadata are not subject to access control (always public)
 * component/content access rules depend on policy/visibility of the component stated by the responsible users


 * Withdrawn, Released -> For retrieval of component (and its content) there is no difference if the item status is "Withdrawn" or "Released" i.e.:
 * a) user has rights to retrieve the component (and its content) of item I1 that has status "Released"
 * b) Item I1 has been "Withdrawn" (no change of access rights happened for the component (and its content))
 * c) user has rights to retrieve the component (and its content) of item I1 that has status "Withdrawn" in the same manner as for case a)


 * Note: this means that we do not hide the component (content) of an item when it is Withdrawn unless the access rights for it are changed. (Because the last version may still have status "Released" -> the whole item gets the status "Withdrawn"; we no longer change the status of the last version when the item is withdrawn.)
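
The assumptions above can be summarized in a small sketch. The status strings and the function names are assumptions for illustration; `in_workflow` stands for "the user is involved in the workflow" and `component_policy_permits` for the outcome of whatever policy/visibility is set on the component.

```python
WORKFLOW_STATES = {"pending", "submitted"}
PUBLIC_STATES = {"released", "withdrawn"}


def _check_status(status: str) -> None:
    if status not in WORKFLOW_STATES | PUBLIC_STATES:
        raise ValueError(f"unknown item status: {status!r}")


def may_view_metadata(status: str, in_workflow: bool) -> bool:
    """Metadata of released/withdrawn versions is always public."""
    _check_status(status)
    if status in WORKFLOW_STATES:
        return in_workflow
    return True


def may_view_component(status: str, in_workflow: bool,
                       component_policy_permits: bool) -> bool:
    """For released and withdrawn items only the component policy counts,
    so withdrawing an item does not change access to its components."""
    _check_status(status)
    if status in WORKFLOW_STATES:
        return in_workflow
    return component_policy_permits
```

The point of cases a) to c) is visible in `may_view_component`: "released" and "withdrawn" take the same branch, so the result can only change if the component's access rights change.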


 * I haven't got this point --Inga 13:01, 12 September 2008 (UTC)
 * Changed the example and text, hopefully it's clearer now --Natasa 08:01, 16 September 2008 (UTC)

PubMan

 * if content is private - can it be viewed by QA persons, to ensure data quality?


 * should not be the case imho. QA by librarians can be done only if they have access to files, i.e. at least org unit level access assigned--Ulla 22:37, 15 September 2008 (UTC)
 * The QA roles (Moderator) (but also RC roles in future) are assigned on Context level (e.g. PubMan collection) and not on OU level. --Natasa 08:09, 16 September 2008 (UTC)


 * if content is private - can it be viewed by RC persons, to be able to set up an embargo period, ensure correctness of the policy and possibly change the access level?

same as above imho--Ulla 22:37, 15 September 2008 (UTC)
 * RC/QA persons can change the content access level to a higher one (in case the content depositor was not certain about it)


 * why not keep it simple...changing access levels can be treated same as changing typos in metadata. allowed to anyone who is allowed to edit metadata. would not go for any restrictions, because in that case we have to think also on restricting the modification of metadata, eg. abstract, keywords etc.--Ulla 22:37, 15 September 2008 (UTC)
 * Access level for the component(and its content) can not be considered as metadata of item. Is here meant the metadata of the component? --Natasa 08:09, 16 September 2008 (UTC)


 * A scientist may send a link to his/her private content to access (collaboration) (key-based access)

whatever key-based access means, it sounds good to allow access to certain, named users. basic idea of "private collections" in the context of the linguistic literature project, to my understanding.--Ulla 22:37, 15 September 2008 (UTC)
 * This scenario came up from the Digital collections and is not for "named" users -> it is for unknown users. --Natasa 08:09, 16 September 2008 (UTC)


 * A scientist may give access rights to his collaboration group (alternative to key-based access or separate scenario?)

what is difference between collaboration group and group of named users? --Ulla 22:37, 15 September 2008 (UTC)
 * I would also think here there is no difference --Natasa 08:09, 16 September 2008 (UTC)


 * Admin user may give "privileged-view-role" access rights to a user on context level or container level (e.g. Faces)

FACES

 * Owner may give access and modification rights for private (pending/submitted) albums (containers) to known users (UC sharing an album --> Faces R 3.5) --Kristina 07:14, 11 September 2008 (UTC)
 * Ok, this is something different, as this is Container level access (not component-level access). But here as well a question: if the owner has rights to update the members of the "album" shall this be also allowed for these known users? --Natasa 08:09, 16 September 2008 (UTC)
 * Yes. The idea is that these known users are not allowed to change the metadata of the container but only to add and remove items from the container. Further on, the right to delete or release the container stays only with the owner. --Kristina 08:35, 27 November 2008 (UTC)

Technical implementation considerations

 * performance - not critical for single access request to the content, critical in cases of e.g. thumbnails (e.g. Faces)
 * if access rights are "public" - no extra policy should be checked
 * is visibility the right term? (check-policy-required may actually be the real meaning of current visibility property of the file)

Access to Object can get restricted to groups of users

 * each component-content can have access-rights for one or more groups.
 * a group contains a list of users.
 * the users that belong to the group are retrieved by group-rules:
 * rule could be: all users from org-units 1,2 and 3
 * rule could be: users with id 1,2 and 3
 * within the policy of a role it can be defined if group-access-rights should be evaluated when calling method retrieve-content.
 * when calling the method retrieve-content it is additionally checked if the user may retrieve the item!!!
 * eg user has role author. when calling method retrieve-content it is evaluated if (s)he has the right to retrieve the item and if (s)he belongs to the group(s) that have rights to access the component.
 * eg user has role moderator. when calling method retrieve-content it is only evaluated if (s)he has the right to retrieve the item. No evaluation of groups is done.
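
A minimal sketch of this group model, assuming that a group is defined by an org-unit rule and/or a user-id rule and that each role carries a flag saying whether group-access-rights are evaluated for retrieve-content. The class names, `ROLE_CHECKS_GROUPS` and `group_access_permits` are made up for illustration; the "no groups attached" branch follows the use cases further below.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class User:
    user_id: str
    org_units: FrozenSet[str] = frozenset()


@dataclass
class Group:
    ou_rule: Set[str] = field(default_factory=set)    # "all users from org-units ..."
    user_rule: Set[str] = field(default_factory=set)  # "users with id ..."

    def contains(self, user: User) -> bool:
        return user.user_id in self.user_rule or bool(user.org_units & self.ou_rule)


# Assumed flags: does the role's policy evaluate group-access-rights?
ROLE_CHECKS_GROUPS: Dict[str, bool] = {"author": True, "moderator": False}


def group_access_permits(user: User, role: str, may_retrieve_item: bool,
                         component_groups: List[Group]) -> bool:
    if not may_retrieve_item:                   # always checked first
        return False
    if not ROLE_CHECKS_GROUPS.get(role, True):  # e.g. moderator: no group check
        return True
    if not component_groups:                    # nothing attached: item access suffices
        return True
    return any(group.contains(user) for group in component_groups)
```

Attaching a group without any rule (and hence without users) to a component makes the content effectively private for every role that evaluates groups.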

Configuration

 * We need a group-handler that allows:
 * create, delete, update, retrieve group
 * attach, detach component to a group
 * The second point has nothing to do with the group handler, but imho is about editing of a policy? Maybe I got it wrong?--Natasa 10:25, 19 September 2008 (UTC)
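
For discussion, a sketch of what such a handler could offer, kept in memory only. The class and method names are assumptions; attach/detach are included here simply because the list above names them, and whether they belong to the group handler or to policy editing is the open question raised above.

```python
from typing import Dict, Iterable, Set


class InMemoryGroupHandler:
    """Illustrative only: groups are plain sets of user ids."""

    def __init__(self) -> None:
        self._groups: Dict[str, Set[str]] = {}    # group id -> user ids
        self._attached: Dict[str, Set[str]] = {}  # component id -> group ids

    def create(self, group_id: str, user_ids: Iterable[str]) -> None:
        if group_id in self._groups:
            raise ValueError(f"group {group_id!r} already exists")
        self._groups[group_id] = set(user_ids)

    def retrieve(self, group_id: str) -> Set[str]:
        return set(self._groups[group_id])        # KeyError if unknown

    def update(self, group_id: str, user_ids: Iterable[str]) -> None:
        if group_id not in self._groups:
            raise KeyError(group_id)
        self._groups[group_id] = set(user_ids)

    def delete(self, group_id: str) -> None:
        del self._groups[group_id]                # KeyError if unknown
        for group_ids in self._attached.values():
            group_ids.discard(group_id)           # drop dangling attachments

    def attach(self, component_id: str, group_id: str) -> None:
        if group_id not in self._groups:
            raise KeyError(group_id)
        self._attached.setdefault(component_id, set()).add(group_id)

    def detach(self, component_id: str, group_id: str) -> None:
        self._attached.get(component_id, set()).discard(group_id)

    def groups_of(self, component_id: str) -> Set[str]:
        return set(self._attached.get(component_id, set()))
```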

Use Cases

 * item is not in status released/withdrawn
 * no groups are attached to the component, everybody who may retrieve the item may also retrieve the content.
 * item is in status released/withdrawn, content is private - it can be viewed from QA/RC persons
 * attach group with no users to component.
 * QA/RC-role-policies don't check group-access-rights, so they can access the content if they may retrieve the item.
 * All other policies check groups, so access gets denied because the user doesn't belong to the group.
 * item is in status released, only persons that belong to specific ous may access the content.
 * define group for people that belong to the specific ous.
 * users that belong to one of the orgUnits may access the content. Additionally, all users with a role that doesn't evaluate groups have access-rights (e.g. the depositor of the item or the moderator of the context).
 * A scientist may give access rights to his collaboration group
 * define group for the users of the collaboration-group
 * these users then may access the content (note: only if they may retrieve the item!).
 * Admin user may give "privileged-view-role" access rights to a user on context level or container level
 * this is not done by group-access-rights but by granting the user the role privileged-viewer for the context.

Questions

 * key-based-access
 * I think this is a separate discussion and probably does not fit into the concept. We would first have to discuss the concrete functionality (key valid only once? valid only for a certain user? valid only for a certain time? ...).
 * Owner may give access and modification rights for private (pending/submitted) albums (containers) to known users.
 * This is about granting rights for the whole item. We are talking about access-rights to content (files).
 * which are the roles for QA/RC? For these roles we must not do group-access-checking.