ESciDoc Administrative Search


Introduction

Discussion

Input FIZ

What are the requirements for the administrative Search, using Lucene?

Currently we use the filters for admin-search, accessing the db-cache that contains all fedora-objects.

  • The db-cache provides search-capabilities for all properties and all metadata of an object.
  • Only the last version of an object is searchable.
  • It is possible to sort by all searchable fields.
  • Additionally, it is possible to apply the special filter criteria "user" and "role", which filter the list of retrieved objects by the access rights of the given user with the given role.
  • If no "user" and "role" filter is applied, the list of retrieved objects is filtered by the access rights of the current user with all of their granted roles.

  • The new administrative search will use Lucene as the underlying search database.
  • The Lucene administrative indexes will contain additional fields for access-rights filtering. Access rights are filtered during search by expanding the search query with the access-rights filter.
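The query expansion described above can be sketched as follows. This is only an illustration under stated assumptions: the role names, the index field names (`created_by`, `public_status`) and the per-role clauses are hypothetical and not taken from the eSciDoc index definition. The sketch shows the two cases from the list: an explicit "user" and "role" filter, or the current user with all granted roles.

```python
from typing import Callable, Dict, Iterable, Optional


def quote(value: str) -> str:
    """Quote a value as a Lucene phrase, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# Hypothetical mapping: role name -> function building the Lucene clause that
# restricts results to what a user holding that role may see.
# Role names and field names are invented for this sketch.
ROLE_CLAUSES: Dict[str, Callable[[str], str]] = {
    "depositor": lambda user: f"created_by:{quote(user)}",
    "moderator": lambda user: "public_status:submitted",
    "default": lambda user: "public_status:released",
}


def access_filter(user: str, roles: Iterable[str]) -> str:
    """OR the clauses of all known roles, so that any role may grant access."""
    clauses = [ROLE_CLAUSES[role](user) for role in roles if role in ROLE_CLAUSES]
    if not clauses:
        raise ValueError("no known role: refusing to build an unrestricted query")
    return " OR ".join(f"({clause})" for clause in clauses)


def expand_query(
    user_query: str,
    current_user: str,
    granted_roles: Iterable[str],
    filter_user: Optional[str] = None,
    filter_role: Optional[str] = None,
) -> str:
    """Append the access-rights filter to the user's search query."""
    if filter_user is not None and filter_role is not None:
        # Explicit "user" and "role" filter: that user with that single role.
        user, roles = filter_user, [filter_role]
    else:
        # Default: the current user with all of their granted roles.
        user, roles = current_user, list(granted_roles)
    return f"({user_query}) AND ({access_filter(user, roles)})"
```

Calling `expand_query("title:report", "alice", ["depositor", "default"])` wraps the original query and ANDs it with the combined role clauses, so the access check happens inside the search itself instead of as a post-filter on the result list.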

  • Are the requirements stated above also the requirements for the administrative search using a lucene-index?
  • Are there additional requirements?
    • search in fulltext?
    • search older versions?
    • custom search-result schemas?
  • Index design (one lucene-index containing all object-types or one lucene-index per object-type)
  • Indexing Performance (requirement: synchronous indexing)
    • reindexing of complete trees when members are added/removed from container
  • Proposal:
    • Only use lucene administrative index for fedora-objects (items, containers, contexts, org-units, content-relations, content-models).
    • Keep the old filter methods for objects in the internal database (user-accounts, user-groups, grants, roles, statistics).
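The proposed split can be sketched as a simple dispatch on resource type. The type strings follow the names used in the proposal; the `Backend` enum and the `backend_for` function are hypothetical names introduced only for this illustration.

```python
from enum import Enum


class Backend(Enum):
    """Where an administrative query for a resource type is answered."""
    LUCENE_ADMIN_INDEX = "lucene-admin-index"
    DB_FILTER = "db-filter"


# Fedora objects: proposed to be served by the Lucene administrative index.
FEDORA_OBJECT_TYPES = frozenset({
    "item", "container", "context",
    "org-unit", "content-relation", "content-model",
})

# Objects in the internal database: proposed to keep the old filter methods.
INTERNAL_DB_TYPES = frozenset({
    "user-account", "user-group", "grant", "role", "statistics",
})


def backend_for(resource_type: str) -> Backend:
    """Pick the backend for a resource type according to the proposal."""
    if resource_type in FEDORA_OBJECT_TYPES:
        return Backend.LUCENE_ADMIN_INDEX
    if resource_type in INTERNAL_DB_TYPES:
        return Backend.DB_FILTER
    raise ValueError(f"unknown resource type: {resource_type!r}")
```

Raising on unknown types, rather than silently falling back to one backend, is a design choice of this sketch: a new resource type would then force an explicit decision about where it is searched.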

Input MPDL

On functionalities:

  • DB Cache allows filtering by exact value.
  • Administrative search should allow searching in the same way as the regular search (wildcards also supported).
  • Additional requirements
    • fulltext searching:
      • administrative search shall also allow searching in fulltexts depending on user privileges
        • if this is solved in the administrative search, it would be good to understand the implications for extending the normal search with respect to privileges on fulltexts
    • custom search-results schemas: not clear
  • searching older versions
    • so far we did not have any special requirement to search for older versions of a resource
    • proposal: stick with this rule; however, the admin search must be able to search through both the latest versions and the latest releases
  • index design: the implications are not yet certain; first impressions:
    • items/containers: single index;
    • OUs, contexts, content-models: separate indexes;
    • content-relations: start with a separate index and see. We have no solution development based on content relations at the moment, so we need to check what the real scenarios are (e.g. get me all resources created after 2010 and tagged as "my publications"). This kind of query would require a more complex indexing strategy.
  • reindexing of complete trees when members are added to or removed from a container: it is not clear why this is needed; more explanation would help here
    • related: see collaborator-role descriptions at JIRA
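The first impressions on index design above can be written down as a mapping from resource type to index. This is a sketch only: the index names are invented for illustration, and the layout itself is still an open question in this discussion.

```python
from typing import Dict, Iterable, List

# Hypothetical index names; the grouping follows the first impressions above:
# one shared index for items and containers, separate indexes for the rest.
INDEX_FOR_TYPE: Dict[str, str] = {
    "item": "item_container_admin",
    "container": "item_container_admin",
    "org-unit": "ou_admin",
    "context": "context_admin",
    "content-model": "content_model_admin",
    "content-relation": "content_relation_admin",
}


def indexes_for(resource_types: Iterable[str]) -> List[str]:
    """Return the distinct indexes a search over these types has to query."""
    result: List[str] = []
    for resource_type in resource_types:
        index = INDEX_FOR_TYPE[resource_type]  # KeyError for unknown types
        if index not in result:
            result.append(index)
    return result
```

With this layout a search over items and containers touches a single index, while a query that combines, for example, items with content-relations has to consult two indexes and merge the results. That is the extra complexity the content-relations scenario above hints at.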

Requirements

  • search all statuses