Difference between revisions of "Talk:Control of Named Entities"

(→‎Potential future projects: removed potential future projects - already in progress)
== Potential future projects ==
'''Working group on authority files'''
*build a working group on authority files (out of PubMan pilot group and other interested Max Planck Institutes). Possible tasks:
**sample creation of controlled entries of MPG-related authors (maybe of one institute) according to standard guidelines and in Library of Congress Authority File format.
'''Further development of CoLab page'''
*to clarify terminology:
**create a general overview page on ''controlled vocabulary'' (introductory text about the different areas in the domain of controlled vocabularies, including thesauri, classifications, subject headings, ontologies, etc.)
**create the following subpages:
***rename the current page on ''ControlledVocab'' to ''control of named entities'' - done--[[User:Sabine|Sabine]] 09:53, 17 April 2008 (CEST)
***(technical) service currently developed might be called something like ''service for control of named entities'' - done--[[User:Sabine|Sabine]] 11:20, 16 April 2008 (CEST)
== History ==
Under this heading, bits and pieces will be collected that have arisen during discussions etc. and that should be kept for the sake of completeness.

Revision as of 17:07, 24 August 2009

History

Under this heading, bits and pieces will be collected that have arisen during discussions etc. and that should be kept for the sake of completeness.

Naming

There was a discussion about which term we should use instead of authority files/authority records/etc. On 30 November it was agreed to use the term "control of named entities / controlled named entities". During the discussion the following alternative terms were proposed:

  • normalizing metadata/data entries
  • managing controlled vocabularies
  • harmonizing metadata/data entries
  • controlling metadata entries
  • terminology management
  • reference information service (can be split into: reference person service, reference affiliation service, reference journal service, etc.)
  • (proposal) master data management is another term that could be considered (though it does not have exactly the same meaning as in ERP and CRM systems)
  • (proposal) metadata value domains/metadata domain value
  • controlled metadata values
  • see also ISAAR(CPF): http://www.ica.org/en/node/30230

If I'm not mistaken, CDS Invenio (the software of the CERN Document Server) calls this concept a knowledge base. It is also worth mentioning how it functions: no normalization of data is performed on input, i.e. the data in the database will always be what the metadata editor inserted. Knowledge bases only come into play when outputting data. In this case, output formatting templates can associate certain fields with knowledge bases and thus force normalization of the data. This concept stems from a requirement that should be familiar from eDoc: scientists want to be able to get their data out exactly as it was inserted, e.g. author names in all caps. Obviously this approach has its own share of problems. Basically, all methods that inspect the data (searching, duplicate detection, etc.) must take knowledge bases into account, or they will only work in idiosyncratic ways.
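To make the trade-off concrete, here is a minimal Python sketch of the output-time normalization idea described above. It is not CDS Invenio code and does not use its API; the knowledge base `AUTHOR_KB`, the sample records, and all function names are hypothetical and only serve as an illustration.

```python
# Hypothetical knowledge base: maps stored variants to a preferred form.
# (Illustrative data only; not taken from CDS Invenio.)
AUTHOR_KB = {
    "MUELLER, A.": "Müller, A.",
    "Mueller, A.": "Müller, A.",
}

# Records keep exactly what the metadata editor entered; nothing is
# normalized on input.
records = [
    {"id": 1, "author": "MUELLER, A."},
    {"id": 2, "author": "Mueller, A."},
    {"id": 3, "author": "Smith, J."},
]


def normalize(value, kb):
    """Return the preferred form, or the value unchanged if the KB has no entry."""
    return kb.get(value, value)


def format_record(record, kb=None):
    """Format a record for output; the KB is applied only if one is given."""
    author = record["author"]
    if kb is not None:
        author = normalize(author, kb)
    return f"{record['id']}: {author}"


def search_raw(records, query):
    """Naive search on the stored values; it misses spelling variants."""
    return [r["id"] for r in records if r["author"] == query]


def search_with_kb(records, query, kb):
    """Search that compares normalized forms of both query and stored value."""
    target = normalize(query, kb)
    return [r["id"] for r in records if normalize(r["author"], kb) == target]
```

In this sketch, `format_record(records[0])` returns the author exactly as entered, while `format_record(records[0], AUTHOR_KB)` returns the preferred form. `search_raw` only finds exact matches on the stored strings, whereas `search_with_kb` finds both variants, which illustrates why searching and duplicate detection would have to consult the knowledge base.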

Remark from Traugott: CDS Invenio is based on a different usage and business model, and therefore its features are not really applicable to our scope.