Difference between revisions of "PubMan Indexing"

From MPDLMediaWiki

Revision as of 10:11, 14 February 2008

The latest version of each item in state "released" has to be indexed with the following indexes:

Any Field

  • All elements defined in the functional metadata set specification (see [SSESDMD]) are indexed for the search.
  • In addition, the following attributes are indexed: <Id of item>, <PID of item>, <PID of file>
  • Files of any type are indexed, except those of type "correspondence" and "copyright transfer agreement".

Special requirements for the "any-field" index

In the "any-field" index, each identifier should be indexed as two separate tokens: the identifier type and the identifier value.

For example:

  • ISSN 2323-123123
  • ISBN 2323-23184738
  • URI http://mpdl.mpg.de/123

Users should be able to search for, e.g., ISSN, ISBN* or ISBN 1234*.
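
The two-token requirement can be illustrated with a minimal Python sketch. It is not the PubMan implementation: the function names `any_field_tokens` and `matches`, the lower-casing, and the AND semantics for multi-term queries are assumptions made for illustration only.

```python
import fnmatch

def any_field_tokens(identifiers):
    """Emit two tokens per identifier: its type and its value (lower-cased)."""
    tokens = []
    for id_type, id_value in identifiers:
        tokens.append(id_type.lower())
        tokens.append(id_value.lower())
    return tokens

def matches(query, tokens):
    """True if every whitespace-separated query term matches at least one token.

    A '*' in a term is treated as a wildcard (assumed AND semantics).
    """
    return all(
        any(fnmatch.fnmatchcase(token, term.lower()) for token in tokens)
        for term in query.split()
    )

# Tokens for the three example identifiers listed above.
tokens = any_field_tokens([
    ("ISSN", "2323-123123"),
    ("ISBN", "2323-23184738"),
    ("URI", "http://mpdl.mpg.de/123"),
])
```

Because type and value are independent tokens, a query such as `ISBN 2323*` in this sketch also matches an item whose ISSN value (not its ISBN value) starts with 2323. That is acceptable for a broad "any-field" search; type-bound matching is a separate concern.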

Genre

  • Publication.Genre

Persons

  • all Creator.Person.CompleteName with Creator.CreatorType = “Person”

Organizations

  • all Organization.Name for each language separately and for all languages at once
  • Organization.PID and the PIDs of all organizations on the path up the organizational hierarchy to the top-level organizations (i.e. the PIDs of the authors' affiliations and the PIDs of all their parent organizations)
  • Example:
    • A PubItem "Title1" has the authors:
      • Müller, affiliated to Max-Planck Institute for Psycholinguistics (PID2)
      • Müllerin, affiliated to Department 1 (PID4) of Max-Planck Institute for Plasmaphysics
    • If the organizational unit structure is as follows:
MaxPlanckSociety (PID1)
 |---Max-Planck Institute for Psycholinguistics (PID2)
 |---Max-Planck Institute for Plasmaphysics (PID5)
      |__Department 1 (PID4)
PlasmaphysicsSociety (PID6)
 |---Max-Planck Institute for Plasmaphysics (PID5)
      |__Department 1 (PID4)
 
    • Outcome:

The index on Organization.PID should contain the following values for the PubItem "Title1":

   PID2, PID1, PID5, PID4, PID6

This holds even though the author affiliations in the descriptive metadata refer directly only to PID2 and PID4.
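
The expansion of affiliation PIDs to all ancestor PIDs can be sketched in Python as follows. This is only an illustration under stated assumptions: the `PARENTS` mapping and the function name `index_org_pids` are hypothetical, and the mapping simply encodes the example structure above, in which PID5 has two parents (PID1 and PID6).

```python
def index_org_pids(affiliation_pids, parents):
    """Return the affiliation PIDs plus the PIDs of all their ancestors.

    `parents` maps an organization PID to the list of its direct parent PIDs.
    An organization may have more than one parent, so the structure is
    walked as a graph and every PID is visited only once.
    """
    seen = set()
    stack = list(affiliation_pids)
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue
        seen.add(pid)
        stack.extend(parents.get(pid, ()))
    return seen

# Hypothetical encoding of the example organizational structure.
PARENTS = {
    "PID2": ["PID1"],          # Institute for Psycholinguistics
    "PID5": ["PID1", "PID6"],  # Institute for Plasmaphysics (two parents)
    "PID4": ["PID5"],          # Department 1
}

# Direct affiliations of the authors of "Title1" are PID2 and PID4 only.
org_pid_index = index_org_pids(["PID2", "PID4"], PARENTS)
print(sorted(org_pid_index))  # ['PID1', 'PID2', 'PID4', 'PID5', 'PID6']
```

The result is a set, so the order of the PIDs is not significant; only membership matters for the index.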

Title

  • Publication.Title and Publication.AlternativeTitle for each language separately and for all languages at once
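
One possible reading of "for each language separately and for all languages at once" is one index field per language plus one combined field. The Python sketch below illustrates that reading only; the function name `language_fields` and the field naming scheme (`title.en`, `title.all`) are assumptions, not names taken from PubMan.

```python
from collections import defaultdict

def language_fields(field, values):
    """Build index fields from (language_code, text) pairs.

    Each text is stored twice: under a per-language field such as
    'title.en' and under the combined field 'title.all'.
    """
    index = defaultdict(list)
    for lang, text in values:
        index[f"{field}.{lang}"].append(text)
        index[f"{field}.all"].append(text)
    return dict(index)

# Hypothetical titles of one item in two languages.
title_index = language_fields("title", [
    ("en", "Plasma physics"),
    ("de", "Plasmaphysik"),
])
```

A search restricted to German would then query `title.de`, while a language-independent search would query `title.all`.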

Topic

  • Publication.Title, Publication.AlternativeTitle, Publication.TableOfContents, Publication.Abstract and Publication.Subject for each language separately and for all languages at once

Dates

  • Publication.Date

Event

  • all Event.Title, Event.AlternativeTitle and Event.Place for each language separately and for all languages at once

Identifier

  • ID and PID of item, PID of files

Special requirements for the "any-identifier" index

The identifier type and identifier value should be indexed together, so that users can find itemA, which has identifier type ISSN and identifier value 123-234, with any of the following "search by identifier" criteria:

  • ISSN:123-234
  • ISSN:12*
  • ISSN:*
  • ISSN 123-234
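
A minimal Python sketch of such a combined type-and-value index is given below. It is an illustration, not the PubMan implementation: the key format `TYPE:VALUE`, the function names, and the sample index content (including itemB) are assumptions.

```python
import fnmatch
import re

def identifier_key(id_type, id_value):
    """Index type and value together as a single 'TYPE:VALUE' key."""
    return f"{id_type.upper()}:{id_value}"

def normalize_query(query):
    """Turn 'TYPE:pattern' or 'TYPE pattern' into 'TYPE:pattern'.

    A query that names only a type is treated like 'TYPE:*'.
    """
    parts = re.split(r"[:\s]+", query.strip(), maxsplit=1)
    pattern = parts[1] if len(parts) > 1 and parts[1] else "*"
    return f"{parts[0].upper()}:{pattern}"

def find_items(query, index):
    """Return the sorted names of items with a key matching the query."""
    pattern = normalize_query(query)
    return sorted(
        item for item, keys in index.items()
        if any(fnmatch.fnmatchcase(key, pattern) for key in keys)
    )

# Hypothetical index content: itemA is the item from the requirement.
INDEX = {
    "itemA": [identifier_key("ISSN", "123-234")],
    "itemB": [identifier_key("ISBN", "999-1")],
}
```

With this sketch, all four example criteria resolve to itemA, while a query with a different type, such as `ISBN:123-234`, does not, because the type is part of the indexed key.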

Source

  • all Source.Title and Source.AlternativeTitle for each language separately and for all languages at once

Components

Some properties of the components should also be indexed for the search, such as:

  • content-category
  • visibility
  • component-name
  • pid (of component)

The best option may be to index all component properties by default.