Talk:EDoc to PubMan migration

Journal vocab: Documentation of Despoina's work

FIRST STAGE: PROCEDURE for cleaning NON-SFX JOURNAL NAMES.
Filter text: coalesce(sfxid,'')='' and rm=0
Sorting: edoctitle ascending

  1. Run the filter and check each row (search for the entry in ZDB, on the Web, etc.). If the ZDB check is not successful, check the eDoc record and search Google for the eDoc record title.
    1. For entries which are not journals: set rm=1
    2. For entries which appear to be a series (but not a journal): set rm=2
    3. For entries which could be found neither via ZDB nor via a Google search for the eDoc title: set rm=3
    4. For entries which could not be found in eDoc either: set rm=5
  2. Replace journal abbreviations with the full journal name. Tip (synchronization): if you are not certain, or if journaltitle is an abbreviation, check the journal abbreviation in ZDB.
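
The rm codes from step 1 can be sketched as a small decision function. This is only an illustration: the function name, the parameter names and the order of the checks are assumptions, and the notes above do not say which value a confirmed journal receives (the sketch leaves it at rm=0).

```python
def classify_rm(found_via_zdb, edoc_record_exists, found_via_google,
                is_series, is_journal):
    """Return the rm value for one row (hypothetical helper, not project code)."""
    if not found_via_zdb:
        # Fallback path: look at the eDoc record, then search Google for its title.
        if not edoc_record_exists:
            return 5  # not found in eDoc either
        if not found_via_google:
            return 3  # found neither via ZDB nor via the Google title search
    if is_series:
        return 2  # appears to be a series, not a journal
    if not is_journal:
        return 1  # not a journal
    return 0  # assumption: confirmed journals keep rm=0


# A row that ZDB does not know, whose eDoc record exists but whose title
# cannot be found via Google either.
example = classify_rm(found_via_zdb=False, edoc_record_exists=True,
                      found_via_google=False, is_series=False, is_journal=False)
```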


(PLEASE NOTE: Titles starting with the word "Proceedings" were left out at this phase; they should be dealt with at another point)

SECOND STAGE: PROCEDURE for merging existing journal entries
THIRD STAGE: Define "Ansetzungsregeln" (rules for the authorized form of a heading)


Queries to match names

  • Placed on colab so as not to lose them

The query returns the possible authors (grouped by the part of the name up to the first comma) and, for each, the number of entries in docs. Those with more entries may have different name variants (for first names).

select substr(p2.uml_idx, 1, position(',' in p2.uml_idx)), count(*)
from people p2
where p2.col = 73
  and p2.rm is null
  and archivalgrp(p2.grp) = 1
group by substr(p2.uml_idx, 1, position(',' in p2.uml_idx))
order by 2 desc
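
To see what the grouping key does, the same logic can be tried in memory. This is a minimal sketch with made-up sample rows. The function name and the dict layout are assumptions, and the archivalgrp(p2.grp)=1 condition is left out because that function is not defined in these notes.

```python
from collections import Counter


def surname_counts(people):
    """Count rows per name prefix (text up to and including the first comma)."""
    counts = Counter()
    for row in people:
        # Mirror the WHERE clause: col = 73 and rm is null.
        if row["col"] != 73 or row["rm"] is not None:
            continue
        name = row["uml_idx"]
        # str.find returns -1 when there is no comma, so the key becomes "",
        # which matches substr(x, 1, 0) in the SQL version.
        key = name[: name.find(",") + 1]
        counts[key] += 1
    # Equivalent of "order by 2 desc": highest count first.
    return counts.most_common()


# Hypothetical sample rows, not real data.
sample = [
    {"uml_idx": "mueller, a", "col": 73, "rm": None},
    {"uml_idx": "mueller, anna", "col": 73, "rm": None},
    {"uml_idx": "schmidt, b", "col": 73, "rm": None},
    {"uml_idx": "mueller, x", "col": 12, "rm": None},
    {"uml_idx": "schmidt, c", "col": 73, "rm": 1},
]
result = surname_counts(sample)
```

In this sample, "mueller," collects two rows with different first-name forms, which is the kind of group worth checking for name variants.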