Difference between revisions of "Talk:EDoc to PubMan migration"

From MPDLMediaWiki
Revision as of 13:23, 5 September 2008

Journal vocab: Documentation of Despoina's work

FIRST STAGE: PROCEDURE for cleaning NON-SFX JOURNAL NAMES.
Filter text: coalesce(sfxid,'')='' and rm=0.
Sorting: edoctitle ascending

  1. Run the filter and check each row (search for the entry in ZDB, on the Web, etc.). If the ZDB check is not successful, check the edoc record and google the edoc record title.
    1. For entries which are not journals: set rm=1
    2. For entries which appear to be a series (but not a journal): set rm=2
    3. For entries which were found neither via ZDB nor via a Google search for the edoc title: set rm=3
    4. For entries which could not be found in eDoc either: set rm=5
  2. Replace journal abbreviations with the full journal name. Tip: when in doubt, or when journaltitle is an abbreviation, cross-check the journal abbreviation in ZDB.
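
The rm codes from step 1 can be kept in one lookup so that a cleaning pass is easy to summarize afterwards. This is only a minimal sketch: the names RM_CODES and tally_rm are hypothetical, and the meaning given for rm=0 is an assumption read off the filter text (rm=0 selects rows not yet flagged).

```python
# rm codes as listed in the procedure above; 0 is assumed to mean
# "not yet flagged", because the filter selects rows with rm=0.
RM_CODES = {
    0: "not yet flagged (assumption)",
    1: "not a journal",
    2: "series, not a journal",
    3: "not found via ZDB or Google search for the edoc title",
    5: "not found in eDoc",
}


def tally_rm(codes):
    """Count how many checked rows received each rm code.

    Codes that the procedure does not list are reported as 'unlisted code'.
    """
    counts = {}
    for code in codes:
        label = RM_CODES.get(code, "unlisted code")
        counts[label] = counts.get(label, 0) + 1
    return counts


# Example: rm values of six checked rows (made-up sample data).
summary = tally_rm([1, 1, 2, 3, 5, 0])
```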


(PLEASE NOTE: Titles starting with the word "Proceedings" were left out at this phase; they should be dealt with at another point)

SECOND STAGE: PROCEDURE for merging existing journal entries
THIRD STAGE: Define "Ansetzungsregeln" (rules for establishing the preferred form of an entry)


Queries to match names

  • Placed on colab so as not to lose them

The query returns one row per possible author, together with the number of entries in docs for that author. Rows are grouped by the part of uml_idx up to and including the first comma. Those with more entries may have different name variants (for first names).

select substr(p2.uml_idx,1,position(',' in p2.uml_idx)), count(*)
from people p2
where p2.col=73 and p2.rm is null and archivalgrp(p2.grp)=1
group by substr(p2.uml_idx,1,position(',' in p2.uml_idx))
order by 2 desc
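
To see what the query above does, it can be tried on a few made-up rows. The following is a minimal sketch using Python's sqlite3 module, not the real setup: the sample rows and the function name count_by_name_prefix are hypothetical, position(',' in x) is written as instr(x, ',') because SQLite does not accept the former syntax, and archivalgrp() is replaced by a stand-in that treats every group as archival, since its real logic is not shown on this page.

```python
import sqlite3


def count_by_name_prefix(rows):
    """Run the name-matching query on sample (uml_idx, col, rm, grp) rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table people (uml_idx text, col integer, rm integer, grp integer)"
    )
    con.executemany("insert into people values (?, ?, ?, ?)", rows)
    # Stand-in for the database function archivalgrp(); the real one is
    # not documented here, so every group is treated as archival.
    con.create_function("archivalgrp", 1, lambda grp: 1)
    query = """
        select substr(p2.uml_idx, 1, instr(p2.uml_idx, ',')), count(*)
        from people p2
        where p2.col = 73 and p2.rm is null and archivalgrp(p2.grp) = 1
        group by substr(p2.uml_idx, 1, instr(p2.uml_idx, ','))
        order by 2 desc
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


# Made-up sample data, for illustration only.
SAMPLE = [
    ("Mueller, A.", 73, None, 1),
    ("Mueller, Anna", 73, None, 1),
    ("Schmidt, B.", 73, None, 1),
    ("Mueller, A.", 73, 1, 1),    # rm is set, so the row is excluded
    ("Meier, C.", 12, None, 1),   # different col, so the row is excluded
]

matches = count_by_name_prefix(SAMPLE)
```

With this sample, the two "Mueller," rows that pass the filter fall into one group, which is the kind of result that hints at first-name variants worth checking by hand.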