Difference between revisions of "Talk:EDoc to PubMan migration"

Revision as of 08:05, 8 September 2008

Journal vocab: Documentation of Despoina's work

FIRST STAGE: PROCEDURE for cleaning NON-SFX JOURNAL NAMES.
Filter text: coalesce(sfxid,'')='' and rm=0
Sorting: edoctitle ascending

Run the filter and check each row (look the entry up in ZDB, on the Web, etc.). If the ZDB check is not successful, check the eDoc record and google the eDoc record title.
1. For entries that are not journals: set rm=1
2. For entries that appear to be a series (but not a journal): set rm=2
3. For entries found neither in ZDB nor via a Google search for the eDoc title: set rm=3
4. For entries that could not be found in eDoc either: set rm=5
Replace journal abbreviations with the full journal name. Tip: if you are not certain, or if the journal title is an abbreviation, check the abbreviation in ZDB.

(PLEASE NOTE: Titles starting with the word "Proceedings" were left out at this phase; they should be dealt with at another point)
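
The first-stage rules can be sketched in Python as a rough illustration. This is only a sketch under assumptions: the function and parameter names are hypothetical, rm=0 is taken to mean "not flagged", the filter is read as "sfxid empty or missing, and rm=0", and the order in which the rules are tested is a guess, since the procedure does not say which code wins when several apply.

```python
from typing import Optional

# rm codes as listed in the procedure; 0 is assumed to mean "not flagged".
RM_UNFLAGGED = 0
RM_NOT_JOURNAL = 1
RM_SERIES = 2
RM_NOT_FOUND_ZDB_OR_WEB = 3
RM_NOT_FOUND_IN_EDOC = 5


def needs_review(sfxid: Optional[str], rm: int, edoctitle: str) -> bool:
    """Select rows without an SFX id that are still unflagged.

    Titles starting with "Proceedings" are skipped, as in the note above.
    """
    no_sfx = (sfxid or "") == ""
    return no_sfx and rm == RM_UNFLAGGED and not edoctitle.startswith("Proceedings")


def assign_rm(is_journal: bool = True,
              is_series: bool = False,
              found_in_zdb_or_web: bool = True,
              found_in_edoc: bool = True) -> int:
    """Map the outcome of the manual checks to an rm code (precedence assumed)."""
    if not found_in_edoc:
        return RM_NOT_FOUND_IN_EDOC
    if not found_in_zdb_or_web:
        return RM_NOT_FOUND_ZDB_OR_WEB
    if is_series:
        return RM_SERIES
    if not is_journal:
        return RM_NOT_JOURNAL
    return RM_UNFLAGGED
```

The checks themselves (ZDB, Web, eDoc) remain manual; the sketch only records their result as a code.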

SECOND STAGE: PROCEDURE for merging existing journal entries
THIRD STAGE: Define "Ansetzungsregeln" (rules for the authorized form of an entry)

Queries to match names

Placed on colab so that they do not get lost

- The query returns the possible authors and, for each of them, the number of entries in docs.

Authors with more entries may have several name variants (for first names).

select substr(p2.uml_idx,1,position(',' in p2.uml_idx)), count(*)
from people p2
where p2.col=73 and p2.rm is null and archivalgrp(p2.grp)=1
group by substr(p2.uml_idx,1,position(',' in p2.uml_idx))
order by 2 desc
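
To make the grouping key easier to follow, here is a small Python sketch of what the query computes. It assumes uml_idx holds values of the form "surname, first name"; the function names and sample values are invented for illustration, and the filters on col, rm and archivalgrp are left out.

```python
from collections import Counter


def surname_key(uml_idx: str) -> str:
    """Mimic substr(uml_idx, 1, position(',' in uml_idx)).

    Returns everything up to and including the first comma. If there is
    no comma, position() is 0 in SQL and the result is an empty string.
    """
    pos = uml_idx.find(",") + 1  # find() is 0-based and -1 if absent
    return uml_idx[:pos]


def count_by_surname(uml_indexes):
    """Count entries per key, highest count first (like 'order by 2 desc')."""
    return Counter(surname_key(u) for u in uml_indexes).most_common()


# Invented sample values, not taken from the real people table.
sample = ["mueller, a.", "mueller, andreas", "schmidt, b.", "mueller, a"]
print(count_by_surname(sample))
```

Keys with a high count are the ones worth checking for first-name variants.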

- The query returns all MPG authors that match the uml_idx criterion above (restricted to MPG people only)

select distinct p1.name, p1.fname, substr(p1.uml_idx,1,position(',' in p1.uml_idx))
from people p1
where substr(p1.uml_idx,1,position(',' in p1.uml_idx)) in (
    select substr(p2.uml_idx,1,position(',' in p2.uml_idx))
    from people p2
    where p2.col=73 and p2.rm is null and archivalgrp(p2.grp)=1
    and p2.mpgpeople=1
    group by substr(p2.uml_idx,1,position(',' in p2.uml_idx)) )
and p1.mpgpeople=1
and p1.rm is null
and p1.col=73 and archivalgrp(p1.grp)=1
order by 3, 1, 2 asc
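
The second query can be read as "list the distinct name and first-name pairs per key, for MPG people only". A minimal Python sketch of that reading follows; the tuple layout (name, fname, uml_idx, mpgpeople), the function names and the sample rows are assumptions for illustration, and the filters on col, rm and archivalgrp are again omitted.

```python
def surname_key(uml_idx: str) -> str:
    """Part of uml_idx up to and including the first comma ('' if none)."""
    return uml_idx[: uml_idx.find(",") + 1]


def mpg_name_variants(people):
    """people: iterable of (name, fname, uml_idx, mpgpeople) tuples."""
    mpg = [p for p in people if p[3] == 1]
    # Inner query: the set of keys that occur among MPG people.
    keys = {surname_key(p[2]) for p in mpg}
    # Outer query: distinct (name, fname, key) rows whose key is in that set.
    rows = {(p[0], p[1], surname_key(p[2])) for p in mpg
            if surname_key(p[2]) in keys}
    # 'order by 3, 1, 2': sort by key, then name, then first name.
    return sorted(rows, key=lambda r: (r[2], r[0], r[1]))


# Invented sample rows, not taken from the real people table.
people = [
    ("Mueller", "Andreas", "mueller, andreas", 1),
    ("Mueller", "A.", "mueller, a.", 1),
    ("Mueller", "A.", "mueller, a.", 1),
    ("Schmidt", "B.", "schmidt, b.", 0),
]
for row in mpg_name_variants(people):
    print(row)
```

Python sorts strings by code point, so the order of rows may differ from the database collation.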

