EDoc to PubMan migration/Mapping

From MPDLMediaWiki
Revision as of 07:53, 22 July 2013 by Grossmann (talk | contribs)

eDoc to PubMan Mapping

This page maps eDoc genres and elements to PubMan genres and elements for importing eDoc data into PubMan. The eDoc fields are taken from the eDoc XML export schema, the PubMan fields from the eSciDoc metadata sets (MDs).

Mapping of genres

eDoc | eSciDoc | Comment | Independent (= not contained in another item)
Report | Report | - | x
Article | Journal Article | - | -
Book | Book | - | x
Conference-Paper | Conference Paper | - | -
Conference-Report | Conference Report | - | -
Habilitation | Thesis and set <degree.habilitation> | - | x
InBook | Book Chapter | - | -
Issue | Issue | - | -
Interactive Resource | Other | - | x
Journal | Journal | - | x
Lecture / Courseware | Lecture/Courseware | - | x
Other | Other | - | x
Paper | Paper | - | -
PhD-Thesis | Thesis and set Degree.phd | - | x
Poster | Poster | - | -
Proceedings | Proceedings | - | x
Series | Series | - | x
Software | Other | - | x
Talk at Event | Talk at Event | - | -
Thesis | Thesis | - | x

What is meant by the "Independent" column? --Natasa 10:30, 19 August 2008 (UTC)
It refers to the distinction between independent and contained genres in the genre table above. --Inga 20:03, 24 August 2008 (UTC)
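
The genre table can be expressed as a simple lookup. The following is only a minimal sketch: the names GenreMapping, GENRE_MAP and map_genre, as well as the degree values "habilitation" and "phd", are assumptions made for illustration and are not part of the eSciDoc or PubMan APIs.

```python
from typing import NamedTuple, Optional


class GenreMapping(NamedTuple):
    escidoc_genre: str
    degree: Optional[str]  # degree to set in addition to the genre, if any
    independent: bool      # True = not contained in another item


# eDoc genre -> eSciDoc genre, following the genre table (hypothetical structure).
GENRE_MAP = {
    "Report": GenreMapping("Report", None, True),
    "Article": GenreMapping("Journal Article", None, False),
    "Book": GenreMapping("Book", None, True),
    "Conference-Paper": GenreMapping("Conference Paper", None, False),
    "Conference-Report": GenreMapping("Conference Report", None, False),
    "Habilitation": GenreMapping("Thesis", "habilitation", True),
    "InBook": GenreMapping("Book Chapter", None, False),
    "Issue": GenreMapping("Issue", None, False),
    "Interactive Resource": GenreMapping("Other", None, True),
    "Journal": GenreMapping("Journal", None, True),
    "Lecture / Courseware": GenreMapping("Lecture/Courseware", None, True),
    "Other": GenreMapping("Other", None, True),
    "Paper": GenreMapping("Paper", None, False),
    "PhD-Thesis": GenreMapping("Thesis", "phd", True),
    "Poster": GenreMapping("Poster", None, False),
    "Proceedings": GenreMapping("Proceedings", None, True),
    "Series": GenreMapping("Series", None, True),
    "Software": GenreMapping("Other", None, True),
    "Talk at Event": GenreMapping("Talk at Event", None, False),
    "Thesis": GenreMapping("Thesis", None, True),
}


def map_genre(edoc_genre: str) -> GenreMapping:
    """Return the mapping for an eDoc genre, failing loudly on unknown genres."""
    try:
        return GENRE_MAP[edoc_genre]
    except KeyError:
        raise ValueError(f"no mapping for eDoc genre {edoc_genre!r}") from None
```

Raising on an unknown genre, rather than silently falling back to "Other", is a design choice of this sketch so that unmapped genres surface during a test import.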

Mapping of elements

eDoc | eSciDoc | Comment
corporatebody | Creator.CreatorType.Organization and Creator.CreatorRole.Editor | -
title | Title | -
titlealt | AlternativeTitle | -
markuptitle | not mapped | -
bundletitle | not mapped | -
language | Language | -
publisher | PublishingInfo.Publisher, or Source.PublishingInfo.Publisher for contained items [1] | -
markuptype | not mapped | -
publisheradd | PublishingInfo.Place, or Source.PublishingInfo.Place for contained items [1] | -
datepublished | Date.Date and Date.Type.published in print | -
datemodified | Date.Date and Date.Type.modified | -
dateaccepted | Date.Date and Date.Type.accepted and Date.Date and Date.Type.published in print | -
datesubmitted | Date.Date and Date.Type.submitted | -
spage | Source.StartPage | -
epage | Source.EndPage | -
artnum | Source.SequenceNumber | -
journaltitle | Source.Title and Source.Genre.Journal | -
journalabbreviation | Source.AlternativeTitle | -
issuetitle | Source.Title and Source.Genre.Issue | -
issuenr | Source.Issue | -
issuecontributorfn | Source.Creator.CreatorType.Person and Source.CreatorRole.Editor | string needs to be parsed to identify individuals with source.creator.FamilyName and source.creator.GivenName
issuecorporatebody | Source.Creator.CreatorType.Organization and Source.CreatorRole.Editor | string needs to be parsed to identify individual organizations
volume | Source.Volume | Decision: two sources on the same level will be created. Source.Source will not be created for now. --Nicole 08:10, 29 August 2008 (UTC)
nameofevent | Event.Title | -
placeofevent | Event.Place | -
dateofevent | Event.StartDate | -
enddateofevent | Event.EndDate | -
titleofproceedings | Source.Title and Source.Genre.Proceedings | -
proceedingscontributorfn | Source.Creator.CreatorType.Person and Source.CreatorRole.Editor | string needs to be parsed to identify individuals with source.creator.FamilyName and source.creator.GivenName
booktitle | Source.Title and Source.Genre.Book | -
bookcreatorfn | Source.Creator.CreatorType.Person and Source.CreatorRole.Author | string needs to be parsed to identify individuals with source.creator.FamilyName and source.creator.GivenName
bookcontributorfn | Source.Creator.CreatorType.Person and Source.CreatorRole.Editor | string needs to be parsed to identify individuals with source.creator.FamilyName and source.creator.GivenName
bookcorporatebody | Source.Creator.CreatorType.Organization and Source.CreatorRole.Editor | string needs to be parsed to identify individual organizations
editiondescription | PublishingInfo.Edition, or Source.Volume for Conference Paper | See comment on volume.
titleofseries | Source.Title and Source.Genre.Series | See comment on volume.
seriescontributorfn | Source.Creator.CreatorType.Person and Source.CreatorRole.Editor | See comment on volume.
seriescorporatebody | Source.Creator.CreatorType.Organization and Source.CreatorRole.Editor | See comment on volume.
os | not mapped | PubMan has no genre "software", and this field is only relevant for software
osversion | not mapped | -
platform | not mapped | -
instremarks | not mapped | -
abstract | Abstract | -
markupabstract | not mapped | -
authorcomment | not mapped | -
versioncomment | not mapped | only the last version of the item will be imported to PubMan
discipline | subject | -
keywords | subject | -
phydesc | TotalNumberOfPages, or Source.TotalNumberOfPages for contained items [1] | Exception: TotalNumberOfPages for paper and issue
numberofwords | not mapped | -
toc | TableOfContents | -
refereed=joureview | ReviewMethod.peer | -
refereed=notrev | ReviewMethod.no review | -
refereed=intrev | ReviewMethod.internal | -
refereed=notspecified | not mapped | -
pubstatusType=Recommended | not mapped | This value will be used during the import of the eDoc data to filter the entries that should go into the yearbook container in PubMan.
pubstatusType=Released | not mapped | This value will be used during the import of the eDoc data to filter the entries that should go into the yearbook container in PubMan.
invitationStatusType=invited | Event.InvitationStatus.Invited | -
invitationStatusType=contributed | not mapped | -
invitationStatusType=notspec | not mapped | -
fturl@viewftext accessType=INTERNAL | private | for further information, see the section on full texts below
fturl@viewftext accessType=MPG | private | for further information, see the section on full texts below
fturl@viewftext accessType=PUBLIC | public | for further information, see the section on full texts below
fturl@viewftext accessType=USER | private | for further information, see the section on full texts below
fturl@viewftext accessType=INSTITUTE | private | for further information, see the section on full texts below
creator@internextern internexternType=mpg | not mapped | will be used for assigning MPS organizational units to creators in PubMan
creator@internextern internexternType=unknown | not mapped | will be used for assigning external organizational units to creators in PubMan
creatorType=individual | Creator.CreatorType.Person | -
creatorType=group | Creator.CreatorType.Organization | To be checked case by case whether it needs to be mapped to an MPS department. For MPIPL this has already been checked and it is not needed.
enduserType=expertsonly | not mapped | -
enduserType=popular | not mapped | -
enduserType=notspecified | not mapped | -
roleType=advisor | Creator.CreatorRole.Advisor | -
roleType=archivist | ? | not required for MPIPL
roleType=artist | Creator.CreatorRole.Artist | -
roleType=author | Creator.CreatorRole.Author | -
roleType=constructor | ? | not required for MPIPL
roleType=contributor | Creator.CreatorRole.Contributor | -
roleType=editor | Creator.CreatorRole.Editor | -
roleType=painter | Creator.CreatorRole.Painter | -
roleType=preservator | ? | not required for MPIPL
roleType=referee | Creator.CreatorRole.Contributor | new role will be created
roleType=translator | Creator.CreatorRole.Translator | -
identifierType=doi | Identifier.IdType.DOI and value in Identifier.Id | -
identifierType=issn | Identifier.IdType.ISSN and value in Identifier.Id, or Source.Identifier.IdType.ISSN and value in Source.Identifier.Id for contained items [1] | -
identifierType=isbn | Identifier.IdType.ISBN and value in Identifier.Id, or Source.Identifier.IdType.ISBN and value in Source.Identifier.Id for contained items [1] | -
identifierType=uri | Identifier.IdType.URI and value in Identifier.Id | -
identifierType=url | Identifier.IdType.Other and value in Identifier.Id | -
identifierType=oai | Identifier.IdType.Other and value in Identifier.Id | -
identifierType=isi | Identifier.IdType.ISI and value in Identifier.Id | -
identifierType=localid | Identifier.IdType.Other and value in Identifier.Id | -
identifierType=reportnum | Identifier.IdType.Other and value in Identifier.Id | -
relType=isreferencedby | ? | will be a case-by-case decision, see: EDoc_to_PubMan_migration/MPIPL
relType=hasreferences | ? | will be a case-by-case decision, see: EDoc_to_PubMan_migration/MPIPL
relType=issourceof | ? | will be a case-by-case decision, see: EDoc_to_PubMan_migration/MPIPL
relType=hassource | ? | will be a case-by-case decision, see: EDoc_to_PubMan_migration/MPIPL
relType=isversionof | ? | will be a case-by-case decision, see: EDoc_to_PubMan_migration/MPIPL
relType=ispartof | ? | will be a case-by-case decision, see: EDoc_to_PubMan_migration/MPIPL
edoc item id | Identifier.IdType.eDoc and value in Identifier.Id | -
educationalpurpose yesnoType=Yes | not mapped | -
educationalpurpose yesnoType=No | not mapped | -
markupType=latex | not mapped | -
markupType=xml | not mapped | -
markupType=html | not mapped | -
docaff_external | Creator.Person.Organization | Will be mapped to all creators where internexternType=unknown.
docaff_researchcontext | not mapped | usage not clear
affiliation=mpgunit | Creator.Person.Organization | Add all MPG affiliations to all MPG authors (internexternType=mpg).
affiliation=mpgsunit | Creator.Person.Organization | Add all MPG affiliations to all MPG authors (internexternType=mpg).
affiliation=mpgssunit | Creator.Person.Organization | Add all MPG affiliations to all MPG authors (internexternType=mpg).
lastmodified | not mapped | only the last version of the item will be imported to PubMan
owner | not mapped | see: EDoc_to_PubMan_migration#Users
ownertype=fullname | not mapped | see: EDoc_to_PubMan_migration#Users
ownertype=email | not mapped | -
ownertype=insid | ? | -
copyright | not mapped | currently not needed for MPIPL
creators=creatorini | Creator.Person.GivenName | only if no creators=creatorngiven is available
creators=creatornfamily | Creator.Person.FamilyName if eDoc CreatorType is individual; Creator.Organization.Name if eDoc CreatorType is group | -
creators=creatorngiven | Creator.Person.GivenName | -
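
Several rows of the element table follow the same pattern: the value goes to the item itself for independent genres and to Source.* for contained ones, and enumerated eDoc values translate to fixed eSciDoc values. The sketch below illustrates that pattern for a few rows only. All function and dictionary names are hypothetical, the returned strings are just the dotted paths used in the table, and nothing here is an actual eSciDoc or PubMan API.

```python
from typing import Dict

# refereed values; "notspecified" is not mapped and therefore absent.
REVIEW_METHOD = {"joureview": "peer", "notrev": "no review", "intrev": "internal"}

# identifierType values -> Identifier.IdType
ID_TYPE = {
    "doi": "DOI", "issn": "ISSN", "isbn": "ISBN", "uri": "URI", "isi": "ISI",
    "url": "Other", "oai": "Other", "localid": "Other", "reportnum": "Other",
}

# Fields whose target moves under Source.* for contained items.
ITEM_OR_SOURCE = {
    "publisher": "PublishingInfo.Publisher",
    "publisheradd": "PublishingInfo.Place",
    "phydesc": "TotalNumberOfPages",
}


def target_path(edoc_field: str, edoc_genre: str, contained: bool) -> str:
    """Return the dotted target path for one of the item-or-source fields."""
    base = ITEM_OR_SOURCE[edoc_field]
    # Exception from the table: paper and issue keep TotalNumberOfPages on the item.
    if edoc_field == "phydesc" and edoc_genre in ("Paper", "Issue"):
        return base
    return f"Source.{base}" if contained else base


def map_identifier(id_type: str, value: str, contained: bool) -> Dict[str, str]:
    """Map one eDoc identifier; only ISSN and ISBN move to the source."""
    prefix = "Source." if contained and id_type in ("issn", "isbn") else ""
    return {
        f"{prefix}Identifier.IdType": ID_TYPE[id_type],
        f"{prefix}Identifier.Id": value,
    }
```

For example, target_path("publisher", "InBook", contained=True) yields the Source.PublishingInfo.Publisher path, while the same call with contained=False yields PublishingInfo.Publisher.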

Source Transformation Rules

Source Cases:

(1a) article -> issue -> journal


  • issuetitle
  • issuecontributorfn
  • issuecorporatebody
  • spage
  • epage
  • artnum


  • journaltitle
  • journalabbreviation
  • volume
  • issuenr

(1b) article -> journal


  • journaltitle
  • journalabbreviation
  • volume
  • issuenr
  • artnum
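
Cases (1a) and (1b) can be sketched as follows. This is an illustration under assumptions, not the import implementation: it assumes that the first bullet group of (1a) describes the issue source and the second the journal source, that the two sources are created on the same level (as decided in the comment on volume), and that an eDoc record is available as a plain dictionary. The function name and the dictionary keys are hypothetical; creator strings are left out because they need parsing first.

```python
from typing import Dict, List


def _without_empty(source: Dict[str, object]) -> Dict[str, object]:
    """Drop fields that are missing or empty in the eDoc record."""
    return {key: value for key, value in source.items() if value not in (None, "")}


def build_article_sources(record: Dict[str, str]) -> List[Dict[str, object]]:
    """Build the flat list of sources for an article, per cases (1a) and (1b)."""
    journal = {
        "Genre": "Journal",
        "Title": record.get("journaltitle"),
        "AlternativeTitle": record.get("journalabbreviation"),
        "Volume": record.get("volume"),
        "Issue": record.get("issuenr"),
    }
    if record.get("issuetitle"):
        # Case (1a): an issue source and a journal source on the same level.
        issue = {
            "Genre": "Issue",
            "Title": record["issuetitle"],
            "StartPage": record.get("spage"),
            "EndPage": record.get("epage"),
            "SequenceNumber": record.get("artnum"),
        }
        return [_without_empty(issue), _without_empty(journal)]
    # Case (1b): only the journal source, which then also carries artnum.
    journal["SequenceNumber"] = record.get("artnum")
    return [_without_empty(journal)]
```

The presence of issuetitle is used here as the switch between the two cases; whether that is the right criterion for real eDoc data would have to be checked against the export.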

(2a) inbook -> book


  • booktitle
  • bookcreatorfn
  • bookcontributorfn
  • bookcorporatebody
  • editiondescription
  • volume
  • spage, epage
  • publisher
  • artnum

(2b) inbook -> book -> series


  • spage, epage
  • publisher
  • publisheradd
  • phydesc
  • artnum


  • titleofseries
  • seriescontributorfn
  • seriescorporatebody
  • volume

(3a) conference-paper -> proceedings


  • titleofproceedings
  • proceedingscontributorfn
  • spage, epage
  • phydesc
  • editiondescription
  • publisher
  • publisheradd

(3b) conference-paper -> proceedings -> series


  • editiondescription


  • titleofseries
  • seriescontributorfn
  • seriescorporatebody

(3c) conference-paper -> journal

see (1b)


  • journaltitle
  • journalabbreviation
  • volume
  • issuenr
  • artnum

(3d) conference-paper -> issue -> journal

see (1a)


  • issuetitle
  • issuecontributorfn
  • issuecorporatebody
  • spage
  • epage
  • artnum


  • journaltitle
  • journalabbreviation
  • volume
  • issuenr

(4a) proceedings -> issue -> journal


  • issuenr
  • journaltitle
  • journalabbreviation


  • issuetitle

(4b) proceedings (=book) -> series


  • publisher
  • publisheradd
  • phydesc
  • editiondescription


  • volume

Full texts

  • In eDoc, full texts are stored in the field fturl, e.g. <fturl viewftext="MPG" filename="IR_Fortbildungsveranstaltung.ppt">http://edoc.mpg.de/get.epl?fid=45531&did=359231&ver=0</fturl>
  • All full texts will be migrated. Every full text except the public ones gets "private" visibility, users with priv. view are created, and locators pointing to eDoc are set for the non-public full texts. The eDoc file visibility is also stored in PubMan in the component MD.
Open question: ask Karin whether a user with priv. view is needed.
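
The full-text rules can be sketched as below, using the accessType values from the element table. This is a hedged illustration: the function name and the keys of the returned dictionary are hypothetical, and the sketch assumes that the export is well-formed XML, i.e. that ampersands in the URL are escaped as &amp;amp; in the actual file.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# viewftext values -> PubMan visibility, following the element table.
VISIBILITY = {
    "INTERNAL": "private",
    "MPG": "private",
    "PUBLIC": "public",
    "USER": "private",
    "INSTITUTE": "private",
}


def map_fulltext(fturl_xml: str) -> Dict[str, str]:
    """Map one eDoc fturl element to a (hypothetical) component description."""
    elem = ET.fromstring(fturl_xml)
    access = elem.get("viewftext")
    if access not in VISIBILITY:
        raise ValueError(f"unknown viewftext value {access!r}")
    component = {
        "filename": elem.get("filename") or "",
        "visibility": VISIBILITY[access],
        # The original eDoc visibility is kept in the component metadata.
        "edoc_access_type": access,
    }
    if component["visibility"] != "public":
        # Non-public full texts additionally get a locator pointing to eDoc.
        component["locator"] = (elem.text or "").strip()
    return component
```

Applied to the fturl example above (with escaped ampersands), the result has visibility "private", keeps "MPG" as the eDoc access type, and carries the eDoc URL as locator.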

eDoc export XML


  1. For the distinction between contained and independent genres, see the genre table above.