Difference between revisions of "Talk:EDoc to PubMan migration/MPIPL"


Revision as of 12:04, 20 November 2008

Open Questions

Answered Questions

  • shall eDoc users be moved to PubMan?
    Talked to Karin today. She thinks that there is no need to migrate users. --Nicole 13:32, 7 August 2008 (UTC)
  • who should be the owner of the migrated eDoc items in PubMan?
    Karin will be the owner
  • How to map eDoc Collections to PubMan Org. Units?
  • check "special usage" of fields like: Is Part Of. How are they used by the institute, and is there a rule on how we can map the content?
    • check the usage of relations with Karin. How important they are and how we should handle them.
    • gave eDoc IDs to Karin and she will check on a case by case basis --Nicole 14:29, 9 September 2008 (UTC)
  • how to migrate that record of type "interactive resource": http://www.edoc.mpg.de/240861 ?
    • should be mapped to genre "other" --Nicole 14:29, 9 September 2008 (UTC)


Questions regarding OU for MPIPL

We doubt that we made the right decision in February regarding the org units.

Background

Nicole presented the ppt with 3 options:

  • Projects and Departments listed as separate org units without any relations between them
  • Departments as org units, Project information in metadata
  • Projects as subcategories of departments incorporated in one org unit

We chose Option 3!

The advantages, as we saw them then, were:

  • predefined structure (controlled list)
  • relations between departments and projects are visible

BUT, real data don't fit into these rigid rules.

It is already difficult to draw up a reliable current structure of which projects belong to which groups, let alone to maintain this scheme in the future. Some units don't belong to a department at all, e.g. Junior Research Groups, which are considered Projects but have no institute department above them.

  • We are seriously thinking of switching to Option 1!

Advantages of Option 1

  • department and project can be chosen individually
  • no need for maintaining matrix stating which project belongs to which department
  • possibility to host projects under the institute's level
  • no need to duplicate projects under department
  • easier for migration: project = eDoc collection (so there is a 1:1 mapping), and authors have one department (can also be mapped 1:1). In our present migration procedure I would have to assign up to 8 org units for some users and we would have to delete 7 after migrating.

Disadvantages of Option 1

  • two org units have to be filled in by the user, with the possibility of forgetting one or making implausible choices (must be checked with workflow)
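The workflow check mentioned for Option 1 could look roughly like the sketch below. It is only an illustration: the function name `check_org_units`, the idea that departments and projects are kept as two separate lists, and the rule that every item needs at least one of each are assumptions, not something PubMan is known to provide.

```python
def check_org_units(selected, departments, projects):
    """Return a list of problems with the org units selected for one item.

    Assumption: departments and projects are two separate controlled lists,
    and an item should carry at least one unit from each.
    """
    problems = []
    chosen_departments = [u for u in selected if u in departments]
    chosen_projects = [u for u in selected if u in projects]
    unknown = [u for u in selected if u not in departments and u not in projects]

    if not chosen_departments:
        problems.append("no department selected")
    if not chosen_projects:
        problems.append("no project selected")
    for unit in unknown:
        problems.append(f"unknown org unit: {unit}")
    return problems


# Hypothetical unit names, following the "A Group" / "A Project" naming idea.
departments = {"A Group", "B Group"}
projects = {"A Project", "B Project"}
print(check_org_units(["A Group"], departments, projects))
```

Whether a missing department should count as an error is open, since units such as Junior Research Groups have no department above them.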

Questions:

  • What does this mean in terms of metadata? I guess the XML output will change? Are there different fields for org units as departments and org units as projects?

  • How can a selection on department or project be made? Or do we simply call all departments A Group, B Group, etc. and all projects A Project, B Project, etc.?

  • Are there any future OrgUnit developments regarding Metadata, structure, relations we should take into account?
  • Are we really not missing something essential? It's kind of weird that we were so certain in February.
  • It will probably mean we have to ask Zest to alter their script.

--Karin 10:53, 25 September 2008 (UTC)

Explain

  • YB workflow
  • full text migration, ask if user with privileged view is needed to see full texts with private access level
    • user with privileged view is needed --Nicole 14:36, 9 September 2008 (UTC) outcome of telco with Karin
    • we will migrate all full texts, set every full text except the public ones to "private" visibility, create users with privileged view, and set locators to eDoc (for non-public full texts). The eDoc file visibility will also be stored in PubMan in the component MD.
  • only released eDoc records will be moved to PubMan
  • duplicate problem has to be solved on eDoc; we can provide a report on possible duplicates though
    • we looked through the list of duplicates you provided and deleted all real duplicates on eDoc. Items that still share a title are not really duplicates, because they are different works (different authors, different publication genres...) --Karin 14:00, 21 October 2008 (UTC)
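The full-text handling agreed in the list above (public files stay public, everything else becomes "private" with a locator back to eDoc, and the original eDoc visibility is kept in the component metadata) can be sketched as follows. The record layout, the key names and the function `migrate_file` are made up for illustration, as is the eDoc visibility value "institute"; the real migration script may look quite different.

```python
def migrate_file(edoc_file):
    """Map one simplified eDoc file entry to a simplified PubMan component.

    Assumption: edoc_file is a dict with the hypothetical keys
    "visibility" and "edoc_url".
    """
    original = edoc_file["visibility"]
    is_public = original == "public"

    component = {
        # Only public full texts stay public; all others become private.
        "visibility": "public" if is_public else "private",
        # The original eDoc visibility is kept in the component metadata.
        "metadata": {"edoc_visibility": original},
    }
    if not is_public:
        # Non-public full texts also get a locator pointing back to eDoc.
        component["locator"] = edoc_file["edoc_url"]
    return component


print(migrate_file({"visibility": "institute", "edoc_url": "http://edoc.mpg.de/1"}))
```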

Open Issues with QA migration

General

Indexing apparently doesn't work well: we can't find all publications, which makes testing very tedious. For example, in eDoc MPIPL has 96 entries for publication year 2008, but only 23 are found with search.

Virtual collections from eDoc have been migrated, which wasn't intended, e.g. http://edoc.mpg.de/126883 from collection endnote_import_2004_06. We haven't checked items from other virtual eDoc collections yet.

eDoc collection http://edoc.mpg.de/display.epl?col=73&grp=1332 (Categories and Concepts across Language and Cognition) hasn't been migrated due to a spelling difference (capital "Concepts" on PubMan, lowercase "concepts" on eDoc). This should be fixed so that PubMan keeps the capital 'C'.
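One way to avoid losing a collection over a capitalisation difference is to compare names case-insensitively and then use the PubMan spelling. A minimal sketch, assuming the collection and org unit names are available as plain strings; the function `match_collection` is hypothetical and not part of the existing migration script:

```python
def match_collection(edoc_name, pubman_units):
    """Return the PubMan org unit name matching edoc_name, ignoring case
    and repeated whitespace, or None if there is no match."""
    def norm(name):
        return " ".join(name.split()).casefold()

    wanted = norm(edoc_name)
    for unit in pubman_units:
        if norm(unit) == wanted:
            return unit  # keep the PubMan spelling (capital 'C')
    return None


pubman_units = ["Categories and Concepts across Language and Cognition"]
print(match_collection("Categories and concepts across Language and Cognition", pubman_units))
```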


Advanced Search

Search for genre=book also returns book chapters in the results

According to the search specification, the search for genres uses the index "any-genres". Therefore a search for genre "book" also returns "book-chapters" items (as they have a source with genre "book"). Issue reported in JIRA (http://zim01.gwdg.de:8080/browse/PUBMAN-568). We will ask FIZ for an extra index. PubMan will be changed only after R40 if this indeed is the requirement. --Natasa 12:00, 20 November 2008 (UTC)
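The effect can be reproduced with a toy model. The sketch below is not the PubMan or FIZ search implementation; the record layout, the genre values and both function names are assumptions that only illustrate why an index holding the item genre and the source genre together matches book chapters on a search for "book", while an index on the item's own genre would not.

```python
# Hypothetical, simplified records: each item has its own genre and,
# optionally, a source with its own genre.
items = [
    {"title": "A monograph", "genre": "book", "source_genre": None},
    {"title": "A chapter", "genre": "book-chapter", "source_genre": "book"},
    {"title": "An article", "genre": "article", "source_genre": "journal"},
]


def search_any_genres(items, genre):
    """Match against item genre AND source genre, like an 'any-genres' index."""
    return [
        i["title"]
        for i in items
        if genre in (i["genre"], i["source_genre"])
    ]


def search_item_genre(items, genre):
    """Match against the item's own genre only (the extra index asked for)."""
    return [i["title"] for i in items if i["genre"] == genre]


print(search_any_genres(items, "book"))
print(search_item_genre(items, "book"))
```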

Genre types

--Natasa 12:04, 20 November 2008 (UTC) JIRA tickets created

Journal article

fine, APA export works fine

Book

Data are not fully migrated, e.g. http://edoc.mpg.de/320760 versus http://qa-pubman.mpdl.mpg.de:8080/pubman/item/escidoc:36803:1

  • pages are missing
  • APA export: authors/editors, publishing place and publisher's name are missing

PublishingInfo.Publisher, PublishingInfo.Place and TotalNumberOfPages have not been created in PubMan. Checked mapping. It is described there correctly :-) --Nicole 13:43, 19 November 2008 (UTC)

Thesis

Missing data:

  • name and place of university
  • pages
  • MPI Series in Psycholinguistics (in eDoc: part of ...)
  • no locator

Citation style: APA shows "(in press)" instead of the year -> separate issue, have to look into this

PublishingInfo.Publisher, PublishingInfo.Place and TotalNumberOfPages have not been created in PubMan. Checked mapping. Also the creation of the source failed. What is written in the eDoc field relType=ispartof should be written into source.title and the source.genre should be set to "Series". Also the identifier (URL in eDoc record) has not been created like in the mapping: Identifier.IdType.Other and value in Identifier.Id. --Nicole 13:58, 19 November 2008 (UTC)
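The source and identifier part of the mapping described above can be written down as a small sketch. The input keys ("ispartof", "url"), the flat output keys and the function `map_thesis_source` are assumptions for illustration; only the target values (source title from relType=ispartof, source genre "Series", identifier type "Other") come from the mapping as quoted.

```python
def map_thesis_source(edoc_record):
    """Build the source and identifier fields for a simplified eDoc record.

    Assumption: edoc_record is a dict with the hypothetical keys
    "ispartof" (content of relType=ispartof) and "url".
    """
    mapped = {}

    series_title = edoc_record.get("ispartof")
    if series_title:
        mapped["source.title"] = series_title
        mapped["source.genre"] = "Series"

    url = edoc_record.get("url")
    if url:
        mapped["Identifier.IdType"] = "Other"
        mapped["Identifier.Id"] = url

    return mapped


record = {"ispartof": "MPI Series in Psycholinguistics", "url": "http://example.org/thesis"}
print(map_thesis_source(record))
```

A check like this could be run against a few migrated theses to list which of the expected fields are missing.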

example http://qa-pubman.mpdl.mpg.de:8080/pubman/item/escidoc:36777:1 http://edoc.mpg.de/320778

Talk

Missing data:

  • conference details (name, place, date)

Example: http://qa-pubman.mpdl.mpg.de:8080/pubman/item/escidoc:36921:1 versus http://edoc.mpg.de/300896

Event.Title, Event.Place, Event.StartDate have not been created as specified in the mapping. --Nicole 14:24, 19 November 2008 (UTC)

Proceedings

Missing data:

  • conference details (name, place, date)
  • publisher details (name, place, pages)
  • APA: only year and title are displayed (NOT editor, event)

Example: http://qa-pubman.mpdl.mpg.de:8080/pubman/item/escidoc:36759:1 (eDoc ID: 291041)

PublishingInfo.Publisher, PublishingInfo.Place, Event.Title, Event.Place, Event.StartDate and Event.EndDate have not been created as specified in the mapping. --Nicole 15:24, 19 November 2008 (UTC)


Poster

Missing data:

  • conference details
  • publishing details: year

Example: http://qa-pubman.mpdl.mpg.de:8080/pubman/item/escidoc:36854:1 (eDoc ID: 305652)

Event.Title, Event.Place, Event.StartDate and Event.EndDate have not been created as specified in the mapping. --Nicole 15:24, 19 November 2008 (UTC)


Working paper (= Paper in eDoc)

Missing data:

  • no year
  • no locator

Example: http://qa-pubman.mpdl.mpg.de:8080/pubman/item/escidoc:38836:1 (eDoc ID: 127485)

TotalNumberOfPages has not been created as specified in the mapping. --Nicole 15:24, 19 November 2008 (UTC)

Proceedings Paper

Missing data:

  • conference details (name, place, date)
  • publisher details (name, place)
  • Physical Description (DVD)

Example: http://qa-pubman.mpdl.mpg.de:8080/pubman/item/escidoc:36643:1 versus http://edoc.mpg.de/380011 --Karin 16:32, 19 November 2008 (UTC)

Special issue

Missing data:

  • place of publication, publisher, total number of pages
  • APA: no editors!

Example: http://qa-pubman.mpdl.mpg.de:8080/pubman/item/escidoc:38137:1 versus http://edoc.mpg.de/359275 --Karin 16:32, 19 November 2008 (UTC)

Karin, Meggie, and Annemieke

--Karin 10:57, 15 November 2008 (UTC)