Difference between revisions of "PubMan Func Spec Submission/arXiv mapping"

From MPDLMediaWiki
Revision as of 12:35, 8 April 2008

arXiv

Schema

arXiv currently provides three metadata formats via the OAI-PMH interface:
1. oai_dc
2. arXiv
3. arXivRaw

See also: arXiv metadata formats.

To start, we will use the arXiv metadata format, as it seems to require the least parsing to map the metadata values to a PubItem.
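As an illustration of how a record in the arXiv format could be requested, the following minimal sketch builds an OAI-PMH GetRecord URL. The base URL is an assumption taken from the export.arxiv.org/oai2 endpoint linked from this page and may have changed since; the function name build_get_record_url is hypothetical and not part of PubMan.

```python
from urllib.parse import urlencode

# Assumption: base URL of arXiv's OAI-PMH endpoint as referenced on this page.
OAI_BASE_URL = "http://export.arxiv.org/oai2"

# Prefix that arXiv puts in front of its identifiers in the OAI header.
OAI_PREFIX = "oai:arXiv.org:"


def build_get_record_url(arxiv_id, metadata_prefix="arXiv"):
    """Build an OAI-PMH GetRecord request URL for a single arXiv identifier.

    metadata_prefix selects one of the formats listed above
    (oai_dc, arXiv, arXivRaw); "arXiv" is the one chosen for the mapping.
    """
    params = {
        "verb": "GetRecord",
        "identifier": OAI_PREFIX + arxiv_id,
        "metadataPrefix": metadata_prefix,
    }
    # urlencode percent-encodes the colons in the identifier.
    return OAI_BASE_URL + "?" + urlencode(params)
```

The sketch only constructs the URL; fetching and error handling (for example OAI-PMH error responses) are left out on purpose.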


Mapping from arXiv to PubItem

1. header/identifier => identifier (without "oai:arXiv.org:" prefix)
(note: this identifier is important because in the output it points to the exact version, i.e. v1, v2, which arXiv uses in "citeAs")

2. Authors
2.1. author/keyname => LastName
2.2. author/forename => FirstName
2.3. author/affiliation => External organization

3. title => title

4. report-no => source/sequence-number (only if journal-ref is present; otherwise do not map?)
5. journal-ref => source/title (parsing not in R3)
6. msc-class => dc:subject
7. abstract => abstract
8. categories => dc:subject
9. doi => dc:identifier (DOI)

10. http://arxiv.org/abs/<header/identifier value> => dc:identifier (OTHER)
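The mapping rules above can be sketched roughly as follows. This is only an illustration under several assumptions: the sample record is made up, XML namespaces are omitted for brevity, the element names are taken from the list above and should be checked against real arXiv records, and the result is a plain dictionary with hypothetical key names rather than an actual PubItem object.

```python
import xml.etree.ElementTree as ET

OAI_PREFIX = "oai:arXiv.org:"

# Hypothetical sample record; all values are invented and namespaces are omitted.
SAMPLE_RECORD = """
<record>
  <header>
    <identifier>oai:arXiv.org:0000.0000</identifier>
  </header>
  <metadata>
    <arXiv>
      <author>
        <keyname>Doe</keyname>
        <forename>Jane</forename>
        <affiliation>Example Institute</affiliation>
      </author>
      <title>An example title</title>
      <categories>math.AG hep-th</categories>
      <msc-class>14J60</msc-class>
      <report-no>EX-1</report-no>
      <journal-ref>Ex. J. 1 (2008) 1-2</journal-ref>
      <doi>10.0000/example</doi>
      <abstract>An example abstract.</abstract>
    </arXiv>
  </metadata>
</record>
"""


def text_of(parent, path):
    """Return the stripped text of the first matching element, or None."""
    node = parent.find(path)
    if node is None or node.text is None:
        return None
    return node.text.strip()


def map_record(record_xml):
    """Map one record to a plain dict that mirrors rules 1 to 10."""
    record = ET.fromstring(record_xml.strip())
    meta = record.find("metadata/arXiv")

    # Rule 1: drop the "oai:arXiv.org:" prefix from the header identifier.
    raw_id = text_of(record, "header/identifier")
    arxiv_id = raw_id[len(OAI_PREFIX):] if raw_id.startswith(OAI_PREFIX) else raw_id

    # Rule 2: authors; iter() is used so the nesting depth does not matter.
    creators = [
        {
            "family_name": text_of(author, "keyname"),
            "given_name": text_of(author, "forename"),
            "organization": text_of(author, "affiliation"),
        }
        for author in meta.iter("author")
    ]

    # Rules 6 and 8: both become subjects; values are kept unsplit here.
    subjects = [s for s in (text_of(meta, "msc-class"), text_of(meta, "categories")) if s]

    # Rule 10: assumes the prefix-free identifier is used in the abs URL.
    identifiers = [("OTHER", "http://arxiv.org/abs/" + arxiv_id)]
    doi = text_of(meta, "doi")  # Rule 9
    if doi:
        identifiers.append(("DOI", doi))

    # Rules 4 and 5: report-no is mapped only when journal-ref is present.
    source = None
    journal_ref = text_of(meta, "journal-ref")
    if journal_ref:
        source = {"title": journal_ref, "sequence_number": text_of(meta, "report-no")}

    return {
        "identifier": arxiv_id,
        "creators": creators,
        "title": text_of(meta, "title"),        # Rule 3
        "abstract": text_of(meta, "abstract"),  # Rule 7
        "subjects": subjects,
        "identifiers": identifiers,
        "source": source,
    }
```

Calling map_record(SAMPLE_RECORD) returns a dictionary whose "identifier" is the prefix-free arXiv identifier. The open question in rule 4 is resolved here by not mapping report-no when journal-ref is missing, which is just one possible reading.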

Issues

  • Affiliations: affiliation values such as "MPI für XXX" cannot be parsed, as the organizational units service does not fully support search by organization name.
    • No issue has been created as an extra requirement for core services, because it is not yet certain whether we would like to handle this within the controlled vocabulary or ask core services directly for search-organizations methods. It might become an internal requirement for the controlled vocabulary service (institutions).
  • Parsing of journal names: check whether it is feasible and, if so, whether it can be related in future to the controlled vocabulary service (journals).