Difference between revisions of "PubMan Func Spec Easy Submission"

From MPDLMediaWiki
 
{{PubMan_Funtional_Specification}}
#REDIRECT [[PubMan Func Spec Ingestion]]
 
=Functional Specification=
==UC_PM_EASM_01 upload file in structured format==
 
 
===Data involved===
BibTeX file in structured format. See the [https://zim01.gwdg.de/repos/smc/tags/public/PubMan/example_bitex_AEI.bib example file] provided by the AEI.
 
===Constraints===
*BibTeX files vary widely in structure from one source to the next; [http://www.gerd-neugebauer.de/software/TeX/BibTool.en.html BibTool] may help with preprocessing and normalization.
** For example: correcting upper and lower case, resolving macros, and reconciling Unicode encoding with (La)TeX encoding.
*Basic TeX parsing is needed to interpret non-ASCII characters and similar constructs; see, for example, [https://dev.livingreviews.org/projects/epubtk/browser/trunk/ePubTk/lib/bibtexlib.py bibtexlib.py].
*In BibTeX, fields are not repeatable; multiple authors must therefore be parsed out of the single ''author'' field.
*BibTeX allows several formats for an author's name (for example ''First von Last'', ''von Last, First'' and ''von Last, Jr, First''), so the parser must recognize all of them. See, for example, http://search.cpan.org/~gward/Text-BibTeX-0.34/BibTeX/Name.pm
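
The two author-related constraints above can be illustrated with a minimal Python sketch. It is not the PubMan implementation: the function names <code>split_authors</code> and <code>parse_name</code> and the returned dictionary keys are assumptions made for this example. The sketch splits the ''author'' field on the word "and" outside braces and then recognizes the three common name formats. It deliberately simplifies the real BibTeX rules (for instance, commas inside braces are not protected).

```python
def split_authors(field):
    """Split a BibTeX author field on the word 'and' at brace depth 0."""
    names, current, depth = [], [], 0
    for token in field.split():
        if depth == 0 and token.lower() == "and":
            names.append(" ".join(current))
            current = []
            continue
        # Track braces so that "{Barnes and Noble}" stays one name.
        depth += token.count("{") - token.count("}")
        current.append(token)
    names.append(" ".join(current))
    return [n for n in names if n]


def _split_von(words):
    """Leading lower-case words form the 'von' part; the rest is the last name."""
    k = 0
    while k < len(words) - 1 and words[k][0].islower():
        k += 1
    return " ".join(words[:k]), " ".join(words[k:])


def parse_name(name):
    """Parse one name into first/von/last/jr parts (simplified rules)."""
    name = " ".join(name.split())
    # Crude handling of a fully braced (e.g. corporate) name.
    if name.startswith("{") and name.endswith("}") and name.count("{") == 1:
        return {"first": "", "von": "", "last": name[1:-1], "jr": ""}
    parts = [p.strip() for p in name.split(",")]
    if len(parts) >= 3:        # "von Last, Jr, First"
        von_last, jr, first = parts[0], parts[1], " ".join(parts[2:])
    elif len(parts) == 2:      # "von Last, First"
        von_last, jr, first = parts[0], "", parts[1]
    else:                      # "First von Last"
        words = name.split()
        i = 0
        while i < len(words) - 1 and not words[i][0].islower():
            i += 1
        first, von_last, jr = " ".join(words[:i]), " ".join(words[i:]), ""
    von, last = _split_von(von_last.split())
    return {"first": first, "von": von, "last": last, "jr": jr}
```

For example, <code>[parse_name(n) for n in split_authors(value)]</code> would turn one ''author'' field value into a list of structured names, one per creator.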
 
====Suggested steps to prepare BibTeX files for import====
 
*Normalize the BibTeX file with BibTool (this resolves macros, unifies the syntax, and can be used to map field names).
*Parse the normalized records.
*Provide a mapping for non-standard fields (and possibly genres).
*Handle the substructure of fields:
**Multiple entries in the ''author'' and ''keyword'' fields (see also http://nwalsh.com/tex/texhelp/bibtx-23.html).
**(La)TeX encoding of special characters and formulae (see, for example, https://dev.livingreviews.org/projects/epubtk/browser/trunk/ePubTk/lib/charmaps/tex2unicode.py).
*Map BibTeX fields and genres (including non-standard ones) to the eSciDoc PubItem application profile. The mapping is documented [[PubMan_Func_Spec_Easy_Submission/Bibtex_mapping|here]].
*Java tools to evaluate:
** http://jabref.sourceforge.net/
** http://www-plan.cs.colorado.edu/henkel/stuff/javabib/
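
The step of decoding (La)TeX-encoded special characters can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the tex2unicode.py charmap linked above: the function name <code>tex_to_unicode</code> and the small tables are hypothetical, and only a handful of accent commands and symbols are covered. Formulae, dotless ''i'' and other commands are left untouched apart from brace removal.

```python
import re
import unicodedata

# Accent commands mapped to Unicode combining characters (deliberately small).
COMBINING = {
    '"': "\u0308",  # diaeresis
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    "~": "\u0303",  # tilde
}
# Stand-alone symbol commands (again only a sample).
SYMBOLS = {"ss": "\u00df", "ae": "\u00e6", "AE": "\u00c6",
           "aa": "\u00e5", "AA": "\u00c5", "o": "\u00f8", "O": "\u00d8"}

# Matches \"u as well as \"{u}
ACCENT = re.compile(r"""\\(["'`^~])(?:\{([A-Za-z])\}|([A-Za-z]))""")
# Matches \c{c}
CEDILLA = re.compile(r"\\c\{([A-Za-z])\}")
# Matches \ss, \o, ... when not followed by another letter
SYMBOL = re.compile(r"\\(ss|ae|AE|aa|AA|o|O)(?![A-Za-z])(?:\{\}|\s)?")


def tex_to_unicode(text):
    """Replace a few common TeX accent and symbol commands by Unicode."""
    def accent(match):
        letter = match.group(2) or match.group(3)
        # Compose base letter + combining mark into one code point.
        return unicodedata.normalize("NFC", letter + COMBINING[match.group(1)])

    text = ACCENT.sub(accent, text)
    text = CEDILLA.sub(
        lambda m: unicodedata.normalize("NFC", m.group(1) + "\u0327"), text)
    text = SYMBOL.sub(lambda m: SYMBOLS[m.group(1)], text)
    # Drop grouping braces; this is too crude for formulae or nested commands.
    return text.replace("{", "").replace("}", "")
```

Composing the base letter with a combining mark and normalizing to NFC keeps the table short; a production charmap would need far more commands and would have to leave formulae untouched.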
 
===Future development===
*Upload files in structured format containing more than one reference.
See [[Talk:PubMan_Ingestion|Ingestion]].

Latest revision as of 12:39, 5 January 2011