Difference between revisions of "PubMan Func Spec Easy Submission"

{{PubMan_Funtional_Specification}}
#REDIRECT [[PubMan Func Spec Ingestion]]
 
=Functional Specification=
==UC_PM_EASM_01 upload file in structured format==
===Status/Schedule===
*Status: '''implemented'''
*Schedule: '''R4.1'''
 
===Motivation===
*The user wants to upload a locally created BibTeX file, containing one reference.
 
===Expected outcome===
The reference is uploaded to a collection on PubMan.
 
The item is created on PubMan and can be edited/modified afterwards.
 
===Steps===
# The user chooses a collection for which they have depositor privileges.
# The user chooses to upload a file in structured format.
# The user starts the upload.
# The system processes the uploaded file, checks it for completeness, creates an item and releases it immediately. The use case ends successfully.
 
===Alternatives===
 
4.1 The user gets an error message indicating the type of error (timeout during upload, invalid file, validation rules not met).
 
4.1.a The user tries the upload again. Continue with step 3.
 
4.1.b The user cancels the upload procedure.
 
4.2 For BibTeX upload: the BibTeX record contains a "URL" field. In this case, the system creates a full text within the record. The user can specify the content type and change the MIME type in the edit mask afterwards. If the system is unable to upload the file, the user gets an error message and continues with step 3.
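
Step 4 and the alternatives above can be sketched as a pre-check on the uploaded text. This is a minimal illustration, not the PubMan implementation: the names <code>check_upload</code> and <code>UploadResult</code>, the regular expressions, and the required field <code>title</code> are assumptions made for the example. The actual validation rules are not listed in this specification.

```python
import re
from dataclasses import dataclass

# Simplified patterns (assumption): entries open with "@type{", fields look like "name =".
ENTRY_START = re.compile(r"@(\w+)\s*\{")
FIELD_NAME = re.compile(r"(\w+)\s*=")
NON_RECORD_TYPES = ("comment", "string", "preamble")


@dataclass
class UploadResult:
    ok: bool
    error: str = ""                  # "invalid file" or "validation rules not met"
    has_full_text_url: bool = False  # alternative 4.2: the record carries a URL field


def check_upload(bibtex_text, required=("title",)):
    """Check that the upload holds exactly one reference with the required fields."""
    types = [m.group(1).lower() for m in ENTRY_START.finditer(bibtex_text)]
    records = [t for t in types if t not in NON_RECORD_TYPES]
    if len(records) != 1:
        # This use case covers files containing exactly one reference.
        return UploadResult(False, "invalid file")
    fields = {name.lower() for name in FIELD_NAME.findall(bibtex_text)}
    if any(name not in fields for name in required):
        return UploadResult(False, "validation rules not met")
    return UploadResult(True, has_full_text_url="url" in fields)


sample = "@article{knuth84, title = {Literate Programming}, url = {http://example.org/paper.pdf}}"
result = check_upload(sample)
print(result.ok, result.has_full_text_url)
```

On an error result the user either retries from step 3 or cancels. A timeout during upload would be detected by the transport layer and is therefore not modelled here.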
 
===Actors involved===
User with depositing rights for at least one collection
 
===Data involved===
BibTeX file in structured format. See the [https://zim01.gwdg.de/repos/smc/tags/public/PubMan/example_bitex_AEI.bib example file] provided by the AEI.
 
===Constraints===
*BibTeX files are idiosyncratically structured; [http://www.gerd-neugebauer.de/software/TeX/BibTool.en.html BibTool] may help with preprocessing/normalization.
** e.g. corrections of upper and lower case, resolving macros, converting between Unicode and (La)TeX encoding, etc.
*Basic TeX parsing is needed to interpret non-ASCII characters and similar constructs (see for example https://dev.livingreviews.org/projects/epubtk/browser/trunk/ePubTk/lib/bibtexlib.py).
*In BibTeX, fields are not repeatable; thus multiple authors need to be parsed from the single ''author'' field.
*BibTeX allows several formats for representing an author's name; thus the parser needs to recognize each of them. See for example http://search.cpan.org/~gward/Text-BibTeX-0.34/BibTeX/Name.pm
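
The last two constraints can be illustrated with a short Python sketch. It assumes the usual BibTeX conventions: authors are separated by the word "and" outside braces, and a name is written as "First von Last", "von Last, First" or "von Last, Jr, First". The function names are hypothetical, and the handling of the "von" part is deliberately simplified compared with a full parser such as Text::BibTeX.

```python
def split_authors(author_field):
    """Split an author field on the word 'and' at brace depth 0."""
    authors, current, depth = [], [], 0
    for token in author_field.split():
        if depth == 0 and token.lower() == "and":
            if current:
                authors.append(" ".join(current))
            current = []
            continue
        depth += token.count("{") - token.count("}")
        current.append(token)
    if current:
        authors.append(" ".join(current))
    return authors


def _split_von_last(words):
    """Leading lowercase words form the 'von' part; the final word always belongs to 'last'."""
    i = 0
    while i < len(words) - 1 and words[i][0].islower():
        i += 1
    return " ".join(words[:i]), " ".join(words[i:])


def parse_name(name):
    """Parse one name into first/von/last/jr parts (simplified heuristic)."""
    name = name.strip()
    if name.startswith("{") and name.endswith("}") and name.count("{") == 1:
        # A fully braced name (e.g. an institution) is kept as one unit.
        return {"first": "", "von": "", "last": name[1:-1], "jr": ""}
    parts = [p.strip() for p in name.split(",")]
    if len(parts) == 1:
        # "First von Last": capitalized words before the first lowercase word are first names.
        words = parts[0].split()
        i = 0
        while i < len(words) - 1 and not words[i][0].islower():
            i += 1
        first, jr = " ".join(words[:i]), ""
        von, last = _split_von_last(words[i:])
    elif len(parts) == 2:
        # "von Last, First"
        von, last = _split_von_last(parts[0].split())
        first, jr = parts[1], ""
    else:
        # "von Last, Jr, First"
        von, last = _split_von_last(parts[0].split())
        jr, first = parts[1], parts[2]
    return {"first": first, "von": von, "last": last, "jr": jr}


field = "Donald E. Knuth and van Beethoven, Ludwig and {Barnes and Noble}"
for author in split_authors(field):
    print(parse_name(author))
```

Splitting on tokens rather than on the raw substring " and " keeps braced names such as <code>{Barnes and Noble}</code> intact.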
 
====Suggested steps to prepare BibTeX files for import====
 
*Normalize BibTeX with BibTool (resolves macros, may be used to map field names, unifies the syntax).
*Parse the normalized records.
*Allow for/provide a mapping for non-standard fields (and possibly genres).
*Handle the substructure of fields:
**Multiple entries in author and keyword fields. (see also http://nwalsh.com/tex/texhelp/bibtx-23.html)
**(La)TeX encoding for special characters/formulae. (see for example https://dev.livingreviews.org/projects/epubtk/browser/trunk/ePubTk/lib/charmaps/tex2unicode.py)
*Map BibTeX fields/genres (including non-standard ones) to eSciDoc PubItem application profile. Mapping can be found [[PubMan_Func_Spec_Easy_Submission/Bibtex_mapping|here.]]
*Java tools to evaluate:
** http://jabref.sourceforge.net/
** http://www-plan.cs.colorado.edu/henkel/stuff/javabib/
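
Two of the steps above, decoding (La)TeX-encoded characters and mapping field names, can be sketched in Python as follows. The accent and symbol tables cover only a handful of commands and are not a complete TeX-to-Unicode map. The function names and the example mapping targets are hypothetical; the actual target names are defined in the BibTeX mapping linked above.

```python
import re
import unicodedata

# Combining characters for a few TeX accent commands (not exhaustive).
ACCENTS = {'"': "\u0308", "'": "\u0301", "`": "\u0300", "^": "\u0302", "~": "\u0303", "c": "\u0327"}
# A few TeX symbol commands (not exhaustive).
SYMBOLS = {"ss": "\u00df", "ae": "\u00e6", "AE": "\u00c6", "aa": "\u00e5", "AA": "\u00c5", "o": "\u00f8", "O": "\u00d8"}

# Matches \"o, \"{o, \'e, \c{c ...; leftover braces are stripped afterwards.
_ACCENT_RE = re.compile(r"""\\(?:(["'`^~])\{?|(c)\{)([A-Za-z])""")
# Matches \ss, \o, \ae ... when not followed by another letter; TeX swallows one following space.
_SYMBOL_RE = re.compile(r"\\(ss|ae|AE|aa|AA|o|O)(?![A-Za-z]) ?")


def _accent(match):
    accent = match.group(1) or match.group(2)
    # Compose base letter and combining accent into a single code point where possible.
    return unicodedata.normalize("NFC", match.group(3) + ACCENTS[accent])


def tex_to_unicode(text):
    """Replace a small set of TeX accents and symbols, then drop grouping braces."""
    text = _ACCENT_RE.sub(_accent, text)
    text = _SYMBOL_RE.sub(lambda m: SYMBOLS[m.group(1)], text)
    return text.replace("{", "").replace("}", "")


def map_fields(record, mapping):
    """Rename fields according to `mapping`; fields without a mapping are returned separately."""
    mapped, unmapped = {}, {}
    for name, value in record.items():
        target = mapping.get(name.lower())
        if target:
            mapped[target] = tex_to_unicode(value)
        else:
            unmapped[name] = tex_to_unicode(value)
    return mapped, unmapped


# Hypothetical target names, for illustration only.
example_mapping = {"title": "title", "journal": "source.title"}
record = {"title": r'{G}{\"o}del and Fran\c{c}ois', "journal": r"Stra{\ss}e", "localnote": "checked"}
print(map_fields(record, example_mapping))
```

Returning unmapped fields separately makes non-standard fields visible, so that a mapping can be added for them instead of dropping them silently.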
 
===Future development===
*Upload files in structured format containing more than one reference; see [[Talk:PubMan_Ingestion|Ingestion]].

Latest revision as of 12:39, 5 January 2011