Difference between revisions of "Talk:Linguistic Literature Single Pages"

From MPDLMediaWiki
Line 14: Line 14:


==general info regarding the UC==
'''URL of a single page'''
* view by URL/download of single page (calculated on the fly?)
* maybe check again the requirement stated as
Line 27: Line 29:
:* check Digilib Page as well


Old revision:
*initial idea developed by Malte, Natasa
**constraints of core service:
***core service does not know on which page the term is found
***it would be huge modification to Fedora gsearch to actually enable this.
**have the single file PDF search as a separate service for each file in the fulltext search result

*basic use case on full text search and links to pages maybe already supported via OpenParameters of adobe?
(Firefox, IE8 play nice..) http://pubman.mpdl.mpg.de/pubman/item/escidoc:383417:3/component/escidoc:383415/amerindian_gomez2008-1_s.pdf#search=language

New revision:
'''Search within a document'''
* initial idea developed by Malte, Natasa
** constraints of core service:
*** core service does not know on which page the term is found
*** it would be huge modification to Fedora search to actually enable this.
** have the single file PDF search as a separate service for each file in the fulltext search result
* basic use case on full text search and links to pages maybe already supported via OpenParameters of adobe?
: See one example here: http://pubman.mpdl.mpg.de/pubman/item/escidoc:383417:3/component/escidoc:383415/amerindian_gomez2008-1_s.pdf#search=language (Firefox, IE8 play nice..)


==Java PDF==
* http://java-source.net/open-source/pdf-libraries/jpedal
* http://pdfbox.apache.org/

Revision as of 12:19, 7 April 2010

Links

More on PDF

Interesting to annotate also images, pdfs etc.

general info regarding the UC

URL of a single page

  • view by URL/download of single page (calculated on the fly?)
  • maybe re-check the requirement, which is stated as:
"When referencing one page within a PDF via a URL, this URL shall include the logical page number. The logical page numbers are not always in one sequence. For example, it could happen that a document first starts with 5 unnumbered pages, then continues with pages I - V, and then with pages 3-66. That means that no information concerning the logical page numbers can be derived from the total number of pages. That also means that the unnumbered pages cannot be referenced."
  • what happens if there is a logical page number that does not form a valid URL?
That would mean we have to validate it, which would be too much effort.
Proposal: stick with physical page numbers. --Natasa 12:14, 10 March 2010 (UTC)
As far as I can see, the idea of making single pages accessible via URL also came about because there should be a way to link citations to fulltexts, as is possible with Google Books. This scenario only works with logical page numbers, because those are what citation data typically contains. (When put into a URL, most probably as a query parameter, page numbers should be URL-encoded, so there is no such thing as an "invalid page number".) --Robert 12:38, 10 March 2010 (UTC)
What if the page is not numbered at all? --Natasa 14:03, 10 March 2010 (UTC)
You can always make the pages accessible by physical page number as well; just add another URL parameter to the API. --Robert 14:33, 10 March 2010 (UTC)
Thanks, good idea to offer both. --Natasa 12:07, 11 March 2010 (UTC)
  • or if the logical page number is something like 4/99, we might have problems (to be checked with devs)
  • check Digilib Page as well
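
The proposal from the thread above (offer both logical and physical page numbers, URL-encoded as query parameters) could be sketched roughly as follows. This is only an illustration in Python: the parameter names `logicalPage` and `physicalPage`, the helper `page_url` and the example.org base URL are assumptions made for this sketch, not part of any existing API.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def page_url(base_url, logical=None, physical=None):
    """Build a link to a single page of a document.

    Exactly one of `logical` (page label as printed, e.g. "IV" or "4/99")
    or `physical` (1-based position in the file) must be given.
    Both parameter names are hypothetical.
    """
    if (logical is None) == (physical is None):
        raise ValueError("give exactly one of logical or physical")
    if logical is not None:
        params = {"logicalPage": logical}
    else:
        if physical < 1:
            raise ValueError("physical page numbers start at 1")
        params = {"physicalPage": physical}
    # urlencode percent-encodes reserved characters such as "/".
    return base_url + "?" + urlencode(params)


base = "http://example.org/item/doc"  # placeholder base URL
logical_link = page_url(base, logical="4/99")
physical_link = page_url(base, physical=3)

# The receiving side gets the original label back after decoding.
decoded = parse_qs(urlsplit(logical_link).query)["logicalPage"][0]
```

With encoding in place, a logical page number such as 4/99 is carried as `logicalPage=4%2F99`, so the label itself needs no validation to be usable in a URL, and unnumbered pages stay reachable through the physical number.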

Search within a document

  • initial idea developed by Malte, Natasa
    • constraints of core service:
      • core service does not know on which page the term is found
      • it would be a huge modification to Fedora search to actually enable this.
    • have the single file PDF search as a separate service for each file in the fulltext search result
  • the basic use case of full-text search and links to pages may already be supported via Adobe's PDF open parameters?
See one example here: http://pubman.mpdl.mpg.de/pubman/item/escidoc:383417:3/component/escidoc:383415/amerindian_gomez2008-1_s.pdf#search=language (Firefox and IE8 play nice.)
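
A separate per-file search service, as suggested above, could look roughly like this. It is a minimal sketch in Python, assuming the text of each page has already been extracted by some PDF library and is passed in as a list of strings, one per physical page. The names `pages_containing` and `viewer_link` and the example.org URL are hypothetical. The link combines the `search` open parameter from the example URL with the `page` open parameter; whether a given browser or viewer plugin honours them would still need to be checked.

```python
from urllib.parse import quote


def pages_containing(page_texts, term):
    """Return the 1-based physical page numbers whose text contains `term`.

    `page_texts` is assumed to hold the extracted text of each page in order.
    Matching is a simple case-insensitive substring test.
    """
    needle = term.casefold()
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if needle in text.casefold()
    ]


def viewer_link(pdf_url, page, term):
    """Append open parameters so a viewer can jump to the page and highlight the term."""
    return f"{pdf_url}#page={page}&search={quote(term)}"


# Toy input standing in for text extracted from a four-page PDF.
pages = [
    "Introduction",
    "The Language of the region",
    "Maps and tables",
    "Notes on language contact",
]
hits = pages_containing(pages, "language")
links = [viewer_link("http://example.org/doc.pdf", p, "language") for p in hits]
```

Because the lookup runs per file, the core service does not need to know on which page a term occurs; it only has to hand over the file that matched in the fulltext search.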

Java PDF