PubMan Web Syndication Feeds


Motivation

Web feeds allow software programs to check for updates published on a web site. To provide a web feed, a site owner may use specialized software (such as a content management system) that publishes a list (or "feed") of recent articles or content in a standardized, machine-readable format. The feed can then be downloaded by service providers that syndicate content from the feed, or by feed reader programs that allow Internet users to subscribe to feeds and view their content.

In the PubMan context, several kinds of syndication feed are reasonable, e.g.:

  • recent releases in repository (item versions)
  • recent releases for a specific Organizational Unit (item versions)
  • recent changes for a specific publication
  • the results of any advanced search
  • Workspaces: latest submissions or changes

Usage

The Web Syndication Feed (WSF) interface of PubMan can be used

  • directly by users in browsers (Firefox, Internet Explorer, Opera, etc.) that already have built-in support for managing feeds
  • for the automated generation of institutes' web sites. See Feeding local webpages for more details.
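
As an illustration of the second use case, the following minimal sketch turns an RSS 2.0 feed into an HTML list that an institute page could embed. It is only a sketch under assumptions: the `SAMPLE_RSS` text is made up and stands in for a downloaded feed, and the function name `rss_to_html_list` is hypothetical, not part of PubMan.

```python
import html
import xml.etree.ElementTree as ET

# Made-up sample standing in for a downloaded feed; not real PubMan output.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Recent releases</title>
    <link>http://pubman.mpdl.mpg.de/</link>
    <description>Sample feed</description>
    <item>
      <title>First &amp; foremost</title>
      <link>http://example.org/item/1</link>
    </item>
    <item>
      <title>Second item</title>
      <link>http://example.org/item/2</link>
    </item>
  </channel>
</rss>"""


def rss_to_html_list(rss_text, limit=10):
    """Render the first `limit` items of an RSS 2.0 feed as an HTML <ul>."""
    root = ET.fromstring(rss_text)
    rows = []
    for item in root.findall("./channel/item")[:limit]:
        title = item.findtext("title", default="(untitled)")
        link = item.findtext("link", default="#")
        # Escape both values so feed content cannot break the page markup.
        rows.append('<li><a href="{}">{}</a></li>'.format(
            html.escape(link, quote=True), html.escape(title)))
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


if __name__ == "__main__":
    print(rss_to_html_list(SAMPLE_RSS))
```

In a real page generator the feed text would be fetched from one of the feed URIs instead of being embedded in the script.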

Possible URIs

  • recent releases in repository (item versions)
    • Interface location: Home page
    • <link rel="alternate" ...>:
http://pubman.mpdl.mpg.de/feed/rss20/releases
  • recent releases for a specific Organizational Unit (item versions)
    • Interface location: Page of the Organizational Search Results
    • <link rel="alternate" ...>:

for users who are not logged in:

http://pubman.mpdl.mpg.de/feed/rss20/releases/affiliation/escidoc:persistent3

for logged-in users:

http://pubman.mpdl.mpg.de/feed/atom/releases/affiliation/escidoc:persistent3/?user=escidoc:user2
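
The URI patterns above can be sketched as a small helper. This is a minimal sketch, assuming the path layout is exactly as in the three examples; the function name `feed_url` and its parameters are hypothetical, and the `?user=` form only mirrors the example, since the handling of logged-in users is still under discussion.

```python
BASE = "http://pubman.mpdl.mpg.de/feed"


def feed_url(fmt, affiliation=None, user=None):
    """Build a feed URI following the example patterns on this page.

    fmt         -- feed format segment as used in the examples, e.g. "rss20" or "atom"
    affiliation -- optional Organizational Unit id, e.g. "escidoc:persistent3"
    user        -- optional user id; assumption taken from the logged-in example
    """
    url = "{}/{}/releases".format(BASE, fmt)
    if affiliation is not None:
        url += "/affiliation/{}".format(affiliation)
    if user is not None:
        # The example URI has a trailing slash before the query string.
        url += "/?user={}".format(user)
    return url


if __name__ == "__main__":
    print(feed_url("rss20"))
    print(feed_url("rss20", affiliation="escidoc:persistent3"))
    print(feed_url("atom", affiliation="escidoc:persistent3", user="escidoc:user2"))
```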

Question: how should logged-in users be handled? It is probably not necessary to specify ?user=escidoc:user2; the userHandler will be taken from the cookie... --Makarenko 17:15, 17 September 2008 (UTC)

Question regarding the example above: What would be the differences of releases for an affiliation if the user is logged-in or not? Maybe we should in the first step concentrate on feeds for "public views on PubMan", thus everything where the existence of an item (!!! not item component!!!) is not restricted by any means. --Inga 11:23, 18 September 2008 (UTC)

Vlad: OK, I would separate two kinds of WSFs on PubMan: feeds for "public views on PubMan", as Inga suggests, which do not depend on login, and feeds for "authorized views on PubMan".


Widespread formats

At present there are two main branches of Web Syndication Feed formats: RSS and Atom.

  • RSS can be divided into two sub-branches: RSS 1.* and RSS 2.*. See here for features. Today, most feed readers and syndication tools support both branches.
  • Atom is a relatively new WSF format with many advantages. It has several backward-compatible dialects.
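
The branches can be told apart by the root element of the feed document: RSS 2.* branch feeds (including RSS 0.91) use an `rss` root, RSS 1.0 uses `rdf:RDF`, and Atom 1.0 uses `feed` in the Atom namespace. The following is a minimal sketch of such a check; the function name `detect_feed_format` and its return labels are made up for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def detect_feed_format(xml_text):
    """Classify a feed by its root element; returns a label chosen for this sketch."""
    root = ET.fromstring(xml_text)
    if root.tag == "{%s}feed" % ATOM_NS:
        return "atom"
    if root.tag == "{%s}RDF" % RDF_NS:
        return "rss1"
    if root.tag == "rss":
        # The version attribute distinguishes e.g. 0.91 from 2.0 within this branch.
        return "rss2-branch"
    return "unknown"


if __name__ == "__main__":
    print(detect_feed_format('<rss version="2.0"><channel/></rss>'))
```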

Distribution: As of August 2008, the syndic8.com website was indexing 546,069 feeds in total, of which 86,496 were some dialect of Atom and 438,102 were some dialect of RSS (see feed summary). The following usage distribution of the RSS branches is taken from the Peachpit report of January 2007:

RSS version                   Usage
RSS 0.91 (RSS 2.* branch)     13%
RSS 1.0 (RSS 1.* branch)      17%
RSS 2.0 (RSS 2.* branch)      67%

Conclusion: It makes sense to implement RSS 2.0 and Atom first. Atom is a good candidate for implementation because of Google's support and its currently increasing usage. A later release may add support for RSS 1.* if users explicitly request it.

Implementation

  • structuredexportmanager will implement WSFs as new export formats, e.g. RSS20, ATOM, etc. The SearchAndOutput interface will thus be able to deliver WSFs via REST.
  • The ROMA project can be used for processing and generating well-formed WSFs.
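
To make the idea of a feed as an export format concrete, here is a minimal sketch that serializes a list of items as a well-formed RSS 2.0 document. It is written in Python for illustration only and is not the structuredexportmanager implementation; the function name `build_rss20` and the item keys `title`, `link`, `id` and `abstract` are assumptions, not the actual PubMan metadata fields.

```python
import xml.etree.ElementTree as ET


def build_rss20(channel_title, channel_link, channel_description, items):
    """Serialize items (dicts with hypothetical keys) as an RSS 2.0 string."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    # title, link and description are the required channel elements in RSS 2.0.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_description
    for it in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = it["title"]
        ET.SubElement(node, "link").text = it["link"]
        # The id is not a URL, so mark the guid as not being a permalink.
        guid = ET.SubElement(node, "guid", {"isPermaLink": "false"})
        guid.text = it["id"]
        if it.get("abstract"):
            ET.SubElement(node, "description").text = it["abstract"]
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    sample = [{"title": "Example publication",
               "link": "http://example.org/item/1",
               "id": "escidoc:1"}]
    print(build_rss20("Recent releases", "http://pubman.mpdl.mpg.de/",
                      "Sample channel", sample))
```

Building the tree with an XML library rather than string concatenation keeps the output well-formed even when titles contain characters such as `&` or `<`.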

Required:

  • Mapping PubMan MD -> RSS/Atom
  • Design changes in structuredexportmanager, because the transformations are implemented directly as Java beans
  • Revise the user interface to allow autodiscovery of feeds (<link rel="alternate" [...]) on the corresponding web pages
  • Identification of additional candidates for syndication
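
For the autodiscovery item, a page advertises its feed with a `<link rel="alternate">` element in the HTML head, using the MIME type `application/rss+xml` for RSS or `application/atom+xml` for Atom. The sketch below generates such an element; the helper name `autodiscovery_link` and the format keys are hypothetical, and the title text is only an example.

```python
import html

# MIME types conventionally used for feed autodiscovery.
MIME_TYPES = {
    "rss20": "application/rss+xml",
    "atom": "application/atom+xml",
}


def autodiscovery_link(fmt, href, title):
    """Return the <link> element a page would place in its <head>."""
    return '<link rel="alternate" type="{}" title="{}" href="{}" />'.format(
        MIME_TYPES[fmt],
        html.escape(title, quote=True),
        html.escape(href, quote=True),
    )


if __name__ == "__main__":
    print(autodiscovery_link(
        "rss20",
        "http://pubman.mpdl.mpg.de/feed/rss20/releases",
        "Recent releases",
    ))
```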

Further reading and related pages

RSS 2.0 Standard

JIRA Task

Atom Wikipedia

RSS Wikipedia

ROMA project