PubMan Web Syndication Feeds


Motivation

Web feeds allow software programs to check for updates published on a web site. To provide a web feed, a site owner may use specialized software (such as a content management system) that publishes a list (or "feed") of recent articles or content in a standardized, machine-readable format. The feed can then be downloaded by service providers that syndicate content from the feed, or by feed reader programs that allow Internet users to subscribe to feeds and view their content.

In the PubMan context, a range of syndication feeds is reasonable; see the section "Candidates for syndications on PubMan".

Widespread formats

At the moment there are 2 main branches of formats for Web Syndication Feeds (WSF): RSS and Atom.

  • RSS can be divided into 2 sub-branches: RSS 1.* and RSS 2.*. See here for features. Today, most feed readers and syndication tools support both branches.
  • Atom is a relatively new WSF format with many advantages. It has several backward-compatible dialects.

Distribution: As of August 2008, the syndic8.com website was indexing 546,069 feeds in total, of which 86,496 were some dialect of Atom and 438,102 were some dialect of RSS; see the feed summary. The following usage distribution of the RSS branches is taken from the Peachpit report from January 2007:

RSS version                    Usage
RSS 0.91 (RSS 2.* branch)      13%
RSS 1.0 (RSS 1.* branch)       17%
RSS 2.0 (RSS 2.* branch)       67%

Conclusion: It makes sense to implement RSS 2.0 and Atom first. Atom is a good candidate for implementation because of Google support and its currently increasing usage. A later release may add support for the RSS 1.* branch if users explicitly request it.

Usage

The Web Syndication Feed (WSF) interface of PubMan can be used

  • by users directly in browsers (Firefox, Internet Explorer, Opera, etc.), which already have built-in plugins for managing WSFs
  • for automated generation of the institutes' web sites. See Feeding local webpages for more details.


Candidates for syndications on PubMan

The WSFs can be divided into 3 groups according to their visibility in PubMan

1. Public views

Implemented

  • recent releases in repository (item versions)
    • Interface location: Home page
    • <link rel="alternate" ...>:
http://pubman.mpdl.mpg.de/syndication/feed/rss20/releases
  • recent releases for a specific Organizational Unit (item versions)
    • Interface location: Page of the Organizational Search Results
    • <link rel="alternate" ...>:
http://pubman.mpdl.mpg.de/syndication/feed/rss20/releases/affiliation/escidoc:persistent3

Further development if needed

  • recent changes for a specific publication
    • Interface location: Any View Item page
    • <link rel="alternate" ...>:
http://pubman.mpdl.mpg.de/syndication/feed/rss20/changes/item/escidoc:28123

2. Session dependent views

  • each advanced search (cannot be implemented for the moment, because the advanced search history scenario is not yet specified for PubMan)
  • Implemented, but not on the basis of session handling: a CQL query against the framework indexes is used. Web presentation is here. URL syntax:
http://pubman.mpdl.mpg.de/syndication/feed/<feedType>/search?q=<CQL query>
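The feed URL patterns listed in this section can be assembled programmatically. The following is a minimal sketch, assuming the URL layout shown in the examples on this page; the function names are hypothetical, the only feed type confirmed here is rss20, and the index name used in the usage note is an illustrative assumption rather than a documented framework index.

```python
from urllib.parse import quote

# Base URL taken from the example URLs on this page.
BASE_URL = "http://pubman.mpdl.mpg.de/syndication/feed"


def releases_feed_url(feed_type="rss20", affiliation_id=None):
    """Build the URL of a 'recent releases' feed.

    If affiliation_id is given, the feed is restricted to that
    organizational unit, following the pattern shown on this page.
    """
    url = f"{BASE_URL}/{feed_type}/releases"
    if affiliation_id is not None:
        url += f"/affiliation/{affiliation_id}"
    return url


def search_feed_url(cql_query, feed_type="rss20"):
    """Build the URL of a search feed; the CQL query is percent-encoded."""
    return f"{BASE_URL}/{feed_type}/search?q={quote(cql_query, safe='')}"
```

For example, search_feed_url('escidoc.title="open access"') percent-encodes the quotes, the equals sign, and the space, so that the query survives as a single q parameter. The index name escidoc.title is only a placeholder.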

3. Authorization dependent views

  • Workspaces:
    • Latest submissions
    • Latest changes

In the first stage of the implementation we could concentrate on the Public views syndications.

Implementation

  • The new SyndicationManager service could be located in common_services so that it is accessible to all solutions (PubMan, Faces, ViRR). The SyndicationManager can use structuredexportmanager for PubItemListXML->FeedXML transformations, treating the WSF formats as new export formats, e.g. RSS20, ATOM, etc.
  • structuredexportmanager should be redesigned to be able to export aggregated information such as the name of the feed, its description, the date of the last change, etc.

--Makarenko 09:39, 6 April 2009 (UTC):

    • The current version of the SyndicationManager calls the Search&Export interface directly to retrieve the item list
    • Search&Export delivers eSciDoc XML for the item list
  • The ROME project provides a set of open source Java tools which can be used for processing and generating well-formed WSFs.
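To illustrate the item list to feed transformation discussed above, including the aggregated channel information (feed name, description, date of last change), here is a minimal sketch in Python using only the standard library. It is not the SyndicationManager code and does not use ROME; the function name, the item dictionary keys (title, link, released), and the example.org item URL are all hypothetical stand-ins for data that would come from the eSciDoc XML item list.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss20(title, link, description, items):
    """Return an RSS 2.0 document (as a string) for a list of item dicts.

    Each item dict is assumed to have the hypothetical keys 'title',
    'link' and 'released' (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # The three channel elements required by RSS 2.0.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    if items:
        # Date of last change: the most recent release date in the list.
        latest = max(item["released"] for item in items)
        ET.SubElement(channel, "lastBuildDate").text = format_datetime(latest)
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        # The item link doubles as a permalink-style guid.
        ET.SubElement(node, "guid").text = item["link"]
        ET.SubElement(node, "pubDate").text = format_datetime(item["released"])
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    sample = [{
        "title": "Sample publication",
        "link": "http://example.org/item/1",  # hypothetical item URL
        "released": datetime(2009, 4, 6, 9, 39, tzinfo=timezone.utc),
    }]
    print(build_rss20("Recent releases", "http://pubman.mpdl.mpg.de/",
                      "Recently released items", sample))
```

ElementTree escapes special characters in titles automatically, which is one way to keep the generated feed well-formed.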

Outcome of the SyndicationManager design and architecture meeting

  • no EJB interface will be implemented
  • SyndicationManager will consist of a single presentation module (syndication_presentation)
  • The transformation to RSS/Atom (ROME) will be done in the new Transformation Service (in R4, the transformation will be encapsulated in its own class in the SyndicationManager)
  • The feed definition (configuration XML) will be held in the SyndicationManager
  • The feed definition should be extendable by:
    • definition of alternative search/data services
    • definition of the formats of the input/output of these services
  • SyndicationManager will use SearchAndExport instead of the Search module (in R4, Search will still be used and only publication items will be used for syndication)
  • Therefore, SearchAndExport will be extended by a pure "eSciDoc XML" output format
  • Tom will test/explore caching in a proxy.

Required ToDos:

  • Mapping PubMan metadata (MD) -> RSS/Atom
  • Design of the SyndicationManager component
  • Revise the user interface to allow auto-discovery of feeds (<link rel="alternate" [...]>) on the corresponding web pages
  • Identification of additional candidates for syndication
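The feed auto-discovery item in the list above relies on a link element in the head of the page that advertises the feed. A minimal sketch of generating such an element follows; the function name is hypothetical, and the "atom" key is an assumed feed type label (only rss20 appears in the URLs on this page). The MIME types are the ones commonly used for RSS and Atom feeds.

```python
from html import escape

# MIME types conventionally used for feed auto-discovery.
# The key "atom" is an assumed feed type label, not taken from this page.
FEED_MIME_TYPES = {
    "rss20": "application/rss+xml",
    "atom": "application/atom+xml",
}


def autodiscovery_link(feed_url, title, feed_type="rss20"):
    """Return a <link rel="alternate"> element advertising a feed."""
    mime = FEED_MIME_TYPES[feed_type]
    # Escape attribute values so that '&' or quotes in the URL or title
    # cannot break the surrounding HTML.
    return (
        f'<link rel="alternate" type="{mime}" '
        f'title="{escape(title, quote=True)}" '
        f'href="{escape(feed_url, quote=True)}" />'
    )
```

Browsers and feed readers that support auto-discovery look for this element when a page is loaded and offer the feed for subscription.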

See the comment regarding naming on Talk:PubMan_Web_Syndication_Feeds

Further reading and related pages

RSS 2.0 Standard

JIRA Tasks: AS-377, AS-586

Atom Wikipedia

RSS Wikipedia

ROME project

Media RSS Module (mrss), ROME module for mrss.

See Design in EA: /Desing Model/Use Case Realization/SyndicationManager

What Is RSS by Mark Pilgrim

W3C Feed Validator