Difference between revisions of "PubMan Web Syndication Feeds"

Line 2: Line 2:
Web feeds allow software programs to check for updates published on a web site. To provide a web feed, a site owner may use specialized software (such as a content management system) that publishes a list (or "feed") of recent articles or content in a standardized, machine-readable format. The feed can then be downloaded by service providers that syndicate content from the feed, or by feed reader programs that allow Internet users to subscribe to feeds and view their content.  


In the [[PubMan]] context a range of syndications are reasonable, e.g.:
In the [[PubMan]] context a range of syndications are reasonable, see [[#Candidates for syndications on PubMan]]
* recent releases in repository (item versions)
* recent releases for a specific Organization Unit (item versions)
* recent changes for a specific publication
* each advanced search
* Workspaces: latest submissions or changes
 
==Usage==
The Web Syndication Feed (WSF) interface of PubMan can be used
* by users directly in browsers (Firefox, Internet Explorer, Opera, etc.) that already have built-in support for managing WSFs
* for automated generation of institutes' web sites. See [[PubMan_Feeding_local_webpages | Feeding local webpages]] for more details.
 
== Possible URIs ==
* recent releases in repository (item versions)
** '''Interface location''': Home page
** <link rel="alternate" ...>:
<pre>http://pubman.mpdl.mpg.de/feed/rss20/releases</pre>
* recent releases for a specific Organization Unit (item versions)
** '''Interface location''': Page of the ''Organizational Search'' Results 
** <link rel="alternate" ...>:
for non-logged in users:
<pre>http://pubman.mpdl.mpg.de/feed/rss20/releases/affiliation/escidoc:persistent3</pre>
for logged in users:
<pre>http://pubman.mpdl.mpg.de/feed/atom/releases/affiliation/escidoc:persistent3/?user=escidoc:user2</pre>
 
'''Question:''' How should logged-in users be handled? It is probably not necessary to define ''?user=escidoc:user2''; the ''userHandler'' will be taken from the cookie... --[[User:Makarenko|Makarenko]] 17:15, 17 September 2008 (UTC)
 
'''Question''' regarding the example above: What would be the differences of releases for an affiliation if the user is logged-in or not? Maybe we should in the first step concentrate on feeds for "public views on PubMan", thus everything where the existence of an item (!!! not item component!!!) is not restricted by any means. --[[User:Inga|Inga]] 11:23, 18 September 2008 (UTC)
 
'''Vlad:''' OK, I would distinguish two kinds of WSFs on PubMan: those for '''"public views on PubMan"''', as [[User:Inga|Inga]] suggests, which do not depend on login, and those for '''"authorized views on PubMan"'''.
 
=== ===


==Widespread formats==
Line 54: Line 23:


'''Conclusion:''' It makes sense to implement '''RSS 2.0''' and '''Atom''' first. '''Atom''' is a good candidate for implementation because of Google's support and its currently increasing usage. A later release may add support for '''RSS 1.*''' if users explicitly request it.
==Usage==
The '''Web Syndication Feed (WSF)''' interface of PubMan can be used
* by users directly in browsers (Firefox, Internet Explorer, Opera, etc.) that already have built-in support for managing WSFs
* for automated generation of institutes' web sites. See [[PubMan_Feeding_local_webpages | Feeding local webpages]] for more details.
==Candidates for syndications on PubMan==
The WSFs can be divided into three groups according to their visibility in PubMan:
===1. Public views===
* recent releases in repository (item versions)
** '''Interface location''': Home page
** <link rel="alternate" ...>:
<pre>http://pubman.mpdl.mpg.de/feed/rss20/releases</pre>
* recent releases for a specific Organization Unit (item versions)
** '''Interface location''': Page of the ''Organizational Search'' Results 
** <link rel="alternate" ...>:
<pre>http://pubman.mpdl.mpg.de/feed/rss20/releases/affiliation/escidoc:persistent3</pre>
* recent changes for a specific publication
** '''Interface location''': Any '''View Item''' page
** <link rel="alternate" ...>:
<pre>http://pubman.mpdl.mpg.de/feed/rss20/changes/item/escidoc:28123</pre>
===2. Session dependent views===
* each advanced search (cannot be implemented for the moment because the advanced search history scenario is not yet specified for PubMan)
===3. Authorization dependent views===
* Workspaces:
** Latest submissions
** Latest changes
In the first stage of the implementation we could concentrate on the syndications listed under [[#1. Public views]].


== Implementation ==
*'''structuredexportmanager''' will include the implementation of the functionality that treats WSFs as new export formats, e.g. '''RSS20''', '''ATOM''', etc. The '''SearchAndOutput''' interface will thus be able to deliver WSFs via REST.
* A new component, '''feedermanager''', could be located in '''common_services''' so that it is accessible to all [[eSciDoc]] solutions, not only [[PubMan]].
*'''structuredexportmanager''' can include the PubItemListXML->FeedXML transformations, with WSFs as new export formats, e.g. '''RSS20''', '''ATOM''', etc. However, the design of '''structuredexportmanager''' should be extended so that it can export aggregated information such as the name of the feed, its description, the date of the last change, etc.
 
*The [https://rome.dev.java.net/ ROME project] can be used for processing and generating well-formed WSFs.



Revision as of 13:39, 18 September 2008

Motivation

Web feeds allow software programs to check for updates published on a web site. To provide a web feed, a site owner may use specialized software (such as a content management system) that publishes a list (or "feed") of recent articles or content in a standardized, machine-readable format. The feed can then be downloaded by service providers that syndicate content from the feed, or by feed reader programs that allow Internet users to subscribe to feeds and view their content.

In the PubMan context, a range of syndications is reasonable; see "Candidates for syndications on PubMan" below.

Widespread formats

At present there are two main families of Web Syndication Feed (WSF) formats: RSS and Atom.

  • RSS can be divided into two sub-branches: RSS 1.* and RSS 2.*. See here for features. Today, most feed readers and syndication tools support both branches.
  • Atom is a relatively new WSF format with many advantages. It has several backward-compatible dialects.

Distribution: As of August 2008, the syndic8.com website was indexing 546,069 feeds in total, of which 86,496 were some dialect of Atom and 438,102 were some dialect of RSS (see feed summary). The following usage distribution of the RSS branches is taken from the Peachpit report of January 2007:

RSS version                  Usage
RSS 0.91 (RSS 2.* branch)    13%
RSS 1.0 (RSS 1.* branch)     17%
RSS 2.0 (RSS 2.* branch)     67%

Conclusion: It makes sense to implement RSS 2.0 and Atom first. Atom is a good candidate for implementation because of Google's support and its currently increasing usage. A later release may add support for RSS 1.* if users explicitly request it.

Usage

The Web Syndication Feed (WSF) interface of PubMan can be used

  • by users directly in browsers (Firefox, Internet Explorer, Opera, etc.) that already have built-in support for managing WSFs
  • for automated generation of institutes' web sites. See Feeding local webpages for more details.
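
The second use case, generating part of an institute's web page from a feed, could look roughly like the following. This is only a minimal sketch: the sample feed content, the `example.org` links, and the helper name `feed_to_html_list` are made up for illustration, and a real page generator would download the feed from PubMan instead of using an inline string.

```python
import xml.etree.ElementTree as ET
from html import escape

# Inline sample standing in for a downloaded RSS 2.0 feed (hypothetical content).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Recent releases</title>
    <link>http://pubman.mpdl.mpg.de/</link>
    <description>Sample feed for illustration</description>
    <item>
      <title>First publication &amp; data</title>
      <link>http://example.org/item/1</link>
    </item>
    <item>
      <title>Second publication</title>
      <link>http://example.org/item/2</link>
    </item>
  </channel>
</rss>"""


def feed_to_html_list(feed_xml: str, limit: int = 10) -> str:
    """Render the first `limit` items of an RSS 2.0 feed as an HTML <ul> fragment."""
    root = ET.fromstring(feed_xml)
    rows = []
    for item in root.findall("./channel/item")[:limit]:
        title = item.findtext("title", default="(untitled)")
        link = item.findtext("link", default="#")
        # Escape both values so feed content cannot break the surrounding page.
        rows.append(
            f'  <li><a href="{escape(link, quote=True)}">{escape(title)}</a></li>'
        )
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


if __name__ == "__main__":
    print(feed_to_html_list(SAMPLE_FEED))
```

The fragment returned by `feed_to_html_list` can be embedded in a static page and regenerated on a schedule, which keeps the local web site independent of PubMan's availability at page-view time.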


Candidates for syndications on PubMan

The WSFs can be divided into three groups according to their visibility in PubMan:

1. Public views

  • recent releases in repository (item versions)
    • Interface location: Home page
    • <link rel="alternate" ...>:
http://pubman.mpdl.mpg.de/feed/rss20/releases
  • recent releases for a specific Organization Unit (item versions)
    • Interface location: Page of the Organizational Search Results
    • <link rel="alternate" ...>:
http://pubman.mpdl.mpg.de/feed/rss20/releases/affiliation/escidoc:persistent3
  • recent changes for a specific publication
    • Interface location: Any View Item page
    • <link rel="alternate" ...>:
http://pubman.mpdl.mpg.de/feed/rss20/changes/item/escidoc:28123
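
The URI patterns above can be assembled programmatically, together with the matching autodiscovery tag. The following is a sketch under assumptions: the function names are hypothetical, the `atom` path segment is assumed by analogy with `rss20`, and the MIME types are the conventional ones for RSS and Atom feeds rather than anything specified on this page.

```python
from html import escape
from typing import Optional

# Base path taken from the example URIs above.
BASE = "http://pubman.mpdl.mpg.de/feed"

# Conventional feed MIME types; the "atom" key is an assumed path segment.
MIME_TYPES = {
    "rss20": "application/rss+xml",
    "atom": "application/atom+xml",
}


def releases_feed(fmt: str = "rss20", affiliation: Optional[str] = None) -> str:
    """URI of recent releases, optionally limited to one organizational unit."""
    url = f"{BASE}/{fmt}/releases"
    if affiliation is not None:
        url += f"/affiliation/{affiliation}"
    return url


def item_changes_feed(item_id: str, fmt: str = "rss20") -> str:
    """URI of recent changes for a single publication."""
    return f"{BASE}/{fmt}/changes/item/{item_id}"


def autodiscovery_link(url: str, fmt: str, title: str) -> str:
    """Build the <link rel="alternate"> element for the page's <head>."""
    return (
        f'<link rel="alternate" type="{MIME_TYPES[fmt]}" '
        f'title="{escape(title, quote=True)}" href="{escape(url, quote=True)}" />'
    )


if __name__ == "__main__":
    url = releases_feed(affiliation="escidoc:persistent3")
    print(autodiscovery_link(url, "rss20", "Recent releases"))
```

Keeping the format as a path segment means the same three builders cover every feed format that is added later.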

2. Session dependent views

  • each advanced search (cannot be implemented for the moment because the advanced search history scenario is not yet specified for PubMan)

3. Authorization dependent views

  • Workspaces:
    • Latest submissions
    • Latest changes

In the first stage of the implementation we could concentrate on the syndications listed under "1. Public views".


Implementation

  • A new component, feedermanager, could be located in common_services so that it is accessible to all eSciDoc solutions, not only PubMan.
  • structuredexportmanager can include the PubItemListXML->FeedXML transformations, with WSFs as new export formats, e.g. RSS20, ATOM, etc. However, the design of structuredexportmanager should be extended so that it can export aggregated information such as the name of the feed, its description, the date of the last change, etc.
  • The ROME project can be used for processing and generating well-formed WSFs.
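
To illustrate what "aggregated information" means for an item-list-to-feed transformation, here is a rough sketch that produces RSS 2.0 with feed-level name, description, and date of last change. It is written in Python with the standard library purely for illustration; it is not the structuredexportmanager design, and the function name `build_rss20`, the item dictionary keys, and the `example.org` links are assumptions.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss20(feed_title, feed_link, feed_description, items):
    """Build an RSS 2.0 document from a list of item dicts.

    Each item dict is assumed to carry 'title', 'link' and 'modified'
    (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")

    # Aggregated, feed-level information.
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_description
    if items:
        last_change = max(entry["modified"] for entry in items)
        # RSS 2.0 dates use the RFC 822 style produced by format_datetime.
        ET.SubElement(channel, "lastBuildDate").text = format_datetime(last_change)

    # Per-item information.
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
        ET.SubElement(item, "pubDate").text = format_datetime(entry["modified"])

    return ET.tostring(rss, encoding="unicode")


SAMPLE_ITEMS = [
    {"title": "Item A", "link": "http://example.org/item/a",
     "modified": datetime(2008, 9, 17, 17, 15, tzinfo=timezone.utc)},
    {"title": "Item B", "link": "http://example.org/item/b",
     "modified": datetime(2008, 9, 18, 13, 39, tzinfo=timezone.utc)},
]

if __name__ == "__main__":
    print(build_rss20("Recent releases", "http://pubman.mpdl.mpg.de/",
                      "Recent releases in the repository", SAMPLE_ITEMS))
```

The point of the sketch is that the feed-level values (title, description, last change) have to be supplied or derived alongside the item list, which is why a per-item export design needs extending.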

Required:

  • Mapping of PubMan metadata (MD) -> RSS/Atom
  • Design changes in structuredexportmanager due to the direct Java bean implementation of the transformations
  • Revise the user interface to allow autodiscovery of feeds (<link rel="alternate" [...]>) on the corresponding web pages
  • Identification of additional candidates for syndication

Further reading and related pages

RSS 2.0 Standard

JIRA Task

Atom Wikipedia

RSS Wikipedia

ROME project