PubMan Func Spec Search Engine Optimization

From MPDLMediaWiki
Revision as of 11:29, 26 February 2010 by Martin

This page describes how the appearance of PubMan content in search engines, especially Google, could be improved.

---- Work in progress ----


Current Situation with Google

PubMan

Example: Search for “Experimental demonstration of a suspended diffractively coupled optical cavity”.


  • The headline is taken from the 'title' tag currently used in PubMan:

Publication Manager :: Experimental demonstration of a suspended ...

  • The snippet contains a confusing mixture of irrelevant information:

Item StatisticsRevisionsRelease HistoryView item. Experimental demonstration of a suspended diffractively coupled optical cavity. Item is Released ...


Google prefers to take the snippet text from the 'description' meta tag (up to 150 characters). PubMan currently does not use the 'description' tag.

Alternatively, Google uses information from the Open Directory Project (this can be prevented through the 'robots' meta tag).
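
As an illustration of the two tags mentioned above, the following is a minimal Python sketch that builds a 'description' tag limited to 150 characters and a 'robots' tag that opts out of Open Directory Project descriptions. The function names and the word-boundary truncation are assumptions made for this example and are not part of PubMan; "noodp" is the value search engines have used for the Open Directory opt-out.

```python
from html import escape

MAX_DESCRIPTION_LENGTH = 150  # limit mentioned in the text above


def build_description_tag(text, limit=MAX_DESCRIPTION_LENGTH):
    """Return a 'description' meta tag with at most `limit` characters of text."""
    collapsed = " ".join(text.split())  # collapse line breaks and repeated spaces
    if len(collapsed) > limit:
        # Cut at the last word boundary that still leaves room for "..."
        collapsed = collapsed[: limit - 3].rsplit(" ", 1)[0] + "..."
    # Escape quotes and angle brackets so the text is safe inside an attribute
    return '<meta name="description" content="{}">'.format(escape(collapsed, quote=True))


def build_robots_tag():
    """Return a 'robots' meta tag asking engines not to use Open Directory text."""
    return '<meta name="robots" content="noodp">'


print(build_description_tag(
    "Experimental demonstration of a suspended diffractively coupled optical cavity"
))
print(build_robots_tag())
```

Which PubMan metadata should fill the description text is left open here; the title is used only because it is the example from this page.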

Articles indexed by Google Scholar have structured snippets with information about the author(s), publication year and full-length title. Additionally, features like “cited by”, “related articles” or “all xy versions” become available for those articles.

eDoc

eDoc content appears with structured information in the snippet, although the information shown is not always the same: sometimes Google takes the beginning of the abstract, while in other cases it displays, for example, the full-length title and the authors.

Future State

eDoc Solution

Robots trying to access PubMan are redirected to an index of PubMan content, which offers a predefined set of metadata as plain text to be indexed by the search engine. The search engine uses the information provided in the tags of the index to fill the snippet on its search result page. As a consequence, robots never access the live system.

This solution is supposed to work not only for Google but for most search engines.
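
The redirect described above could be sketched as follows in Python. This is only an illustration under stated assumptions: robots are recognised by a substring of their User-Agent header, the token list is illustrative and not exhaustive, and the two URL prefixes and the item identifier are hypothetical placeholders rather than real PubMan paths.

```python
# Illustrative, non-exhaustive User-Agent substrings of well-known crawlers
ROBOT_TOKENS = ("googlebot", "bingbot", "slurp")

INDEX_PREFIX = "/seo-index/"   # hypothetical location of the static index
LIVE_PREFIX = "/pubman/item/"  # hypothetical location of the live item view


def is_robot(user_agent):
    """Return True if the User-Agent header contains a known crawler token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in ROBOT_TOKENS)


def resolve_location(user_agent, item_id):
    """Send robots to the index entry and all other clients to the live system."""
    prefix = INDEX_PREFIX if is_robot(user_agent) else LIVE_PREFIX
    return prefix + item_id


print(resolve_location("Mozilla/5.0 (compatible; Googlebot/2.1)", "item-1"))
print(resolve_location("Mozilla/5.0 (Windows NT 10.0) Firefox/120.0", "item-1"))
```

Matching on the User-Agent header is one simple way to make the rule apply to most search engines rather than only Google, as long as the token list is maintained.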


Metadata provided in the index could be:

  • Author(s): taken from PubMan creator. For items with more than three authors, the list should be cut after the third author and end with ",...".
  • Full length title: taken from PubMan title.
  • Abstract: taken from PubMan Abstract.
  • Genre: taken from PubMan Genre.
  • Number of pages: taken from PubMan Pages.
  • Keywords: taken from PubMan Free keywords.
  • OA status: The note "Open Access" could be provided if there is at least one file in visibility status "public".
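
The rules in the list above can be sketched in Python as follows. The dictionary keys, the "; " separator between author names, the plain-text "Label: value" layout and the visibility value "private" are assumptions made for this example; only the cut after the third author, the ",..." ending and the "Open Access" note for a file with visibility "public" come from the list. The author names and genre in the sample item are made-up placeholders.

```python
def format_authors(creators, max_authors=3):
    """Join author names; cut after the third author and end with ',...'."""
    shown = "; ".join(creators[:max_authors])  # "; " separator is an assumption
    return shown + ",..." if len(creators) > max_authors else shown


def open_access_note(files):
    """Return 'Open Access' if at least one file has visibility 'public'."""
    return "Open Access" if any(f.get("visibility") == "public" for f in files) else ""


def build_index_entry(item):
    """Render one item as plain-text 'Label: value' lines, skipping empty fields."""
    fields = [
        ("Author(s)", format_authors(item.get("creators", []))),
        ("Title", item.get("title", "")),
        ("Abstract", item.get("abstract", "")),
        ("Genre", item.get("genre", "")),
        ("Pages", item.get("pages", "")),
        ("Keywords", item.get("keywords", "")),
        ("Note", open_access_note(item.get("files", []))),
    ]
    return "\n".join(f"{label}: {value}" for label, value in fields if value)


# Sample item with placeholder values (not real PubMan data)
example_item = {
    "creators": ["Author A", "Author B", "Author C", "Author D"],
    "title": "Experimental demonstration of a suspended diffractively coupled optical cavity",
    "genre": "Article",
    "files": [{"visibility": "private"}, {"visibility": "public"}],
}
print(build_index_entry(example_item))
```

Fields without a value are omitted so that the index entry never shows empty labels in a snippet.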



Links

  • CoLab page about SEO [1]