Difference between revisions of "Talk:PubMan Func Spec Statistics"

(re-structured talk page according to new page set up, copied current comments)

Revision as of 19:20, 27 October 2008

Scenarios

For comments and discussion, see the talk/discussion page.

Basic PubMan item statistics

The public user wants an indication of how a certain article has been received and would like to see the download numbers of a specific preprint when accessing the item page in PubMan.

Interpretations/Analysis

  • Google/Google Scholar

The author wants to understand the visibility of his research. He wants to know how often his article is accessed via a hit from a Google/Google Scholar search.

  • geographical distribution

The author wants to understand the geographical distribution of research interest.

  • domain statistics

The author wants to understand the background of users accessing his research and would like to know the domain they are coming from, such as .com, .edu, .gov.

  • institutional statistics

The author wants to know whether his publication is accessed by colleagues from the same or neighbouring departments of his institution.


Most of these requirements need the analysis of IP addresses to get geographic or location-specific information about the user. The eSciDoc core service does not directly support gathering and storing these IP addresses automatically, so this must be realized in the solution. But then we have the problem that more than one solution can access one repository, and the repository can even be accessed without any solution, e.g. via the REST interface. Such requests would not be counted and the statistics would be distorted. --Haarlaender 08:58, 27 October 2008 (UTC)

In any case, any provision of statistical data has to come with a "disclaimer" explaining how to read and understand the statistics. --Ulla 17:10, 27 October 2008 (UTC)
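As a rough illustration of the domain statistics scenario, the following Python sketch counts accesses per top-level domain. It assumes that the solution has already logged the requesting hosts and reverse-resolved their IP addresses to hostnames, which the eSciDoc core service does not do by itself according to the comment above. The function names `top_level_domain` and `count_by_domain` are hypothetical and not part of PubMan or eSciDoc.

```python
from collections import Counter


def top_level_domain(hostname):
    """Return the top-level domain of a hostname, e.g. '.edu'.

    Hosts that could not be resolved (empty string or a raw IPv4
    address) are reported as 'unresolved'.
    """
    labels = hostname.strip().rstrip(".").lower().split(".")
    last = labels[-1]
    if not last or last.isdigit():
        return "unresolved"
    return "." + last


def count_by_domain(hostnames):
    """Count accesses per top-level domain (hypothetical helper)."""
    return Counter(top_level_domain(h) for h in hostnames)


if __name__ == "__main__":
    # Illustrative hostnames only, not real access data.
    sample = ["www.example.edu", "host.example.com", "192.0.2.7"]
    for domain, hits in sorted(count_by_domain(sample).items()):
        print(domain, hits)
```

Requests that bypass the solution, e.g. via the REST interface, would still be missing from such counts, so the distortion described above remains.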

Visualisations

The author wants statistical data and its analyses/interpretations to be adequately visualised by graphs, timelines, diagrams and geographical maps.

Reports

Based on administrative searches (queries on demand and saved searches), reports (item lists and/or numbers) can be delivered:

Current coverage in the repository

  • The author wants to have an overview on all his records deposited and released
  • The institution wants to have an overview on all records of a specific department deposited and released

Open Access

  • The author wants to have an overview on all his records with at least one OA component
  • The author wants to have an overview on the number of OA components
  • The institution wants to have an overview on the number of OA components per department
  • The OA policy department would like to prepare a statistical report on the increase in OA publications over the last 2 years, broken down by month, for the MPS. (See more scenarios under OA statistics)
  • The local Press department needs reliable numbers of record entries per department, including the number of fulltexts. They would like to get the numbers in a re-usable format, e.g. XML.
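The re-usable XML output requested by the Press department could look roughly like the sketch below. The element names (`report`, `department`, `records`, `fulltexts`) and the function `build_department_report` are assumptions made for illustration only; the scenarios above do not define a report schema.

```python
import xml.etree.ElementTree as ET


def build_department_report(counts):
    """Serialise per-department numbers to an XML string.

    counts maps a department name to a (records, fulltexts) tuple.
    The element and attribute names are hypothetical.
    """
    root = ET.Element("report")
    for name in sorted(counts):
        records, fulltexts = counts[name]
        dept = ET.SubElement(root, "department", name=name)
        ET.SubElement(dept, "records").text = str(records)
        ET.SubElement(dept, "fulltexts").text = str(fulltexts)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Illustrative numbers only.
    print(build_department_report({"Department A": (10, 4)}))
```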


Requirements

Basic item/component statistics

Status: implemented (see Use case view item statistics)

Schedule: R3

  • Numbers of retrievals for a specific item from the framework by users (anonymous/all).
    • Visibility: public
  • Numbers of file downloads for a specific item by users (anonymous/all).
    • Visibility: public

Downloads of files with content type “copyright transfer agreement” and “correspondence” are not counted.

  • Numbers of downloads for a specific file by users (anonymous/all).
    • Visibility: public
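The counting rule above, including the exclusion of the two content types, can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the PubMan implementation: the event format (pairs of item identifier and content type) and the names `EXCLUDED_CONTENT_TYPES` and `count_item_downloads` are hypothetical.

```python
from collections import Counter

# Content types whose downloads are not counted, as stated above.
EXCLUDED_CONTENT_TYPES = {"copyright transfer agreement", "correspondence"}


def count_item_downloads(events):
    """Count file downloads per item, skipping excluded content types.

    events is an iterable of (item_id, content_type) pairs; this
    event format is an assumption made for the sketch.
    """
    counts = Counter()
    for item_id, content_type in events:
        if content_type.strip().lower() in EXCLUDED_CONTENT_TYPES:
            continue
        counts[item_id] += 1
    return counts


if __name__ == "__main__":
    # Illustrative events only.
    events = [
        ("item:1", "preprint"),
        ("item:1", "correspondence"),
        ("item:1", "publisher version"),
    ]
    print(count_item_downloads(events))
```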


Future developments

  • Co-authoring
    • The author wants to understand how many of his papers are co-authored.
    • The author wants to understand how many of his papers are co-authored with members of Institution X.
  • Cross-repository
    • The author wants to know the number of items related to his name across the collection/contexts in the repository.
  • Harvesting statistics
    • The public user wants to get an overview of the reception of a certain preprint, which is not limited to the local repository. He would like to see statistics on the download of the preprint by summing the download numbers of all copies located in distributed repositories.

It is not clear to me how this should be technically possible. It would require retrieving and accumulating statistical data from different repositories. Are these only eSciDoc repositories or different types of repositories? Can they be accessed by only one application (e.g. PubMan) or by different ones? --Haarlaender 08:58, 27 October 2008 (UTC)

This scenario is related to the standardized exchange of statistical data, e.g. by using SUSHI. Addressing this requirement is scheduled for later releases; it is mentioned here so that it does not get lost. In addition, we have to exchange knowledge with Margit Palzenberger, who is quite deep into the provision of statistical data by publishers. --Ulla 17:10, 27 October 2008 (UTC)

  • Citation metrics/Research evaluation
  • Private statistics
    • Access to some statistical information might be restricted to the author himself or administrative staff.