PARSE.Insight Workpackages


=WP1 Project Management=

Objectives

 * To provide overall project management and co-ordination.
 * To ensure quality management and assurance.
 * To maintain the information flow between partners.
 * To provide administrative and financial control according to work plan.
 * To co-ordinate the dissemination and promotion activities and to represent the project to the European Commission.

Description of Work
(possibly broken down into tasks) and role of partners

Task 1.1: Administrative and Financial co-ordination (Task leader STFC)
Establish and maintain financial records. Co-ordinate the submission of cost statements by all project partners, follow up EC payments, and distribute partner shares according to the rules agreed in the Consortium Agreement. Maintain contractual documents (Consortium Agreement). Co-ordinate internal and contractual periodic reporting. Organise and manage audits requested by the Commission. Of particular importance are the mandatory mid-term and end-of-term reviews, and technical or financial audits upon request by the Commission. Organise periodic project meetings.

Task 1.2: Scientific and technical co-ordination (Task leader STFC)
Co-ordinate the technical activities; monitor the work being carried out, the results and any necessary changes to the work plan. Organise quality control; establish and benchmark project milestones. Co-ordinate the timely production of deliverables, the organisation of reviews, and the control of quality and consistency against technical and contractual requirements.

Task 1.3: Project management (Task leader STFC)
Organise project launch: establish procedures, project management methods and tools (SA Management organisation and processes, conflict resolution); prepare and organise the project kick-off meeting. Handle IPR aspects and confidentiality. Co-ordinate and control each partner’s results in order to allow for effective and efficient internal information delivery (e.g. deliverables). Maintain and monitor the work plan, monitor project progress, identify and troubleshoot technical and organisational problems, and hold technical co-ordination meetings.

Task 1.4: Web site, project management support (Task leader STFC)
Produce and maintain the PARSE.Insight web site

Deliverables
(brief description) and month of delivery

none

=WP2 Development of Roadmap=

Objectives
The objectives of this work package are first to produce a draft roadmap which will guide the WP3 surveys, and second to produce a refined roadmap for the development of the parts of the e-infrastructure needed to assure the continued usability and accessibility of scientific information.

Description of work
(possibly broken down into tasks) and role of partners

Task 2.1: Survey of existing Roadmaps (Task leader FUH)
Review Roadmaps produced by such international bodies as ESFRI and e-IRG, as well as national plans across Europe and the USA (and elsewhere). Valuable ideas can also be drawn from discipline-specific experiences of partners within the PARSE.Insight consortium, such as STFC, ESA and CERN, considering roadmaps for preservation and concrete possibilities of curating and re-using data. Besides these examples of “hard” sciences we will include many other research areas, for example Fusion and Nuclear Waste Management. Roadmaps from “life” sciences, “social” sciences, government and industry will also be sought.

Earth science and environmental disciplines, for example, face the difficulty that they involve very different data types, quality levels and alternative sources. On the other hand, these data sources are immediately comprehensible to the general public, so the handling of the information may need specific care.

This task is broken into a number of phases of work to allow us to bring in additional relevant materials.

Task 2.2: Initial synthesis of Roadmaps (Task leader STFC)
Produce a synthesis of the Roadmaps, guided by a technology-push/requirements-pull approach. An initial draft will be produced early in the project in order to provide some additional structure to the survey to be carried out in WP3.

Task 2.3: Revised Roadmap (Task leader ESA)
The Roadmap will be revised based on consultations with key representatives of Roadmap producers, nationally and internationally, and using the feedback from the WP3 surveys and position statements. The workshops and other dissemination activities will also act as a conduit for discussion and comments.

This task is broken into a number of phases of work, as indicated on the GANTT chart, so that information gathered from other work packages can be factored in to refine the Roadmap.

Deliverables
(brief description) and month of delivery

D2.1 Initial draft Roadmap (Month 3)

D2.2: Revised Strategic Roadmap (Month 24)

=WP3 Community Insight (includes Forum)=

Objectives
The main objective of WP3 is to identify who is doing what in the preservation of digital scientific information, and why they are doing it that way.

Research produces data and publications. Data are preserved by research institutes or data archives, publications by institutions (institutional repositories, IR) and libraries. This traditional division of tasks is currently changing with the emergence of Open Access publishing and a move towards a one-stop repository covering both data and publications. The actors that preserve still do not work together, however, because they do not know each other and are not aware of the similarities and differences in their work. The objectives of WP3 are to identify:


 * who is doing what in specific research communities
 * who is held responsible?
 * what is the current state of preservation policies being implemented?
 * what are the main differences and similarities in approaches and how are these defined?
 * what are the incentives or obstacles for setting up a durable e-infrastructure, both in terms of organisation/funding and in terms of technical issues?
 * what is the impact of changes in publishing models and emerging interactivity between publications and the underlying data?

The methodology will be to:
 * perform a general scan of actors involved.
 * design the surveys to elicit the capabilities and practices on e-Infrastructures and preservation used or planned by specific communities
 * capture survey results from target communities using the survey platform and, where appropriate, additional interviews.
 * perform in-depth case studies on two specific communities
 * analyse the survey results and produce an inventory of current and planned research and development relating to e-Infrastructures and permanent access at a national, European and global level.

This methodology will be described in D3.2.

Description of work
(possibly broken down into tasks) and role of partners

Task 3.1 Design and Implementation of a platform to support surveys and forum (Task leader KB)
To achieve the objectives of this work package it is necessary to implement a technical platform to support the surveys and the forum. These platforms will, if possible, be based on existing solutions.

Task 3.2: Identification of survey targets (Task leader MPG)
To identify key targets we will draw up a matrix of groups, the first cut of which is as follows:

During this workpackage key players in these areas will be identified based on a scan of the current state of the art along a number of axes of technological R&D that are important to digital preservation environments. This scan will be based on a literature review and will include:
 * E-Science and supporting infrastructure
 * Digital Preservation Technologies
 * Archiving Technologies
 * Digital Library Technologies
 * Distributed Repository Technologies
 * Content and Knowledge Technologies

The identification work will use a combination of analytical techniques (e.g. citation analysis, clustering etc.) and personal expert networks and contacts to identify the most productive and important research and user groups within Europe. The aim is to identify core teams and communities of technology developers and users that are representative for showcasing the state in the different technology areas.

Task 3.3: Design, publish and process surveys with specific targeting (Task leader KB)
The survey questions will focus on requirements for preservation and permanent access and on the current status of facilities and preservation approaches. The draft Roadmap will provide a framework to ensure that the survey results can be inter-compared and that the Gap Analysis report can be produced in a consistent way. The surveys will be designed in different versions to address specific issues with specific communities, using a vocabulary that is familiar to them. The evaluation of the survey results will be supported by bibliographic analysis and visualisation in topic maps, an advanced method in the field of technology monitoring.

Task 3.4: Identify and perform case studies (Task leader CERN)
Simultaneously with the processing of the surveys, two case studies (digital objects of the High Energy Physics and Space Related communities) will be investigated further. Field studies and face-to-face interviews will be conducted to learn more about the specifics of the approach to preserving all sorts of digital objects (legal, administrative and technical documents, scientific publications, data and derived information, …) in the two research communities. This will include an exhaustive inside-out analysis (and risk analysis) of the strategy (or lack thereof) for the archival and re-use of the data in the specific research communities. The focus is on what defines success or failure, what risks and challenges can be identified, and what incentives or obstacles exist for setting up a durable infrastructure for both data and publications.

Task 3.5: Produce Insight report from survey results and case studies (Task leader KB)
The Insight Report is one of the key deliverables of this project. It will be developed from the results of the surveys and case studies and will be accompanied by the database of key players, topics and activities.

The structure of the survey should ensure that the skeleton for the report can be derived relatively easily; however a great deal of care will be needed to ensure consistency of terminology.

Deliverables
(brief description) and month of delivery

D3.1 Survey and forum platforms (m3)

''Interactive platform with which the surveys can be delivered to the target audiences and the results captured. This platform will, if possible, be based on an existing solution.''

D3.2 Inventory of communities and actors in the e-infrastructure (research institutes, data archives, libraries) (m4)

''This will include an interactive map of key players and respective topics and a searchable database of R&D activity in Digital Preservation related technologies. It will also include a description of the methodology which has been adopted for the survey and case studies, which, together with the guidance provided by the draft Roadmap, will help to ensure a consistent approach is applied to all those surveyed, and studied in depth in the case studies.''

D3.4 Case study report (m12)

D3.5 Survey report (m13)

D3.6 Insight report (m18)

=WP4 Gap Analysis=

Objectives
The main objective of this work package is to identify gaps in the European e-infrastructure, including enabling technologies and corresponding interoperability models. Based on the findings of the draft roadmap and survey results, the discrepancy between the requirements from case studies and future scenarios (WP3) and the developing European research infrastructure is assessed in a systematic way. In this sense, the gap analysis determines “the space between where we are and where we want to be”, and serves as a means to bridge that space.

Expected results from this work package include:
 * specification of the PARSE Gap Analysis Schema for e-infrastructures
 * PARSE Gap Analysis based on the roadmap and case studies/future requirements
 * consultation on and community validation of PARSE Gap Analysis

The methodology will be to:
 * elicit dimensions and attributes for future scenarios
 * specify a stepwise procedure for conducting gap analyses
 * provide IT support
 * apply the PARSE Gap Analysis Schema to selected case studies and scenarios
 * validate the results through expert interviews and online surveys

Description of work
(possibly broken down into tasks) and role of partners

Task 4.1: Specification of the gap analysis framework (Task leader STFC)
In this task, relevant dimensions and corresponding attributes for future e-infrastructures will be elicited (in cooperation with WP3) and structured into a formal schema. A stepwise procedure will be developed for conducting the gap analyses, providing methods and metrics for enabling the identification of gaps within the European research infrastructure.

The gap analysis schema will be scenario-based, i.e. it will contain generic components which are customized by domain- and community-specific profiles.
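As an illustration only, the idea of generic components customised by community-specific profiles could be sketched as follows. The schema itself is still to be specified in D4.1, so every name here (`Dimension`, `Profile`, `gap_analysis`), the 0 to 5 scale and the example values are hypothetical assumptions, not project results.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Dimension:
    """A generic schema component: one attribute of an e-infrastructure."""
    name: str
    required_level: int  # hypothetical 0-5 scale for "where we want to be"


@dataclass
class Profile:
    """A community-specific profile that overrides generic requirements."""
    community: str
    overrides: Dict[str, int] = field(default_factory=dict)

    def required(self, dim: Dimension) -> int:
        # Fall back to the generic requirement when the profile has no override.
        return self.overrides.get(dim.name, dim.required_level)


def gap_analysis(schema: List[Dimension], profile: Profile,
                 current: Dict[str, int]) -> Dict[str, int]:
    """Return the shortfall per dimension, keeping only positive gaps."""
    gaps = {}
    for dim in schema:
        shortfall = profile.required(dim) - current.get(dim.name, 0)
        if shortfall > 0:
            gaps[dim.name] = shortfall
    return gaps


# Illustrative values only; not taken from any survey or case study.
GENERIC_SCHEMA = [
    Dimension("storage", 3),
    Dimension("metadata", 3),
    Dimension("certification", 2),
]
hep_profile = Profile("high-energy physics", overrides={"storage": 5})
current_state = {"storage": 3, "metadata": 3, "certification": 1}

print(gap_analysis(GENERIC_SCHEMA, hep_profile, current_state))
```

Here the profile raises the storage requirement above the generic level, so a gap appears for that community that the generic schema alone would not report.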

Task 4.2: Provision of appropriate technology support for gap analysis (Task leader FUH)
The goal of this task is to assess the need for IT support for the modelling and data management taking place in other tasks in this work package, based on FUH's prior experience in business intelligence and related disciplines. Resources have been allocated to configuring and customising appropriate tools as well as providing support to other participants of WP4.

Task 4.3: Application of gap analysis (Task leader FUH)
In this task the gap analysis framework will be applied to perform thorough analyses of selected case studies and future scenarios (defined as part of WP3). A comprehensive Gap Analysis Report (D4.2) will be prepared as the major outcome, documenting the identified gaps and (intellectually) relating them to the concepts and objectives described in the roadmap and insight report.

Task 4.4: Consultation on gap analysis (Task leader CERN)
Consult with relevant actors on the Gap Analysis Report (D4.2) and document the corresponding findings and interpretations as a (revised, extended) addendum to D4.2. Consultations will be aligned with WP3 and conducted in various modes (interviews with leading experts, workshops, online surveys).

Deliverables
(brief description) and month of delivery

D4.1 Specification of gap analysis schema and tool support (m10)

Report on the development of the gap analysis schema and corresponding tool support.

D4.2 Gap Analysis Report (m17)

Assessment of discrepancies between requirements from case studies and future scenarios.

D4.3 Gap Analysis final report (m24)

''Includes Addendum to D4.2: Community Feedback on Gap Analyses. Extended interpretation of Gap Analyses based on consultation results.''

=WP5 Impact Analysis=

Objectives
The main objective of this work package is the creation of a framework for impact analysis to foster better justified investment decisions. These decisions will enable the appropriate development of sustainable e-repositories for scientific records.

The framework will be based on a set of metrics and models, with an underlying database and a support tool. There will also be an intellectual evaluation of the framework. The framework will facilitate comparison of different scenarios for closing the gaps that are identified in WP4. Furthermore it will feed back into the refinement of the roadmap (WP2).

Description of work
(possibly broken down into tasks) and role of partners

Task 5.1: Specification of Impact Metrics (Task leader STFC)
Define a number of metrics by which the impact of particular investment decisions can be judged, and models linking underlying drivers and decisions on actions to these metrics. The models will take account of uncertainty and context and will be at an appropriate level of complexity for the purpose.

Task 5.2: Consultation on Impact Metrics (Task leader KB)
Consultation on the proposed impact metrics by separate surveys of funders, researchers and commercial organisations. A number of “what if?” scenarios will be run to evaluate the impact of changing metrics and weighting.

This will allow us to produce a number of policy recommendations, based on the Roadmap, Impact Analysis and Sustainability workpackages.
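A “what if?” run of the kind described above can be pictured as re-scoring the same options under different metric weightings. The sketch below is a minimal illustration assuming a simple weighted average; the actual metrics and models are to be defined in Task 5.1, and the metric names, option names, weights and values here are all hypothetical.

```python
from typing import Dict, List


def impact_score(metrics: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of metric values (each assumed to lie in 0..1)."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[name] * metrics[name] for name in weights) / total_weight


def rank_options(options: Dict[str, Dict[str, float]],
                 weights: Dict[str, float]) -> List[str]:
    """Order option names from highest to lowest impact score."""
    return sorted(options,
                  key=lambda name: impact_score(options[name], weights),
                  reverse=True)


# Hypothetical investment options scored on hypothetical metrics.
OPTIONS = {
    "shared_repository": {"reuse": 0.75, "cost_saving": 0.25, "longevity": 0.5},
    "format_migration": {"reuse": 0.25, "cost_saving": 0.5, "longevity": 1.0},
}

# Hypothetical weightings, one per consulted group.
SCENARIOS = {
    "funders": {"reuse": 1, "cost_saving": 2, "longevity": 1},
    "researchers": {"reuse": 3, "cost_saving": 0, "longevity": 1},
}

for group, weights in SCENARIOS.items():
    print(group, rank_options(OPTIONS, weights))
```

With these made-up numbers the preferred option differs between the two weightings, which is the kind of sensitivity the consultation is meant to expose.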

Task 5.3: Impact Analysis Tool Specification (Task leader FUH)
The goal of this task is to provide the specification of the functional (and non-functional) requirements for the impact analysis tool. The impact analysis tool is intended to facilitate the estimation of the comparative value of closing identified gaps. It will implement the impact metrics and models from Task 5.1 to simulate the effects (potential impact) of particular decisions on the data models provided by WP4, i.e. to test hypotheses in various contexts regarding investments in specific areas of the e-infrastructure landscape.

Task 5.4: Impact Analysis tool (Task leader FUH)
Based on the specification of Task 5.3, the impact analysis tool will be developed to operate on the data provided by gap analyses (WP4), survey and case studies (WP3). It will draw on existing components and techniques from Business Intelligence, and possibly from Information Visualization as well, for handling various types of operations (“testing hypotheses”) and their simulated effects (“impact”) on multi-dimensional, multi-faceted data. The development will be iterative: first a prototype will be implemented, validated and evaluated; finally a robust version will be provided for reuse.

Task 5.5: Application of PARSE impact analysis (Task leader UGOE)
Supported by the Impact Analysis tool, analyses of selected case studies and future scenarios will be performed in this task. A comprehensive Impact Analysis Report (D5.3) will be prepared as the major outcome, documenting the identified impacts and (intellectually) relating them to the concepts and objectives described in the roadmap, the insight report and the gap analyses.

Deliverables
(brief description) and month of delivery

D5.1 Impact Analysis tool specification (Month 19)

Analysis of the likely impact of such developments, and specification and prototype tool to estimate impact of proposed projects on a European and World stage.

D5.2 Impact Analysis tool plus supporting database (Month 24)

A prototype will be available at month 20.

D5.3 Impact Analysis Report (m24)

=WP6 Sustainability and Evaluation=

Objectives
The objectives of this Workpackage are to:
 * address issues of sustainability of data resources
 * bring together international approaches for the evaluation and certification of long term preservation repositories in order to promote practical experiences with the evaluation and certification
 * work with the global community on developing a standard on which a data repository certification process can be based.

Description of work
(possibly broken down into tasks) and role of partners

Task 6.1: Build on ongoing certification work (Task leader STFC)
Build on the ongoing work of the ISO-BOF and the documents by the RLG/NARA working group, Nestor, CASPAR, DPE, and DCC, with requirements to fit the needs of the global community, including the results from WP3 on the requirements of the communities for certification and accreditation of trusted repositories.

The task will include the organisation of an international expert workshop on repository audit and certification, involving key players from the EU and USA plus others, including experts from different backgrounds who have experience in evaluation and certification (e.g. with ISO 9000, MoReq2 etc.) and with audit and evaluation tools such as DRAMBORA, PLANETS, CASPAR Testbeds etc.

Task 6.2: Sustainability (Task leader ESA)
This task will further examine the results from the survey (WP3) with respect to the sustainability of digital resources, focusing specifically on the funding horizon of targeted institutions and on the technical, sociological and legal constraints on pooling resources to share the burden of sustaining these resources and associated services.

Closely coupled to this will be the development of the mechanisms which underpin trust in the longevity of such resources and services, namely certification, which is addressed in Tasks 6.1 and 6.3.

Task 6.3: Evaluation process (Task leader STFC)
Contribute to the certification standard based on ongoing work supplemented by the workshop results. Begin the process of submitting this into the ISO standardisation process.

Deliverables
(brief description) and month of delivery

D6.2 Workshop Report (Month 12)

D6.1: Sustainability report (m14)

A report which defines the constraints on and lays out a plan for sustainability of digital resources.

D6.3 Certification Report (Month 24)

A summary of the status of the work including the draft standard which has been produced and a description of the next stages, integrated with the Roadmap from WP2.

=WP7 Dissemination of Results=

Objectives
Disseminate results of the project through a series of intermediate workshops.

In summary, to raise awareness of digital preservation issues and to disseminate data and knowledge about the Project, i.e. its activities and results, among the governmental and scientific communities concerned with digital preservation within the ERA context.

Furthermore, the work package will promote dialogue among researchers and users to identify opportunities and common areas of interest; build up cooperation with industry on agreed “hot” topics; encourage cross-fertilisation of knowledge; and, ultimately, create and use synergies.

Description of work
(possibly broken down into tasks) and role of partners

The dissemination activities can be categorised as:

PARSE.Insight workshops
These three workshops, organised by CERN, KB and FUH, will expose the work of the project to targeted audiences at critical stages of the project. The feedback from each of these workshops will help to advance the next phase of the project.

External conferences
We will target conferences organised by other organisations, covering a spread of disciplines and stakeholders.

Deliverables
(brief description) and month of delivery

D7.1 Dissemination Plan (m4)

The plan will form the basis of the activities in the second half of the project, to disseminate results and to further the international discussions of permanent access.

D7.2 Roadmap workshop (m10)

D7.3 Insight workshop (m15)

D7.4 Gap workshop (m18)

=Milestones=