Trip Report NEERI 2010
- Conference full title: Networking Event for the European Research Infrastructures (NEERI)
- Date/Place: 21 October 2010, Vienna
NEERI2010 is the second networking event of its kind, a follow-up to NEERI2009 held in Helsinki. The goal of NEERI2010 is to exchange ideas on a number of topics relevant to research infrastructures and to establish common ground on the further development and application of these topics. NEERI focuses on what we share and what we can learn from each other. Examples of such commonalities are architectural issues, communication with users, and integration of services and tools.
The event started with a very interesting keynote from Laurent Romary, who presented the Report of the High-Level Expert Group on Scientific Data, "Riding the Wave: How Europe can gain from the rising tide of scientific data" (October 2010), and its impact on the humanities and social sciences (slides: File:Laurent Romary Plenary key speaker-HLG-SDI Report ppt - NEERI.pptx).
The Expert Group considers it of utmost importance for research infrastructures to collaborate on all important organizational and technical aspects, working towards a vision of a "scientific e-Infrastructure that supports seamless access, use, re-use and trust of data. In a sense, the physical and technical infrastructure becomes invisible and the data themselves become the infrastructure – a valuable asset, on which science, technology, the economy and society can advance."
This would also enable collaboration among researchers, increase the productivity of research, and allow for sharing, use and re-use of data, while preserving data authenticity, integrity and trustworthiness. Establishing proper data management and data integration infrastructures is both a challenge and an opportunity, given the scale, complexity and diversity of the data and their accelerating growth. The beneficiaries of such established, living and collaborating research infrastructures would be not only researchers, but also the general public, funders and policy makers, as well as enterprises and industry. Therefore, the EU and national agencies must define clear strategies and ensure sufficient resources for their implementation.
The Expert Group developed an initial wish list (adapted from the PARADE White Paper) of minimum requirements that such an infrastructure has to fulfill: long-term preservation (bitstream, format migration); persistent identification; standardization of metadata, with interoperability at both the format and the semantic level; proper implementation of access rights; enabling large groups of researchers to operate on the data; and regular quality assessment and metrics on data usage, availability and reliability, to feed back into further improvement of the infrastructure.
Additionally, the group presented a list of possible actions to overcome impediments such as financing, trustworthiness, data expertise, usage and complexity of the infrastructures, lack of published data, and unwillingness to cooperate across disciplines.
The general message: close collaboration between researchers, funders, implementors of research infrastructures and industry has to be established from the very beginning. Only then can the challenges be addressed successfully.
Connecting the European Grid Infrastructure (EGI) to Research Communities
- Steve Brewer, Chief Community Officer
Steve Brewer gave an overview of the development of EGI and of how EGI works towards its goal of increasing the number of scientists and research groups that actively use and benefit from it. Communities have to be actively supported on the technical side by enabling innovation in technologies (Grids, Clouds, virtualization), innovation in software (providing a reliable and persistent platform) and support for international research (e.g. ESFRI). Additionally, human networks have to be developed and cultivated, since in the end it is humans who use such infrastructures. This is done through both general services (training events, material, helpdesks, user and technology meetings) and discipline-specific services (e.g. Bio Apps). Continuous definition and verification of user requirements has to be established. These activities are grouped into Virtual Organizations, and Virtual Research Communities are set up to help address these issues better.
This is certainly experience that can be re-used in research infrastructures oriented towards the humanities and social sciences as well.
Grids, Clouds and Research Infrastructure
- Peter Wittenburg, Max Planck Institute for Psycholinguistics, Head of the Language Archive
Peter Wittenburg gave a very interesting definition of the terms grid and cloud and discussed how a humanities researcher would benefit from them, bearing in mind the nature of research in the humanities: it is highly unpredictable and usually consists of small, focused projects with scattered and diverse data. The data has to be sustainable, since it is, for example, about human history. Usually a lot of technical services are needed, and in general they have to be inexpensive. However, there are still issues with the willingness of researchers to share their data, most probably due to the amount of non-automated, intellectual work invested in the data; researchers are sensitive to ownership. In general, computing over structured data is not the issue here; the question is rather how to quickly enable tools that can "simulate human mind".
DARIAH and CLARIN are large European e-Infrastructure projects that try to address some of these issues: to enable reliable, sustainable and trustworthy data storage that ensures data integrity, authenticity, visibility, accessibility, interpretability, etc. Such a data infrastructure must implement appropriate mechanisms for authentication and authorization, but also offer services and tools that work across scattered data resources.
How did Grid/Cloud related projects contribute to the Humanities and Social Sciences (SSH)? SSH are mostly considered out of scope for the Grids, despite projects such as TextGrid (Germany), and where Grids are used, it is mainly for data storage. SSH researchers are not even aware of whether they are using cloud services or not (this may also be considered a positive outcome).
Whether SSH can benefit from Grids/Clouds is still not completely clear. There are many issues concerning financing, data ownership and long-term data accessibility, especially when it comes to the use of Cloud-based services: "is AMAZOOGLE for data what Elseviers are for publications?" Even though Cloud-based services do not have to be commercial, there is a long way to go before research Clouds unburdened by commercial use become available. Standardization is certainly an issue (work started at DMTF, see
In the end, it is all about the services that are offered to the researchers. Still, the question is whether SSH-related e-Infrastructure projects can make optimal use of all the knowledge and experience from decades of Grid development.