Difference between revisions of "AWOB"

From MPDLMediaWiki
(→‎Components of the collaborative work: Added questions to get clarified.)
 
(115 intermediate revisions by 7 users not shown)
<accesscontrol>eSciDoc,, GAVO</accesscontrol>
{{AWOB_NO_TOC_Public}}
Preparation and planning for shared MPDL project "Scholarly Workbench for Astronomy"


Based on experiences and outcomes of [http://www.g-vo.org/www/ German Virtual Observatory (GAVO)]
[[Category:AWOB]]


Contacts@[http://www.mpe-garching.mpg.de/ MPE]:
== Initiators, Coordinators, the Team, and Partners ==
Jaiwon Kim, Gerard Lemson, Wolfgang Voges


*Project initiators: Dr. Jai Won Kim, Dr. Gerard Lemson, Dr. Wolfgang Voges, Nataša Bulatović, Ulla Tschida, Malte Dreyer
*Project coordinators (alphabetical order): Dr. Jaiwon Kim, Dr. Wolfgang Voges, Dr. Andreas Vogler
*[[AWOB_Contacts_Public|Team members]]
*Partner
** [http://www.mpe.mpg.de/ Max Planck Institute for Extraterrestrial Physics (MPE)]
** [http://www.mpa-garching.mpg.de/ Max Planck Institute for Astrophysics (MPA)]
** In a later phase, all MPIs associated with astrophysics.


== AWOB in Few Words ==


=Scenarios=
AWOB, the Astronomer's Workbench, is a web based collaboration-data evaluation-publication-platform
==Collaborative environment==
which helps scientific working groups of any size to enhance the communication and to share resources,
Enable easy, wiki-like setup of collaborative environment for shared projects. Allow registered users to access the project, the related pages and linked and/or uploaded data. Link the collaborative platform with eSciDoc repository to allow long-term archiving and PIDs for content stored.
data, results, publication texts etc. throughout the whole scientific life cycle.  


Example of a shared project workflow (see details in [https://zim01.gwdg.de/repos/smc/trunk/02_Related_Material/02_Projects/GAVO/mpdl20080912.ppt slides (restricted)]):
Resulting e-publications and publications to the Virtual Observatory allow long term archiving of data,
*definition of shared project and objectives
the annotation of metadata as well as easy access of digital outcomes by other users.
*definition of required experiments
*distribution of responsibilities
*tracking of activities and results: set up experiments, run experiments, produce data, postprocess data, analyse data, extract scientific results
*share data, combine results
*produce  publication-ready paper (shared authoring)


The AWOB project has a duration of 3 years and is based on experiences made by the German
Astrophysical Virtual Observatory ([http://www.g-vo.org GAVO]).


===Components of the collaborative work===
== AWOB and the Scientific Life Cycle ==
*Collaborative work is "publication-in-progress" developed in a Wiki environment as output of the research process.
*Collaborative work comprises
**textual components - mostly metadata describing the textual part of a collaborative work, such as abstract, title, authors, affiliations, keywords, annotations of sources, and references, and its structural information, such as subject headings, body sections etc.
**non-textual components - mostly representative data sets and illustrations that support the scientific results and conclusions and are presented in tables and figures. Examples of figures are images, plots, and diagrams.
**integrated external tools - interactive tools for visualizing non-textual components and manipulating underlying data of these components, querying of remote archives etc.


====Textual components====
[[File:LRAWOB.jpg|800px|alt=Scientific Life Cycle]] <br><br>
Textual components make it possible to:
Typical scientific lifecycle in astrophysics. The whole lifecycle is supported by AWOB. Copyright for the multi-wavelength image of the Milky Way illustrating "correlation with databases": NASA (National Aeronautics and Space Administration),
*link to references to preprints, published papers etc. (ADS, arXiv)
Goddard Space Flight Center. Within this reference you can find the references to the individual observations. Copyright for the M 104 mid infrared image illustrating "data visualization": NASA/JPL-Caltech/R. Kennicutt (University of Arizona), and the SINGS Team
*look up annotated sources, i.e. astronomical objects, in databases such as Simbad and NED
*describe the collaborative work with metadata which could be used for query such as: authors, title, abstract, keywords
*describe the structure of collaborative work
[Questions (JK): 1. Is the body of each section also
a textual component? 2. Is an equation a textual component?
3. Figure/table caption: could it be metadata of a non-textual component?]


====Non-textual components====
== Public Intro to AWOB as a PDF File ==
Non-textual components make it possible to:
*visually represent research data and illustrations such as the experimental set-up. These components could be represented in tables and in figures, including images, plots, and diagrams, and may have their own metadata (e.g. image metadata)
[Question (JK): Could you give some specific examples of image metadata? Depending on the context, the types of metadata could be quite different.
For example, is it the size and file type of an image, or more science-related metadata, e.g. observation date, location, etc.?]
*show metadata for, e.g., an image
[JK: Please see above]
*invoke external data collection viewer
[Question (JK): Does this mean invoking an external tool to view the underlying data?]
*download data related to the component
*open external tool for visualizing and working with the data
**for tabular data: e.g. TOPCAT
**for image data: e.g. Aladin
**for spectral data: e.g. SpecView, Splat, VOSpec
**[http://www.ivoa.net/Documents/Notes/Plastic/PlasticDesktopInterop-20060601.html PLASTIC enabled]


====Integrated external tools====
[[{{ns:media}}:2011-06-09.pdf|Download a PDF File]] with a generally understandable public intro to the Astronomer's Workbench. The slides contain supplementary information to this page and some images.
Integrated external tools make it possible to link from either textual or non-textual components to existing external astronomical services or tools such as:
*astronomical services - directly linked such as ADS, arXiv, NED, Simbad, VizieR, SkyServer or to enable discovery in the registries of astronomical sources
*common analysis environments (IDL, ...)
*services for retrieval of image data ( [http://www.ivoa.net/Documents/latest/SIA.html Simple Image Access specification SIA] )
*services for retrieval of spectra ([http://www.ivoa.net/Documents/latest/SSA.html Simple Spectral Access Protocol SSA])
*services for retrieval of records from catalogs ([http://www.ivoa.net/Documents/latest/ConeSearch.html Simple Cone Search SCS])
*simulation database SimDB ([http://www.ivoa.net/cgi-bin/twiki/bin/view/IVOA/IVOATheorySimulationDatamodel Simulation Data model and Simulation data Access Protocol SimDAP])
*invoke queries on external services that support [http://www.ivoa.net/cgi-bin/twiki/bin/view/IVOA/IvoaVOQL Query languages ] such as ADQL/TAP


==Sharing of content==
== Situation and Challenge in Today's Astronomy ==
Enable privileged users to upload and/or link data and to describe it with metadata, comments and notes.
*Standardised data: FITS, VOTable, Spectra, SQL query results
*Custom data (''more input needed'')


===Types of data===
Astronomical research is increasingly organised in projects that can include hundreds of collaborators, often spread out over the globe. In collaborations of any size, the task of sharing resources such as data products, images, and texts for publications can be very cumbersome. The raw data of many observatories have to be processed carefully to obtain usable scientific results such as calibrated images. Since the programs of such processing pipelines normally remain unpublished, it is difficult for users not familiar with a certain instrument to mine the data.
*images (radio, optical, x-ray)
*images (i.e. simulation)
*spectra
*source catalog
*plot (i.e. spectrum)
*diagram
*flow chart
*illustration
*table (i.e. source catalog)
*publication (textual components)
===Metadata to be supported===
*Bibliographic metadata
**title, author, abstract, subject heading, journal metadata
*Structural metadata/elements
**section/TOC, annotation, footnote, equation, caption, references
*Other
**provenance (input files, make files, plotting scripts, analysis code, simulation code, ...)
**log files
**curation (''more input needed'')
**PIDs (ADS, IVOA)
**IVOA standards (VOTable, UCD, UTYPE, Data models, data access protocol, ...)


== Example of a Typical Astrophysical Quest: A Multi Wavelength Study of NGC 3079 ==


Multi-wavelength studies of objects allow a holistic view. Often, objects are observed by different groups at different wavelengths. The following images, for example, combine optical, radio, and X-ray data of the edge-on spiral galaxy NGC 3079. The data reveal hot gas in the halo of the galaxy. The gas is thrown out of the disk into the halo by a super bubble of ionised gas. The images shown are based on the following data: the images with contour plots, images 1 and 2, are based on optical data (Digital Sky Survey, DSS), X-ray data (Pietsch et al. 1997), and radio and H-α data (both Veilleux et al. 1994). The false colour image shown in image 3 is based on Hubble Space Telescope and Chandra (X-ray) observations. The copyright belongs to NASA/CXC/STScI/U.North Carolina/G.Cecil.


==Shared Authoring==
E-publications, as supported by AWOB, make calibrated images available to all scientists and thereby facilitate multi-wavelength studies.
Author tools are provided to enable shared and standardised authoring. Authors are supported in developing publication-ready papers.
*Provision of text editor (emacs? TeX IDEs?)
*Import LaTeX article and conversion to html (incl. figures, tables)
*Templates for publication-ready papers (metadata attachments, links, figures, captions)
*allow publication-ready figures from visualisation tools


=MPDL project - draft=
[[File:NGC3079.jpg|450px|alt=NGC3079]] [[File:NGC3079_b.jpg|165px|alt=NGC3079 HST + X-rays]]
==Summary==
*online publications linked to/from online published data sets
*networking through standardisation
*collaboration enabled
*focusing on scientific practice (collaboration, publication), by re-using existing data centers and resource registries, existing standards,  and adding scientific "workbench-environment"
**no interruption of daily practice
**facilitate publishing of data
*online environment should support
**collaborative authoring for publications in virtual organisations
**explicit integration of data sets used for/in the final publication(s) by either uploading original data or linking to external data sets
**annotation of resources with metadata and identifier according to IVOA standard
**value-added services on known data types (search, mining, visualization, analysis)
**interfaces to external archives/registries/catalogs via standard protocols
**integration of client tools (needed and known in community)
**long-term preservation of resources (publications, data, services)
**registration of resources in IVOA standard registries


==Background==
== The Idea of an Astronomer's Workbench ==
*Results of German Astrophysical Virtual Observatory (GAVO)
**make results (data sets and services) of astronomical research easily available to community
**facilitate standardised publication of results (PIDs, Virtual Observatory standards, long-term archiving)
**focus on interoperability to enable networking (standards in use: [http://www.ivoa.net/Documents/latest/UCD.html IVOA]) and automated discovery and re-use
**make use of standards-aware client tools and services (for cross-matching, visualisation, combination, data mining etc.)
*Re-use of data leads to more references and scientific improvements => proof of concept [http://www.mpa-garching.mpg.de/millenium Millennium Run]
**community-based quality control: errors discovered by others have improved data quality
**still, as no formal revisions were made, old/original data was lost
*currently, no (or limited) possibilities to add original data to publication, only representations/shortened examples:
**e.g. only image representations of multi-dimensional data
**e.g. only representative samples of large collections (images, spectra, source catalogs)
**e.g. only static data


*enable the shift from large data centers/resource registries (based on IVOA, formal, machine-readable, homogeneous) to scientific practice, i.e. collaboration and publications (informal, human-readable, heterogeneous)
The project Astronomer’s Workbench (AWOB) aims to assist astronomers in these tasks by building a web based platform  that will enhance the communication between the scientists and support the centralised organisation and collective usage of resources, ideas, data, results, and documentation.  
Use of AWOB should significantly improve coordination of these collaborative projects from their initial formulation to their completion as a publication. An additional focus of AWOB is to facilitate the publication of the project data to the astronomy community, both using Virtual Observatory (VO) standards, and as enhanced e-Publications.


==Needs==
== Functional Components ==
==Preparatory and related work==
*GAVO
**stable storage and curation of data products needed
**stable environment for deploying Virtual Observatory protocols and other value-added services
*IVOA standard
*AstroGrid (?)


==Wider context/Re-use for others==
To support this idea, the AWOB platform will contain the following main functional components:
*Long-term storage of data sets used in a publication (cf. deposit mandate?)
*Open access to all results of scientific research online (cf. Berlin declaration?)
*Showcase for added value of implemented standards
**mandated by some funding agencies
**IVOA dataset identifier in use by [http://adswww.harvard.edu/ ADS] (main portal for astronomers)
*Integration of standards, stable infrastructure and web2.0 technologies to facilitate dynamic and collaborative environments (cf. eSciDoc?)
*Re-use for astronomy community within MPS
**MPI Astronomie (Heidelberg)
**MPI Astrophysik (Garching)
**MPI extraterrestrische Physik (Garching)
**MPI Gravitationsphysik (Golm)
**MPI Kernphysik (Heidelberg)
**MPI Physik (München)
**MPI Radioastronomie (Bonn)
**MPI Sonnensystemforschung (Katlenburg-Lindau)


==Work description==
* Explicit project and user management modules for organizing and coordinating a project and associated collaborators.
===Pilot phase===
* Role-based authentication and authorisation modules for restricting access to the project resources to selected users.
*Set-up community platform for creation of shared projects, registration of users, assign privileges
* Centralised storage and management of the shared data and other resources for a given project.
**Analysis of existing community-based platforms for linking community-environment to eSciDoc repository
* Value-adding tools for resource management tasks such as search, visualisation and analysis.
***[http://www.xwiki.org/xwiki/bin/view/Main/WebHome Xwiki]
* A web portal providing easy access to external web services and public archives provided by the astronomy community, where possible using Virtual Observatory standards.
***[http://dc2008.de/wp-content/uploads/2008/09/1-dc_08_bartolo_lowe_tandy-final.pdf Wiki2Fedora used for MatDL/NSDL]
* Metadata extraction and annotation for publication of finalised data products according to Virtual Observatory standards.
***other wiki software to be considered?
* One-click e-publication of data and the relevant artefacts generated during the project.
**Basic user management
***Author access - authors of the project
***Administrator access - project coordinator
***Public access - public users
**Linking
***to external data sets/services (URL based)
***to eSciDoc resources (Wiki extension to support special eSciDoc tag or URL based)
*Basic integration of community platform with eSciDoc pilot solution
**enable upload (or referencing) and description of data  
**enable invocation of external selected visualization tool from eSciDoc pilot solution (e.g. for FITS data)
**integrate arXiv and ADS (if possible) as sources for fetching publications and pre-prints
*Explore possibilities for federated search (within community platform, eSciDoc repository, 1-2 external services)


====ToDo====
== AWOB History ==
Clarify:
*which data to be supported in pilot phase
*precise functional requirements for data management (Scenario level)
*eSciDoc managed vs externally referenced data
*formats (e.g. FITS) and how they should be supported (e.g. storage, search, visualization via external tools, etc.)
*types of annotated sources and relating them to external services
*available external client tools for demonstration (quick win)


==Work distribution==
AWOB has grown out of two main developments. On the one hand, there is the International Virtual Observatory Alliance. Developments in its German node, GAVO, have led us to see the importance of a centralised organisation of data long before final publication takes place. We also received requests for collaborative environments and self-publication of data.
On the other hand, the MPDL has developed tools for the sustainable long-term archiving of research data, the integration of external tools and services, and the conception of user-centred working environments.


Workpackages based on pilot approach
Having said this, AWOB is extending the assistance for scientists to the whole scientific life cycle from group-building and data acquisition to e-publications and long term archiving. The implementation of AWOB is thereby supported by the tools and infrastructure already developed by the MPDL.
*Wiki selection
*Basic user management
*Demonstrator solution
**Architecture
***main components (wiki, eSciDoc)
***interaction between Wiki, eSciDoc repository, eSciDoc solutions
***identification of existing services to be involved, possibly with modification
***checking if existing eSciDoc solutions (PubMan) can be re-used and identify necessary modifications
**Implementation


==Required resources==
<!--
*new staff at institute(s)
== Scientific Life Cycle ==
*new staff at MPDL
*overall costs (human resources, hardware, other)
==Organisational==
*Institutes involved
check the possibility of starting with a pilot phase with one or two institutes, to deliver quick and convincing results. After the first showcase, other institutes can join.
*Responsible for proposal
*Required budget (total, annual)


=Meetings=
AWOB aims at assisting the scientists throughout the scientific life cycle. A life cycle diagram will be available soon.
==15th sept 2008==
-->
*first brainstorming at MPDL/Munich
== Time Schedule ==
==25th sept 2008==
 
*updated presentation by Gerard/Jaiwon/Wolfgang (see under [https://zim01.gwdg.de/repos/smc/trunk/02_Related_Material/02_Projects/GAVO/mpdl20080912.ppt SVN of MPDL (restricted)])
The project is structured in three phases:
*outcome:
Phase one will focus on building a demonstrator and subsequent first release of the community platform together with two project partners (MPE and MPA). Phase two will focus on active outreach to the astrophysical community within the MPG and other organizations, to gather necessary feedback on extensions, improvements and their priorities. Other potential partner institutes will be invited to join the project and we will interact with the Scholarly Workbench working group for sharing experiences and potential collaborations. In Phase three, resulting in the final AWOB release,  we will focus on the holistic scenario of an enhanced publication, i.e. publishing projects, data, and publications. In addition, we will design and incorporate additional tools and interfaces to external systems and databases and produce the final AWOB release. Some of these extensions will be based on the feedback received during the previous phases. In this last phase particular attention will be given to activities concerning outreach and training to MPG and other external communities. This will enable a larger community to use this collaborative platform or parts of it for their own purposes. Finally a cost model will be developed to ensure long term sustainability of the AWOB community platform.
**First draft of MPDL project proposal until 10th of oct (Ulla, Natasa), focusing on requirements and approach
 
**First draft of [[User:Natasab/ESciDoc_HowTo|eSciDoc HowTo]] for definition of high-level requirements
== Demonstrator ==
 
A Liferay-based Demonstrator of the AWOB platform was available in Nov 2011. With this Demonstrator, we approached our associated scientists, who performed the first tests of the platform.
 
== First Productive Version AWOB 0.5 ==
 
July 2012: The first productive instance of AWOB (AWOB 0.5) was delivered.
 
== Public Minutes of Meetings ==
 
* [[AWOB_Meeting_2011-05-12|Kick-off meeting (2011-05-12)]]
* [[AWOB_Meeting_2011-06-09|Public Intro (2011-06-09)]]

Latest revision as of 11:36, 19 February 2013



Initiators, Coordinators, the Team, and Partners

AWOB in Few Words

AWOB, the Astronomer's Workbench, is a web-based platform for collaboration, data evaluation, and publication. It helps scientific working groups of any size to enhance their communication and to share resources, data, results, publication texts etc. throughout the whole scientific life cycle.

Resulting e-publications and publications to the Virtual Observatory allow long term archiving of data, the annotation of metadata as well as easy access of digital outcomes by other users.

The AWOB project has a duration of 3 years and is based on experiences made by the German Astrophysical Virtual Observatory (GAVO).

AWOB and the Scientific Life Cycle

[Figure: Scientific Life Cycle]

Typical scientific lifecycle in astrophysics. The whole lifecycle is supported by AWOB. Copyright for the multi-wavelength image of the Milky Way illustrating "correlation with databases": NASA (National Aeronautics and Space Administration), Goddard Space Flight Center. Within this reference you can find the references to the individual observations. Copyright for the M 104 mid-infrared image illustrating "data visualization": NASA/JPL-Caltech/R. Kennicutt (University of Arizona), and the SINGS Team.

Public Intro to AWOB as a PDF File

Download a PDF File with a generally understandable public intro to the Astronomer's Workbench. The slides contain supplementary information to this page and some images.

Situation and Challenge in Today's Astronomy

Astronomical research is increasingly organised in projects that can include hundreds of collaborators, often spread out over the globe. In collaborations of any size, the task of sharing resources such as data products, images, and texts for publications can be very cumbersome. The raw data of many observatories have to be processed carefully to obtain usable scientific results such as calibrated images. Since the programs of such processing pipelines normally remain unpublished, it is difficult for users not familiar with a certain instrument to mine the data.

Example of a Typical Astrophysical Quest: A Multi Wavelength Study of NGC 3079

Multi-wavelength studies of objects allow a holistic view. Often, objects are observed by different groups at different wavelengths. The following images, for example, combine optical, radio, and X-ray data of the edge-on spiral galaxy NGC 3079. The data reveal hot gas in the halo of the galaxy. The gas is thrown out of the disk into the halo by a super bubble of ionised gas. The images shown are based on the following data: the images with contour plots, images 1 and 2, are based on optical data (Digital Sky Survey, DSS), X-ray data (Pietsch et al. 1997), and radio and H-α data (both Veilleux et al. 1994). The false colour image shown in image 3 is based on Hubble Space Telescope and Chandra (X-ray) observations. The copyright belongs to NASA/CXC/STScI/U.North Carolina/G.Cecil.

E-publications, as supported by AWOB, make calibrated images available to all scientists and thereby facilitate multi-wavelength studies.

[Images: NGC3079; NGC3079 HST + X-rays]

The Idea of an Astronomer's Workbench

The project Astronomer’s Workbench (AWOB) aims to assist astronomers in these tasks by building a web based platform that will enhance the communication between the scientists and support the centralised organisation and collective usage of resources, ideas, data, results, and documentation. Use of AWOB should significantly improve coordination of these collaborative projects from their initial formulation to their completion as a publication. An additional focus of AWOB is to facilitate the publication of the project data to the astronomy community, both using Virtual Observatory (VO) standards, and as enhanced e-Publications.

Functional Components

To support this idea, the AWOB platform will contain the following main functional components:

  • Explicit project and user management modules for organising and coordinating a project and its associated collaborators.
  • Role-based authentication and authorisation modules for restricting access to the project resources to selected users.
  • Centralised storage and management of the shared data and other resources for a given project.
  • Value-adding tools for resource management tasks such as search, visualisation and analysis.
  • A web portal providing easy access to external web services and public archives provided by the astronomy community, where possible using Virtual Observatory standards.
  • Metadata extraction and annotation for publication of finalised data products according to Virtual Observatory standards.
  • One-click e-publication of data and the relevant artefacts generated during the project.

AWOB History

AWOB has grown out of two main developments. On the one hand, there is the International Virtual Observatory Alliance. Developments in its German node, GAVO, have led us to see the importance of a centralised organisation of data long before final publication takes place. We also received requests for collaborative environments and self-publication of data. On the other hand, the MPDL has developed tools for the sustainable long-term archiving of research data, the integration of external tools and services, and the conception of user-centred working environments.

Having said this, AWOB is extending the assistance for scientists to the whole scientific life cycle from group-building and data acquisition to e-publications and long term archiving. The implementation of AWOB is thereby supported by the tools and infrastructure already developed by the MPDL.

Time Schedule

The project is structured in three phases.

Phase one will focus on building a demonstrator and the subsequent first release of the community platform together with two project partners (MPE and MPA).

Phase two will focus on active outreach to the astrophysical community within the MPG and other organisations, to gather the necessary feedback on extensions, improvements and their priorities. Other potential partner institutes will be invited to join the project, and we will interact with the Scholarly Workbench working group to share experiences and explore potential collaborations.

Phase three, which results in the final AWOB release, will focus on the holistic scenario of an enhanced publication, i.e. publishing projects, data, and publications. In addition, we will design and incorporate additional tools and interfaces to external systems and databases. Some of these extensions will be based on the feedback received during the previous phases. In this last phase, particular attention will be given to outreach and training activities for the MPG and external communities. This will enable a larger community to use this collaborative platform, or parts of it, for their own purposes. Finally, a cost model will be developed to ensure the long-term sustainability of the AWOB community platform.

Demonstrator

A Liferay-based Demonstrator of the AWOB platform was available in Nov 2011. With this Demonstrator, we approached our associated scientists, who performed the first tests of the platform.

First Productive Version AWOB 0.5

July 2012: The first productive instance of AWOB (AWOB 0.5) was delivered.

Public Minutes of Meetings