ESciDoc Services DataAcquisitionHandler

ID (Label)

DAAS (Data As A Service)

Complete Name

Data Acquisition Service

Status

Implemented

Description

Acquisition service for data from internal and external sources, exposed through an unAPI interface.

Operations Overview

Operation: explainSources
Status: implemented
Input: none
Output: String
Scope: Public
Description: Returns a list of all sources available for acquisition and the formats that can be fetched from each source.

Operation: doFetch
Status: implemented
Input: sourceName: String, identifier: String
Output: byte[]
Scope: Public
Description: Fetches data from the specified source. The data is returned in the default format defined in sources.xml.

Operation: doFetch
Status: implemented
Input: sourceName: String, identifier: String, Format: String
Output: byte[]
Scope: Public
Description: Fetches data from the specified source and returns it in the requested format. This can be either a format the external source provides or a format that can be produced by transformation from one the source provides. Format properties use their default values.

Operation: doFetch
Status: implemented
Input: sourceName: String, identifier: String, trgFormatName: String, trgFormatType: String, trgFormatEncoding: String
Output: byte[]
Scope: Public
Description: Fetches data from the specified source and returns it in the requested format. This can be either a format the external source provides or a format that can be produced by transformation from one the source provides. Format properties use their default values.

Operation: doFetch
Status: implemented
Input: sourceName: String, identifier: String, Formats: Format[]
Output: byte[]
Scope: Public
Description: Fetches data from the specified source in the requested formats. The fetched data is returned as a zip archive. Currently, only file fetching is possible for multiple formats.

Operation: doFetch
Status: implemented
Input: sourceName: String, identifier: String, Formats: String[]
Output: byte[]
Scope: Public
Description: Fetches data from the specified source in the requested formats. The fetched data is returned as a zip archive. Currently, only file fetching is possible for multiple formats. Format properties use their default values.
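
The format rule described for doFetch (a requested format is served either directly by the source or through a transformation from a format the source provides, with the default used when no format is requested) can be sketched as follows. This is a minimal Python illustration, not the service's implementation: the SourceConfig class, its fields, and the example format names are assumptions made for this sketch. The real definitions live in sources.xml.

```python
from dataclasses import dataclass, field


@dataclass
class SourceConfig:
    """Hypothetical stand-in for one source entry in sources.xml."""
    name: str
    default_format: str
    provided_formats: set = field(default_factory=set)
    # Maps a target format to the provided format it can be derived from.
    transformable: dict = field(default_factory=dict)


def resolve_format(source, requested=None):
    """Return (format_to_fetch, needs_transformation) for a request."""
    if requested is None:
        # No format given: fall back to the source's default format.
        return source.default_format, False
    if requested in source.provided_formats:
        # The source delivers this format directly.
        return requested, False
    if requested in source.transformable:
        # Fetch a provided format and transform it afterwards.
        return source.transformable[requested], True
    raise ValueError(f"format {requested!r} is not available for {source.name}")


# Illustrative data only; the real formats per source come from explainSources.
example = SourceConfig(
    name="example-source",
    default_format="native-xml",
    provided_formats={"native-xml", "pdf"},
    transformable={"bibtex": "native-xml"},
)
```

With the illustrative data, resolve_format(example, "bibtex") reports that the source's "native-xml" has to be fetched and then transformed, while a format that is neither provided nor transformable raises an error.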

Supported Systems

  • Implemented:
    • eSciDoc [no mapping required]
    • arXiv [find mapping here] Important note: fetching from arXiv only succeeds if the PubMan server is registered at arxiv.org; otherwise the fetch attempts are blocked.
    • PubMed Central [find mapping here]
    • Spires [find mapping here]
    • BioMed Central [find mapping here]


  • In Design:


Service interfaces

The four steps to fetch data:

1. Choose the presentation of the data

dataacquisition/view: Displays the fetched data in the browser
dataacquisition/download: Provides the fetched data as a download

2. Specify the interface you want to use (currently only unAPI)

dataacquisition/view/unapi
dataacquisition/download/unapi

3. Provide the identifier of the item you want to fetch

dataacquisition/view/unapi?id=escidoc:1234
dataacquisition/download/unapi?id=escidoc:1234

4. Provide the format you want the fetched item in

dataacquisition/view/unapi?id=escidoc:1234&format=bibtex
dataacquisition/download/unapi?id=escidoc:1234&format=bibtex
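
The four steps above can be combined into a small helper that assembles the request URL. This is a sketch under assumptions: the function name build_unapi_url and the base URL used in the usage note are hypothetical, and the colon in the identifier is left unescaped only to match the examples shown above.

```python
from urllib.parse import urlencode

# Step 1: the two presentations named in this section.
PRESENTATIONS = ("view", "download")


def build_unapi_url(base_url, presentation, identifier=None, fmt=None):
    """Assemble a dataacquisition request URL (hypothetical helper)."""
    if presentation not in PRESENTATIONS:
        raise ValueError(f"unknown presentation: {presentation!r}")
    # Step 2: unAPI is currently the only interface.
    url = f"{base_url.rstrip('/')}/dataacquisition/{presentation}/unapi"
    params = {}
    if identifier is not None:
        # Step 3: the identifier of the item to fetch.
        params["id"] = identifier
    if fmt is not None:
        if identifier is None:
            raise ValueError("a format can only be given together with an identifier")
        # Step 4: the format the item should be returned in.
        params["format"] = fmt
    if params:
        # Keep ':' readable so the result matches the examples above.
        url += "?" + urlencode(params, safe=":")
    return url
```

For instance, build_unapi_url("http://localhost:8080", "view", "escidoc:1234", "bibtex") yields http://localhost:8080/dataacquisition/view/unapi?id=escidoc:1234&format=bibtex, where the host is a placeholder for the actual server.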

Supported Identifiers:

1. An identifier from a supported source (explained in /dataacquisition).

2. An identifier that is an arbitrary URL. The eSciDoc DataAcquisition Service has no information about such a source and can only try to call the given URL for the fetch request.

For URL identifiers, the format has to be set to "url". The response is a zip file of the fetched content. The view option is disabled for URL identifiers.
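
A request for a URL identifier, and the handling of its zip response, might look like the following sketch. The helper names and the base URL are hypothetical, and whether the service expects the URL identifier to be percent-encoded is an assumption made here. No network call is made; the response body is passed in as bytes.

```python
import io
import zipfile
from urllib.parse import urlencode


def build_url_fetch_request(base_url, target_url):
    """Build a download request for an arbitrary URL identifier."""
    # Only "download" is used because the view option is disabled for URLs.
    # Assumption: the identifier is percent-encoded as a query value.
    query = urlencode({"id": target_url, "format": "url"})
    return f"{base_url.rstrip('/')}/dataacquisition/download/unapi?{query}"


def list_zip_entries(payload):
    """Return the file names contained in a zip response body (bytes)."""
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        return archive.namelist()
```

The same list_zip_entries helper would also apply to the multi-format doFetch variants, since those return their data as a zip archive as well.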


Future Development

  • Prioritize fetching formats for import (client side or server side?). For example, fetch PDF and, if that is not possible, fetch DOC.
  • Prevent DAAS from becoming a security leak for the sources it fetches from.
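
The format prioritization mentioned in the first item could be handled on the client side along the following lines. This is only a sketch of one possible approach, not a planned design: the fetch callable is a hypothetical stand-in for doFetch(sourceName, identifier, Format), and catching a generic Exception is an assumption because the service's actual error types are not documented here.

```python
def fetch_first_available(fetch, source_name, identifier, preferred_formats):
    """Try formats in priority order; return (format, data) for the first success."""
    errors = {}
    for fmt in preferred_formats:
        try:
            return fmt, fetch(source_name, identifier, fmt)
        except Exception as exc:  # assumption: real error types are unknown
            errors[fmt] = exc
    raise LookupError(
        f"none of the formats {list(preferred_formats)} could be fetched: {errors}"
    )
```

Calling it with preferred_formats=["pdf", "doc"] returns the PDF when available and falls back to DOC otherwise.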