Dariah Service interoperability (AP1.3)

MPDL

Link to Dariah wiki:

= Presentation =

What is interoperability

 * Interoperability is a property referring to the ability of diverse systems and organizations to work together (inter-operate). (Source: Wikipedia)

Context

 * Dariah is a European infrastructure for research in the arts and humanities.
 * Extremely heterogeneous area:
 * Wide spectrum of research areas
 * Multilingualism
 * Large number of institutions
 * Technical status/level?

Interoperability goals in the Dariah infrastructure

 * There are 2 use cases:
 * 1) interoperability for Dariah services (including all software and databases developed within the Dariah infrastructure and the elements of the infrastructure itself)
 * 2) interoperability for existing services (i.e. services created independently of Dariah)

= References / State of the art =

Links about interoperability

 * in facto (open standards) vs post facto interoperability
 * ISO/IEC 2382-01
 * e-GIF
 * NIEM
 * OASIS-open
 * Emergency Data Exchange Language (EDXL) suite of standards including the Common Alerting Protocol (CAP).
 * The ebMS 3.0 Advanced Features Specification: extends the ebMS 3.0 Core Specification with support for ebMS intermediaries (multi-hop), efficient high-volume messaging (bundling) and exchange of very large messages (splitting and compression).
 * The AS4 profile: a light-weight profile of the ebMS 3.0 Core Specification. AS4 is designed with input from GS1 and is a Web Services-based functional superset of both ebMS 2.0 and the EDIINT AS2 standard.
 * Web Services Interoperability Organization (WS-I)
 * CMIS: a domain model and Web services standard for working with enterprise content management repositories and systems.
 * SAML: a standard XML-based framework for the secure exchange of authentication and authorization information.
 * XACML: a standard XML-based protocol for access control policies.
 * XRI: a URI-compatible scheme and resolution protocol for abstract identifiers used to identify and share resources across domains and applications.
 * SPML: a standard XML-based protocol for the integration and interoperation of service provisioning requests.


 * IETF and RFC
 * IDABC
 * European Interoperability Framework: framework for interoperability in public organisations
 * Quick overview
 * SEMIC

Links to similar projects

 * Interedition
 * Interoperability
 * Community Level
 * OSDC (Open Source Developers' Conference) in the humanities
 * Ecology of projects: Juxta, Hermans, Munster, Faust, A32, Nines, Clarin/Dariah, SADE, TEI-c, etc. etc. etc. (Mindmap this?): http://www.interedition.eu/wiki/index.php/AssociatedProjects
 * Connections to various other COST Actions: A32 (Open Scholarly Communities on the Web), IS1005 (Medieval Cultures and Technological Resources), IS0901 (Women Writers in History)
 * Road Map
 * Semantic level
 * Model for collation task
 * Model for modularizing scholarly tasks
 * Networking formats, standards, protocols: lightweight REST/JSON
 * Technical level
 * Web services model / architecture / REST/JSON
 * CollateX + spin off GUIs, pre and post processors
 * Links:
 * http://www.interedition.eu/wiki/index.php/About_microservices
 * The services provided by Interedition are based on the concept of the microservice. A microservice is the atomic element of the infrastructure. Microservices are based on REST, use JSON as their data format, and are cloud-based solutions. They should be small and fast. Microservices make the framework more sustainable (smaller services are easier to maintain) and are easier to replicate (better reliability).
 * Note, as a drawback: more dependencies; the interfaces should be fixed or versioned.
 * http://www.interedition.eu/wiki/index.php/Text-Image-Transcription/Palo_Alto_Model
 * http://www.interedition.eu/wiki/index.php/About_text_sources
 * http://www.interedition.eu/wiki/index.php/About_smart_data_%26_microrepositories
 * http://www.interedition.eu/wiki/index.php/Interaction_model
 * One problem of the microservice model is that it is purely stateless. However, some (macro)services may require interaction with the user, which in turn may need state information about the data. The foreseen solution is to create micro-repositories that store the data status.
 * http://www.interedition.eu/wiki/index.php/Annotation_and_Linking/Darmstadt
 * http://www.interedition.eu/wiki/index.php/Existing_Tools
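
The microservice notes above (stateless, JSON in and out, versioned interface, state kept in a separate micro-repository) could be sketched roughly as follows. This is only an illustration under assumptions: `tokenize_service`, `MicroRepository`, `run_step` and the `version` field are hypothetical names and are not taken from Interedition.

```python
import json

API_VERSION = "1"  # hypothetical: interface version carried in every response


def tokenize_service(request_body: str) -> str:
    """Stateless microservice: JSON in, JSON out, no memory between calls."""
    request = json.loads(request_body)
    tokens = request["text"].split()
    return json.dumps({"version": API_VERSION, "tokens": tokens})


class MicroRepository:
    """Hypothetical micro-repository: keeps the data status outside the services."""

    def __init__(self):
        self._status = {}

    def put(self, doc_id: str, status: dict) -> None:
        self._status[doc_id] = status

    def get(self, doc_id: str) -> dict:
        return self._status.get(doc_id, {})


def run_step(repo: MicroRepository, doc_id: str, text: str) -> dict:
    """A (macro)service step: call the stateless service, then record the state."""
    response = json.loads(tokenize_service(json.dumps({"text": text})))
    repo.put(doc_id, {"step": "tokenized", "token_count": len(response["tokens"])})
    return response
```

The service itself never stores anything; only `run_step` writes to the repository, so the service can be replicated freely while the data status lives in one place.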


 * IMPACT - Improving Access to Text.
 * http://www.impact-project.eu/uploads/media/IMPACT_COORD1_Annual_report_2010_Publishable_summary.pdf
 * http://www.impact-project.eu/uploads/media/IMPACT_2009_Publishable_summary.pdf
 * CHAIN - Coalition of Humanities and Arts Infrastructures and Networks.
 * PLANETS, Preservation and Long-term Access Through NETworked Services.
 * Interoperability framework paper: http://www.planets-project.eu/docs/reports/Planets_IF-D11_ConsolidatedReleaseDocumentation.pdf
 * Java-based framework, using WSDL, SOAP, Java interfaces, XML, JAX-WS.
 * RODA

Other links

 * http://www.ehr-impact.eu/downloads/documents/EHRI_D1_2_Conceptual_framework_v1_0.pdf

Achieving software interoperability (source: Wikipedia)
Software interoperability is achieved through five interrelated ways:
 * 1) Product testing
 * Products produced to a common standard, or to a sub-profile thereof, depend on clarity of the standards, but there may be discrepancies in their implementations that system or unit testing may not uncover. This requires that systems formally be tested in a production scenario – as they will be finally implemented – to ensure they actually will intercommunicate as advertised, i.e. they are interoperable. Interoperable product testing is different from conformance-based product testing as conformance to a standard does not necessarily engender interoperability with another product which is also tested for conformance.
 * 2) Product engineering
 * Implements the common standard, or a sub-profile thereof, as defined by the industry/community partnerships with the specific intention of achieving interoperability with other software implementations also following the same standard or sub-profile thereof.
 * 3) Industry/community partnership
 * Industry/community partnerships, either domestic or international, sponsor standard workgroups with the purpose to define a common standard that may be used to allow software systems to intercommunicate for a defined purpose. At times an industry/community will sub-profile an existing standard produced by another organization to reduce options and thus making interoperability more achievable for implementations.
 * 4) Common technology and IP
 * The use of a common technology or IP may speed up and reduce complexity of interoperability by reducing variability between components from different sets of separately developed software products and thus allowing them to intercommunicate more readily. This technique has some of the same technical results as using a common vendor product to produce interoperability. The common technology can come through 3rd party libraries or open source developments.
 * 5) Standard implementation
 * Software interoperability requires a common agreement that is normally arrived at via an industrial, national or international standard.

Each of these has an important role in reducing variability in intercommunication software and enhancing a common understanding of the end goal to be achieved.

= Dariah recommendations =

Interoperability levels

 * Legal: legal issues, terms of use of services and their data.
 * Organisational: responsibility issues, exchange workflows, documentation, support, etc. (defined via service level agreements)
 * Semantic: data standards (schema/profile, syntax, vocabularies)
 * Technical: interfaces, protocols, languages, implemented services

These levels lead to a Dariah service compliance level, which evaluates how far a service implements each of the interoperability levels.

Legal

 * Level 3 = Terms of agreement signed
 * Could there be different levels of agreement?

Organisational

 * Level 1 = Technical access enabled
 * Level 2 = Documentation provided
 * Level 3 = Contact and support persons defined

Semantic

 * Level 1 = Open data format
 * Level 2 = Defined semantics (via schema, profile, etc.)
 * Level 3 = Schema registered in a schema registry
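
As an illustration of how a compliance level could be derived from the criteria listed above, here is a minimal sketch. It assumes the levels within a dimension are cumulative (level 2 requires level 1, and so on), which this page does not state. The class, field and function names are hypothetical. No technical levels are defined on this page, so that dimension is left out.

```python
from dataclasses import dataclass


def cumulative_level(*criteria: bool) -> int:
    """Count criteria met in order, stopping at the first unmet one.

    Assumption: levels are cumulative within a dimension.
    """
    level = 0
    for met in criteria:
        if not met:
            break
        level += 1
    return level


@dataclass
class ServiceDescription:
    # Hypothetical flags, one per criterion listed on this page
    terms_signed: bool = False
    access_enabled: bool = False
    documentation_provided: bool = False
    contacts_defined: bool = False
    open_data_format: bool = False
    semantics_defined: bool = False
    schema_registered: bool = False

    def compliance(self) -> dict:
        return {
            # Only level 3 is defined for the legal dimension
            "legal": 3 if self.terms_signed else 0,
            "organisational": cumulative_level(
                self.access_enabled,
                self.documentation_provided,
                self.contacts_defined,
            ),
            "semantic": cumulative_level(
                self.open_data_format,
                self.semantics_defined,
                self.schema_registered,
            ),
        }
```

Under the cumulative assumption, a service with a registered schema but no defined semantics would still only reach semantic level 1. Whether that is the intended reading is an open question for the recommendations.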

Data format

 * XML, RDF, JSON, or no format (i.e. format-agnostic)?

Protocols

 * HTTP
 * OAI-PMH, OAI-ORE
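
For OAI-PMH, a request is an HTTP call with a `verb` parameter plus verb-specific arguments. A minimal sketch of building such request URLs is given below. The base URL is a placeholder and not an actual Dariah endpoint, and `oai_pmh_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption), not an actual Dariah service
BASE_URL = "https://repository.example.org/oai"


def oai_pmh_url(base_url: str, verb: str, **arguments: str) -> str:
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Ask the repository to describe itself
identify_url = oai_pmh_url(BASE_URL, "Identify")

# Harvest records in unqualified Dublin Core (oai_dc), the metadata
# format every OAI-PMH repository is required to support
records_url = oai_pmh_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
```

The sketch only builds the URLs. Fetching and parsing the XML responses, and following resumption tokens for large result sets, would be the next steps in a real harvester.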

Interfaces

 * REST

Service registry

 * Do we need one?
 * WADL, WSDL