Use Cases

KIM-AG-IM Europeana Libraries Aggregation Infrastructure

Use Cases of Infrastructure and Harvester

S. Ruehle (CERL) J. Borbinha (IST) G. Pedrosa (IST)

Goals

 * Establish systems and processes capable of ingesting and indexing significant quantities of digitized material, including text, moving images and sound clips.
 * Extend TEL's existing aggregation infrastructure to enable the aggregation of digital content from libraries in Europe for Europeana, including full-text content.
 * Offer metadata and full-text data to Europeana in particular, but also to any other interested Service Provider, with the text fully searchable so that users can search inside books and other materials.

Questions

 * How homogeneous is the content Europeana Libraries has to handle?
    * Concerning content: only digitised books, journals and articles, or more (e.g. photos, video)?
    * Concerning metadata: only "librarian standards" like MARC 21, UNIMARC, MODS, METS, or more (e.g. Dublin Core, EAD)?
    * Concerning full text: will any searchable, machine-readable full text be available in the next 4 to 5 years?

TEL Aggregator
The Data Aggregator realized by the "Europeana Libraries Aggregation Infrastructure", under the responsibility of TEL.

Actors (content origin)

 * Data Provider
    * Person or organisation using the TEL Aggregator to supply content to a Service Provider (esp. Europeana)
 * Data Provider Service
    * Computational service under the control of a Data Provider

Actors (content re-use)

 * Service Provider
    * Person or organisation interested in harvesting data from the TEL Aggregator
 * Service Provider Harvester
    * Remote service operating under the control of a Service Provider

Actors (content processing)

 * Aggregation Team
    * Person or organisation that schedules and monitors the collection, transformation, validation and provision of content in the TEL Aggregator, and validates the results of these processes
 * Administrator
    * Person or organisation that manages, monitors and maintains the infrastructure's systems with regard to reliability and security, and is responsible for the traffic on the systems and for the partners involved

Use Case 1: Technical Reference

 * Supports a forum for publishing and sharing technical reference documents for:
    * Dissemination of information
    * Interaction by all human actors

Questions

 * What technical specifications or other information do libraries need in order to provide data via the TEL Aggregator?
 * What is necessary in the next month?
 * What is needed in the long run?

Use Case 2: Manage Data Provider

 * Supports the registration and management of all the information related to a Data Provider, such as contact data, data collections and harvesting processes.
 * Makes it possible to edit a Data Provider Record (= all information about a Data Provider)
    * A new record is created if the action is for a new Data Provider
    * An existing record is edited if the action is for an already registered Data Provider
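
The create-or-edit behaviour described above can be sketched as follows. This is only an illustration: the class names, the field names (`contact`, `collections`) and the in-memory storage are assumptions, not part of the TEL Aggregator design.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataProviderRecord:
    """All information held about one Data Provider (field names are assumed)."""
    provider_id: str
    contact: str = ""
    collections: list[str] = field(default_factory=list)


class ProviderRegistry:
    """Hypothetical in-memory registry of Data Provider Records."""

    def __init__(self) -> None:
        self._records: dict[str, DataProviderRecord] = {}

    def edit(self, provider_id: str, **changes) -> tuple[DataProviderRecord, bool]:
        """Create the record for a new provider, otherwise update the existing one.

        Returns the record and a flag that is True when it was newly created.
        """
        created = provider_id not in self._records
        record = self._records.setdefault(provider_id, DataProviderRecord(provider_id))
        for name, value in changes.items():
            if not hasattr(record, name):
                raise AttributeError(f"unknown field: {name}")
            setattr(record, name, value)
        return record, created
```

A single `edit` entry point keeps the two cases (new versus already registered Data Provider) behind one action, as in the use case.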

Questions

 * How should the registration system work?
 * What is needed for the first contact?
 * What is needed in the long run?
 * What more should we have in mind?

Use Case 3: Manage Data Schema

 * Support Data Providers in mapping their data schemas to the schemas the TEL Aggregator intends to make available to them (namely those required by Europeana)
 * Provide a mechanism to register and manage data schemas
    * Create a new schema
    * Edit an existing schema
 * Provide a mechanism to edit a data schema transformation
    * Create a new transformation
    * Edit an already registered transformation
 * Provide a mechanism to test data transformations against data already in the system, returning audit reports as feedback
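
A test run of a transformation against existing data, with an audit report as feedback, might look like the sketch below. The function name, the report layout and the field names are assumptions made for illustration; records are modelled as plain dictionaries.

```python
from __future__ import annotations

from typing import Callable

Record = dict[str, str]


def dry_run_transformation(
    transform: Callable[[Record], Record],
    sample: list[Record],
    required_fields: list[str],
) -> dict:
    """Apply `transform` to each sample record and collect an audit report."""
    report = {"total": len(sample), "passed": 0, "failures": []}
    for index, record in enumerate(sample):
        try:
            result = transform(record)
        except Exception as exc:  # a failing transformation is reported, not fatal
            report["failures"].append((index, f"transform raised {exc!r}"))
            continue
        missing = [name for name in required_fields if not result.get(name)]
        if missing:
            report["failures"].append((index, "missing fields: " + ", ".join(missing)))
        else:
            report["passed"] += 1
    return report


# Hypothetical mapping from a provider schema to a target schema.
def to_target(record: Record) -> Record:
    return {"title": record.get("main_title", ""), "creator": record.get("author", "")}


sample = [{"main_title": "Atlas", "author": "Mercator"}, {"author": "Anonymous"}]
report = dry_run_transformation(to_target, sample, ["title", "creator"])
```

Because the run only reads sample records and returns a report, it could be used by either the Data Provider or the Aggregation Team before a transformation is registered.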

Questions

 * What services are needed in this context?
 * Who should do the mappings, data transformation, normalization, enrichment?
    * The Data Provider?
    * The Aggregation Team?
    * Both?
 * What are the schemas that have to be considered?
 * Who will validate the transformed, normalized, enriched data?

Use Case 4: Manage Data Ingest

 * Data Ingest Task = Harvesting by the TEL Aggregator of a Data collection from a Data Provider
 * A new data ingest task is created
 * An existing data ingest task is edited
 * Assessment of the data harvesting should be performed by the Data Provider, but if that actor lacks the technical skills, the Aggregation Team might have to intervene
 * Validation of the data harvesting

Questions

 * How do Data Providers provide their data?
    * OAI-PMH
    * FTP
    * HTTP
 * Should Europeana Libraries provide an upload facility (e.g. an FTP server) for Content Providers?
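
For the OAI-PMH option, a data ingest task could be reduced to a base URL plus a few harvesting parameters. The sketch below only builds the `ListRecords` request URL; the `IngestTask` class and its field names are assumptions, while the `verb`, `metadataPrefix`, `set` and `resumptionToken` arguments come from the OAI-PMH protocol.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode


@dataclass
class IngestTask:
    """Hypothetical configuration of one OAI-PMH data ingest task."""
    base_url: str
    metadata_prefix: str = "oai_dc"
    set_spec: Optional[str] = None


def list_records_url(task: IngestTask, resumption_token: Optional[str] = None) -> str:
    """Build the ListRecords request URL for a first or a follow-up request."""
    if resumption_token:
        # resumptionToken is an exclusive argument: no other parameters are sent with it
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": task.metadata_prefix}
        if task.set_spec:
            params["set"] = task.set_spec
    return f"{task.base_url}?{urlencode(params)}"


task = IngestTask("https://example.org/oai", set_spec="maps")
first_request = list_records_url(task)
```

FTP or HTTP delivery would need a different task type, which is one reason the choice of transfer method matters for the design.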

Use Case 5: Harvest Data Ingest

 * Data Ingest Task = Harvesting by the TEL Aggregator of a Data collection from a Data Provider
 * Execute the data ingest task as configured

Use Case 6: Harvest Data Export

 * Data Export Task = Harvesting of a data collection from the TEL Aggregator by a Service Provider
 * Execute a Data export task at any moment for any available collection of data

Questions

 * Europeana is the main Service Provider, but:
    * Are there any other Service Providers interested in the data?
    * Are the libraries providing content interested in reusing the content after it has been transformed/normalized/enhanced by the TEL Aggregator?
    * Are they interested in using other libraries' content?
    * Will the libraries allow the reuse of their data by other Service Providers?

Use Case 7: Manage Data Exports

 * Data Export Task = Harvesting of a data collection from the TEL Aggregator by a Service Provider
 * Monitor all running data export tasks in real time

Use Case 8: Service Provider Report

 * Provide a report of each execution of a Data Harvest Task performed by a Service Provider
 * Provide an audit report of the results of each Data Harvest Task concerning:
    * Quantity (number of records, attributes in the records, etc.)
    * Quality (consistency of the values of the attributes, conformance with the exporting schema, etc.)
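
A minimal sketch of such an audit report is given below, assuming harvested records are available as dictionaries and that conformance simply means every field of the exporting schema is present and non-empty. The function name and report keys are hypothetical, and a real quality check would be considerably richer.

```python
from __future__ import annotations


def audit_report(records: list[dict[str, str]], schema_fields: list[str]) -> dict:
    """Summarise one harvest run by quantity and by a simple quality check."""
    nonconforming = [
        index
        for index, record in enumerate(records)
        if any(not record.get(name) for name in schema_fields)
    ]
    return {
        # Quantity
        "records": len(records),
        "attributes": sum(len(record) for record in records),
        # Quality: positions of records missing a required schema field
        "nonconforming_records": nonconforming,
    }


harvested = [{"title": "A", "creator": "B"}, {"title": "C"}]
summary = audit_report(harvested, ["title", "creator"])
```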

Questions

 * What information do Content Providers need about the usage of their data by Service Providers?
    * Who are the Service Provider(s)?
    * Number of records, elements/attributes, etc. harvested/used by each Service Provider?

Use Case 9: Configuration and Management

 * Supports configuration and maintenance of the system and its services