Escidoc-core Migration v1.1 to v1.2

Step-by-step instructions to migrate escidoc-core 1.1 to escidoc-core 1.2. (The detailed information relates to the production environment of the MPDL, the Max Planck Digital Library.)

Affected servers

  • srv01.mpdl.mpg.de (PostgreSQL and Fedora, the Flexible Extensible Digital Object Repository Architecture)
  • coreservice.mpdl.mpg.de (srv02.mpdl.mpg.de) (escidoc-core)
  • pubman.mpdl.mpg.de (PubMan, Publication Management)
  • faces.mpdl.mpg.de (Faces)
  • virr.mpdl.mpg.de (Virr, Virtueller Raum Reichsrecht)
  • peer-coreservice.mpdl.mpg.de (PubMan)
  • any other instance accessing coreservice.mpdl.mpg.de

Required software packages

all available at https://www.escidoc.org/JSPWiki/en/DownloadForRelease1.2

  • foxml_migration-jar-with-dependencies.jar
    • plus xsl folder containing XSLT stylesheets:
wget https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/foxml_migration/xsl/foxml_context.xsl --no-check-certificate
wget https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/foxml_migration/xsl/foxml_face.xsl --no-check-certificate
wget https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/foxml_migration/xsl/foxml_facesAlbum.xsl --no-check-certificate
wget https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/foxml_migration/xsl/foxml_file.xsl --no-check-certificate
wget https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/foxml_migration/xsl/foxml_orgUnit.xsl --no-check-certificate
wget https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/foxml_migration/xsl/foxml_pubItem.xsl --no-check-certificate
wget https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/foxml_migration/xsl/foxml_virrElement.xsl --no-check-certificate
wget https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/foxml_migration/xsl/transform_context.xsl --no-check-certificate
wget https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/foxml_migration/xsl/transform_face.xsl --no-check-certificate
wget https://subversion.mpdl.mpg.gwdg.de/repos/common/trunk/common_services/foxml_migration/xsl/transform_facesAlbum.xsl --no-check-certificate
wget https://subversion.mpdl.mpg.gwdg.de/repos/common/trunk/common_services/foxml_migration/xsl/transform_file.xsl --no-check-certificate
wget https://subversion.mpdl.mpg.gwdg.de/repos/common/trunk/common_services/foxml_migration/xsl/transform_orgUnit.xsl --no-check-certificate
wget https://subversion.mpdl.mpg.gwdg.de/repos/common/trunk/common_services/foxml_migration/xsl/transform_pubItem.xsl --no-check-certificate
wget https://subversion.mpdl.mpg.gwdg.de/repos/common/trunk/common_services/foxml_migration/xsl/transform_virrElement.xsl --no-check-certificate
wget https://subversion.mpdl.mpg.gwdg.de/repos/common/trunk/common_services/foxml_migration/xsl/ves-mapping.xml --no-check-certificate
    • plus migration.properties file

available at https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/foxml_migration
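
The individual wget calls listed above can be collapsed into a loop. The following is a minimal sketch, assuming GNU wget is installed; the names BASE_URL, XSL_FILES, xsl_urls and download_xsl are illustrative and not part of the migration tooling. Note that the listing above uses two different host names, so BASE_URL may need adjusting.

```shell
#!/bin/sh
# Base URL of the xsl folder, copied from the listing above (assumption: one host serves all files).
BASE_URL="https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/foxml_migration/xsl"

# File names copied from the listing above.
XSL_FILES="foxml_context.xsl foxml_face.xsl foxml_facesAlbum.xsl foxml_file.xsl
foxml_orgUnit.xsl foxml_pubItem.xsl foxml_virrElement.xsl
transform_context.xsl transform_face.xsl transform_facesAlbum.xsl transform_file.xsl
transform_orgUnit.xsl transform_pubItem.xsl transform_virrElement.xsl ves-mapping.xml"

# Hypothetical helper: print one download URL per file.
xsl_urls() {
    for f in $XSL_FILES; do
        printf '%s/%s\n' "$BASE_URL" "$f"
    done
}

# Hypothetical helper: fetch every file into ./xsl; stops at the first failure.
# It is only defined here, call it explicitly when needed.
download_xsl() {
    mkdir -p xsl || return 1
    for url in $(xsl_urls); do
        wget --no-check-certificate -P xsl "$url" || return 1
    done
}
```

Running xsl_urls first shows exactly what download_xsl would fetch.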

Pre-Migration requirements

  • shut down all solutions accessing coreservice.mpdl.mpg.de
    • stop the JBoss application server on
      • pubman.mpdl.mpg.de
      • faces.mpdl.mpg.de
      • virr.mpdl.mpg.de
    • stop the JBoss application server on any other instance that accesses coreservice.mpdl.mpg.de
  • shut down the escidoc-core instance
    • stop the JBoss application server on coreservice.mpdl.mpg.de
  • shut down the Fedora instance
    • /X/fedora/tomcat/bin/shutdown.sh on srv01.mpdl.mpg.de
  • coreservice.mpdl.mpg.de only

Main migration steps

Installation of Fedora 3.3.1

  • on srv01.mpdl.mpg.de change to directory /X
  • delete the symbolic link fedora
  • DO NOT DELETE THE EXISTING FEDORA INSTANCE INSIDE /X/fedora3.2.1 !!!
  • create a new directory fedora3.3
  • create a new symbolic link fedora pointing to the new directory fedora3.3
  • drop old fedora databases "fedora32" and "riTriples"
  • CREATE DATABASE "fedora3" WITH ENCODING='UTF8' OWNER="fedoraAdmin";
  • CREATE DATABASE "riTriples" WITH ENCODING='SQL_ASCII' OWNER="fedoraAdmin" TEMPLATE=template0;
  • install Fedora 3.3.1
    • java -jar fcrepo-installer-3.3-fix01.jar
    • instructions can be found here: https://www.escidoc.org/JSPWiki/en/Fedora3.3Installation
    • when the installation is finished
      • copy the entire data directory located in the old Fedora instance (/X/fedora3.2.1/data) into the newly installed Fedora instance
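
The symbolic-link switch described in the first steps of this list can be sketched as a small shell function. This is an illustration only: the function name switch_fedora_link is hypothetical, and the safety check (refusing to remove anything that is not a symbolic link) is an addition to the steps above.

```shell
#!/bin/sh
# Hypothetical helper: repoint the "fedora" symlink inside base_dir to a new
# installation directory without touching the old installation.
# The body runs in a subshell so the caller's working directory is unchanged.
switch_fedora_link() (
    base_dir=$1      # /X in this document
    new_dir=$2       # fedora3.3 in this document
    cd "$base_dir" || exit 1
    # Only remove "fedora" if it is a symlink, never a real directory.
    if [ -e fedora ] && [ ! -L fedora ]; then
        echo "fedora is not a symbolic link, aborting" >&2
        exit 1
    fi
    rm -f fedora
    mkdir -p "$new_dir" && ln -s "$new_dir" fedora
)
```

With the paths used on this page, the call would be switch_fedora_link /X fedora3.3; the old /X/fedora3.2.1 directory is left in place.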

Migration of the FOXML (Fedora Object XML) files

This will back up the Fedora data directory!

  • go to escidoc-core-admin-1.2
  • edit the admin-tool.properties file
    • check the Fedora properties
    • especially fedora-src.home=/X/fedora
  • ANT_OPTS=-Xmx1024m -Xms512m -XX:MaxPermSize=256m
  • call ant foxml-migration-from1.1-to1.2

Research Data Repository Only: Migrating old escidoc:TOC content model

  • locate the escidoc_TOC file inside /X/fedora/data/objects and open it for editing
    • replace all occurrences of "TOC" with "toc"
    • check the creation date of the RELS-EXT.0 and the RELS-EXT.1 datastreams
    • if both dates "CREATED=<date>" are equal, set the one in RELS-EXT.0 to an earlier date
  • update all references to escidoc:TOC in any resource
    • the current migration.properties file has a list of all relevant resources
    • java -jar foxml_migration-jar-with-dependencies.jar cmodel
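
The manual "TOC" to "toc" replacement in the first step could also be done with sed. The sketch below is an assumption about how one might script it, not part of the migration tool; rename_toc is a hypothetical name. The backup is written to a separate directory on purpose, so that no stray copy stays inside the Fedora objects directory. The CREATED date check still has to be done by hand.

```shell
#!/bin/sh
# Hypothetical helper: copy the object file to backup_dir, then rewrite the
# original in place with every "TOC" replaced by "toc".
rename_toc() {
    file=$1
    backup_dir=$2
    [ -f "$file" ] && [ -d "$backup_dir" ] || return 1
    cp -p "$file" "$backup_dir/" || return 1
    # Read from the backup copy and overwrite the original file.
    sed 's/TOC/toc/g' "$backup_dir/$(basename "$file")" > "$file"
}
```

Comparing the result with the backup copy (for example with diff) before restarting Fedora shows exactly which lines were changed.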

Rebuild Fedora

  • edit <fedora-dir>/server/bin/env-server.sh:
exec_cmd="exec \"$java\" -server -Xmn64m -Xms256m -Xmx1024m \
  • start fedora: <fedora-dir>/tomcat/bin/startup.sh
  • rebuild the Fedora database and the Fedora resource index
    • Location: <fedora-dir>/server/bin
    • Script: fedora-rebuild.sh
      • CAUTION! Causes NoClassDefFoundError on "exotic" variants of the JDK. (Let the SUN shine in ...)
      • rebuild first the database
      • rebuild next the resource index

Post-rebuild Fedora steps

  • create new tablespaces
    • check whether the directories for the tablespaces exist; if not, create them
     mkdir /var/lib/pgsql/data/tables
     mkdir /var/lib/pgsql/data/tables/fedora
     mkdir /var/lib/pgsql/data/tables/triples
     mkdir /var/lib/pgsql/data/tables/escidoc-core
     mkdir /var/lib/pgsql/data/tables/statistics
     mkdir /var/lib/pgsql/data/indexes
     mkdir /var/lib/pgsql/data/indexes/fedora
     mkdir /var/lib/pgsql/data/indexes/triples
     mkdir /var/lib/pgsql/data/indexes/escidoc-core-large
     mkdir /var/lib/pgsql/data/indexes/escidoc-core-normal
     mkdir /var/lib/pgsql/data/indexes/statistics
     cd /var/lib/pgsql/data/
     chown -R postgres:postgres *
    • run the script (as postgres user)
    https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/framework_access/src/test/resources/migration_1.1_1.2/create_tablespaces.sql (create new tablespaces)
  • Fedora database
    • run the scripts (as postgres user)
    https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/framework_access/src/test/resources/migration_1.1_1.2/set_fedora_tables_to_tablespace.sql (re-set fedora tables tablespace)
    https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/framework_access/src/test/resources/migration_1.1_1.2/set_fedora_indexes_to_tablespace.sql (re-set fedora indexes tablespace)


  • RiTriples database
    • run the scripts (as postgres user)
    • NOTE: It is recommended to re-run the select queries in these scripts and to create additional scripts from their results.
    • The data in the triple store changes daily, and so do the tables (whenever new data or predicates are added).
    • this has to be done AFTER the Fedora resource index is rebuilt.
    https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/framework_access/src/test/resources/migration_1.1_1.2/set_triples_tables_to_tablespace.sql (re-set triples tables tablespace)
    https://subversion.mpdl.mpg.de/repos/common/trunk/common_services/framework_access/src/test/resources/migration_1.1_1.2/set_triples_indexes_to_tablespace.sql (re-set triples indexes tablespace)
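
The mkdir sequence for the tablespace directories shown earlier in this section can be written as a loop. This is a minimal sketch; make_tablespace_dirs is a hypothetical name, and the ownership change from the listing (chown -R postgres:postgres) is deliberately left out because it requires root.

```shell
#!/bin/sh
# Hypothetical helper: create the tablespace directories below a PostgreSQL
# data directory. mkdir -p also creates the "tables" and "indexes" parents
# and does not fail if a directory already exists.
make_tablespace_dirs() {
    data_dir=$1     # /var/lib/pgsql/data in this document
    for d in tables/fedora tables/triples tables/escidoc-core tables/statistics \
             indexes/fedora indexes/triples indexes/escidoc-core-large \
             indexes/escidoc-core-normal indexes/statistics; do
        mkdir -p "$data_dir/$d" || return 1
    done
}
```

After make_tablespace_dirs /var/lib/pgsql/data, the chown step from the listing still has to be run before create_tablespaces.sql is executed.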

Migration of the escidoc-core database

  • ensure there is enough disk space, because the db-migration ant task requires a large amount of it!
  • back up the escidoc-core database on coreservice.mpdl.mpg.de !!!
    • log in to the system as the postgres user
    • in the desired backup directory, issue
       pg_dump -f escidoc_core_database.dmp -C escidoc-core

you need to know the password of the user postgres in the PostgreSQL database

  • The database migration uses certain fingerprints in order to determine the current version of the db. The layout of our databases sometimes differs from the one assumed by the db-migration tool.
  • To check if the database layout is fine
    • go to escidoc-admin tool
    • get repository info
    • usually (except for very old versions of the escidoc-admin tool) the output states whether the database is consistent or not
  • IF the database is not consistent, run the following script prior to the db-migration:
  • Modify admin-tool.properties.
    • Check the following entries:
    • datasource.script.prefix=
    • datasource.url=check-which-one
    • datasource.driverClassName=org.postgresql.Driver
    • datasource.username=check-which-one
    • datasource.password=check-which-one
    • creator_id=the user-id of a sysadmin user (e.g. escidoc:user42 for roland)
    • persistentHandle=the handle taken from aa.user_login_data
    • escidoc.database.tablespace.list=pg_default
  • go to admin-tool
  • Get repository info (reports whether the fingerprint differs)
    • DO NOT RUN db-migration unless database is consistent!!!
      • if the database is not consistent, compare the fingerprint script of 1.1.0 with the fingerprint script /tmp/fingerprint.xml
      • this should give an idea what to modify manually
      • repeat get-repository-info until consistency is achieved
  • call ant db-migration
    • NOTE: for the research data repository, prior to calling ant db-migration please drop table aa.user_account_backup. All the rest is prepared. --Natasa 12:20, 17 September 2010 (UTC)
    • DO NOT USE THE JAR FILE with the db-migration argument!
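
The disk-space precondition from the first step of this section can be checked before the backup is started. The sketch below assumes a POSIX df; has_free_kb is a hypothetical helper, and no concrete space requirement is known from this page, so any threshold passed to it is the operator's own estimate.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file system holding dir has at
# least required_kb kilobytes available.
has_free_kb() {
    dir=$1
    required_kb=$2
    # df -Pk prints one header line and one data line; field 4 is the
    # available space in 1024-byte blocks.
    available_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$available_kb" -ge "$required_kb" ]
}
```

A possible use is has_free_kb . "$NEEDED_KB" && pg_dump -f escidoc_core_database.dmp -C escidoc-core, where NEEDED_KB is a value chosen by the operator.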

JBoss patch

  • back up escidoc-core.properties
  • copy everything inside jboss-patch-1.2/server to /usr/share/jboss on coreservice.mpdl.mpg.de, ignore jboss-patch-1.2/bin
  • remove old file wstx-asl-3.2.4.jar from <JBOSS_HOME>/server/default/lib.
  • Change properties in escidoc-core.properties according to escidoc-core.properties backup

Post-Migration steps

  • shut down core-service
  • shut down fedora
  • PostgreSQL must keep running
  • vacuum analyze all tables in the escidoc-core, riTriples and Fedora databases (all as postgres user)
       helper: issue following sql to get the script (on each of the mentioned databases):
      
       select 'vacuum analyze '||table_schema||'.'||table_name||'; ' from information_schema.tables where table_schema in ('list', 'aa', 'public')
       copy/paste result in a new PgAdmin SQL Editor
       replace all " with nothing (Edit->Find and Replace->Find what " -> Replace All) 
       run all commands together as PgScript (or use F6)
  • change the tablespaces in escidoc-core database
      run the script (as postgres user)
 
      re-set escidoc-core tables tablespace
  • drop list.property indexes (chance for faster recache), create some aa.indexes, list.filter indexes
      run the script (as postgres user)
 
      postmigration-pre-recache script
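
The PgAdmin copy/paste helper for the vacuum step above can also be scripted on the command line. The following is a sketch under assumptions: psql is available to the postgres user, the function names are hypothetical, and the additional filter table_type = 'BASE TABLE' (to skip views) is not part of the original query.

```shell
#!/bin/sh
# Hypothetical helper: turn "schema|table" lines, as printed by psql -At,
# into VACUUM ANALYZE statements with quoted identifiers.
vacuum_statements() {
    awk -F'|' 'NF == 2 { printf "VACUUM ANALYZE \"%s\".\"%s\";\n", $1, $2 }'
}

# Hypothetical wrapper (only defined, not called): list the tables of one
# database, generate the statements and feed them back into psql.
vacuum_database() {
    db=$1
    psql -At -d "$db" -c "select table_schema, table_name
        from information_schema.tables
        where table_schema in ('list', 'aa', 'public')
          and table_type = 'BASE TABLE'" \
        | vacuum_statements | psql -d "$db"
}
```

Calling vacuum_database once per database (escidoc-core, riTriples and the Fedora database) replaces the manual find-and-replace of the quote characters.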

escidoc-core deployment

  • copy the escidoc-core-bin-1.2/ear/escidoc-core.ear file into /usr/share/jboss/server/default/deploy on coreservice.mpdl.mpg.de

Recache and reindex

  • Change Property for Lucene Indexing stylesheet in escidoc-core.properties:
gsearch.escidoc.indexingStylesheet = http://escidoc1.escidoc.mpg.de/resources/searchIndexDefinition/mpdlEscidocXmlToLucene_1.2.xslt
  • start JBoss on coreservice.mpdl.mpg.de
  • start Fedora on srv01.mpdl.mpg.de
  • launch the escidoc-core admin tool at http://coreservice.mpdl.mpg.de/adm/admin.jsp
  • recache (ensure the "clear cache" checkbox is selected!) (MUST BE RUN FIRST, see afterwards POST-RECACHE below)
  • reindex (ensure the "clear index" checkbox is selected!) (MUST BE RUN SECOND, as it reads the SOAP representation from the cache)
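
The property change in the first step of this list can be applied with a small helper instead of a manual edit. This is a sketch only; set_property is a hypothetical function, and it assumes simple single-line "key = value" entries and values without backslashes (awk -v would interpret them as escapes).

```shell
#!/bin/sh
# Hypothetical helper: set "key = value" in a Java properties file, replacing
# an existing entry for that key or appending a new one.
set_property() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        BEGIN { FS = "=" }
        # Strip surrounding blanks from the part before the first "=".
        { name = $1; gsub(/^[ \t]+|[ \t]+$/, "", name) }
        name == k { print k " = " v; found = 1; next }
        { print }
        END { if (!found) print k " = " v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For this step the call would pass the path of escidoc-core.properties, the key gsearch.escidoc.indexingStylesheet and the stylesheet URL given above; commented-out lines starting with # are left untouched.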

Post-recache

  • NOTE: the POST-recache procedure can run in parallel with "reindexing" from eSciDoc Admin (DONE on coreservice pubman)
  • create new indexes on list.property
      run the script (as postgres user on psql from command line)
 
      set_list_property_indexes script


  • re-vacuum analyze the escidoc-core, riTriples and Fedora databases
            vacuum analyze
  • run the script with changed policies (as postgres user)
               change of policies


  • (optional) see Performance
    • NOTE: the escidoc-core database on production is prepared, no need to change anything; use these tips for the other databases

deployment of solutions

  • prior to the deployment of Faces and Virr
    • the Java version on virr.mpdl.mpg.de and on faces.mpdl.mpg.de has to be upgraded to version 1.6
  • deploy the appropriate solution ear files to pubman, faces and virr
  • apply necessary updates to the solution specific properties files
  • start the JBoss instances for the three solutions
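
The Java 1.6 requirement for the Faces and Virr hosts can be verified before deployment. The sketch below parses a version string of the form printed by java -version; java_is_at_least_6 is a hypothetical helper, and the parsing assumes either the legacy "1.x" scheme or the newer "x.y.z" scheme.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given Java version string denotes
# version 1.6 (Java 6) or newer.
java_is_at_least_6() {
    version=$1                 # e.g. 1.5.0_22, 1.6.0_20 or 17.0.1
    major=${version%%.*}
    # Legacy scheme: "1.6.0_20" means Java 6, so look at the second field.
    if [ "$major" = "1" ]; then
        rest=${version#1.}
        major=${rest%%.*}
    fi
    [ "$major" -ge 6 ]
}

# Hypothetical helper (only defined, not called): extract the quoted version
# from the first matching line of "java -version", which prints to stderr.
installed_java_version() {
    java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }'
}
```

On each host, java_is_at_least_6 "$(installed_java_version)" then returns success only if the upgrade has already been done.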