Backing up an eXo Platform instance involves backing up the databases and the file systems that hold the JCR index and value storage. JCR backup should be done offline. Almost all data under gatein.data.dir should be backed up (see the details in the "Plan backup" paragraph below).

Start your backup strategy with the concept of a data repository. The backup data needs to be stored somewhere and should be organized to some degree. An eXo Platform instance backup produces a set of files, which can be kept on various storage media (hard disk, tape, optical or solid-state storage) or on a remote backup service.

How the files are organized into catalogs (folders) or spread across different media is up to the particular Platform implementation.

It is highly recommended, however, to apply a backup rotation scheme to make the backup implementation effective and reliable. Also, always use the backup solutions available in your operating system and database software. They simplify backup organization and help to avoid mistakes and data loss.

Plan Platform backup operations taking into account a full stop of the Platform server(s).

Warning

In a Platform cluster, every node should be stopped before a backup is performed.

A Platform backup consists of the following parts:

  • JCR data backup

    • JCR index files, pointed to by the configuration property gatein.jcr.index.data.dir

    • JCR value storage files, pointed to by the configuration property gatein.jcr.storage.data.dir

    • JCR database backup: the database specified in the JNDI configuration of the application server under the name "exo-jcrportal"

  • Organization service database backup: the database specified in the JNDI configuration of the application server under the name "exo-idmportal"

  • Transaction service files backup, pointed to by the configuration property com.arjuna.ats.arjuna.objectstore.objectStoreDir
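A backup script needs the actual directories behind the three file-related properties listed above. The following is a minimal sketch, assuming the properties are available as plain key=value text and may reference each other with ${...} placeholders. The function names and the sample paths are hypothetical and only serve the illustration; the property keys are the ones named in the list.

```python
import re

# Property keys named in the list of backup parts above.
FILE_TARGET_KEYS = (
    "gatein.jcr.index.data.dir",
    "gatein.jcr.storage.data.dir",
    "com.arjuna.ats.arjuna.objectstore.objectStoreDir",
)


def parse_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!" or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def resolve(props, key, max_depth=10):
    """Expand ${other.key} references in the value of key (bounded depth)."""
    value = props[key]
    for _ in range(max_depth):
        expanded = re.sub(
            r"\$\{([^}]+)\}",
            lambda m: props.get(m.group(1), m.group(0)),
            value,
        )
        if expanded == value:
            break
        value = expanded
    return value


def file_backup_targets(text):
    """Return {property key: resolved directory} for the file-based targets."""
    props = parse_properties(text)
    return {k: resolve(props, k) for k in FILE_TARGET_KEYS if k in props}


# Hypothetical sample values, for illustration only.
SAMPLE = """
gatein.data.dir=/opt/plf/gatein/data
gatein.jcr.data.dir=${gatein.data.dir}/jcr
gatein.jcr.index.data.dir=${gatein.jcr.data.dir}/index
gatein.jcr.storage.data.dir=${gatein.jcr.data.dir}/values
com.arjuna.ats.arjuna.objectstore.objectStoreDir=${gatein.data.dir}/jta
"""
```

Calling file_backup_targets(SAMPLE) returns the three resolved directories. The two databases are not covered by this sketch, since they are located through the application server configuration rather than through these properties.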

Be ready for restore: it is recommended to prepare tools (scripts and so on) for restore at the backup planning stage. This makes an actual restore operation quick and safe.

Note

In this context, Platform means a single Portal application, which is the default. If your Platform instance runs several portals, each portal has its own JCR, Organization and Transaction services, so each portal should be backed up separately. This document describes the backup of a single-portal Platform; the procedure can simply be repeated for each portal of your system.

Note

Notes about JCR: only two kinds of JCR files matter for backup, the index and the value storage. The gatein.jcr.data.dir folder (by default $gatein.data.dir/jcr) also contains a swap sub-folder. The swap folder is used for temporary files when BLOBs are stored in the database (see the JCR configuration guide) and does not need to be backed up.
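As an illustration of leaving the swap folder out, here is a minimal sketch that archives a data folder into a gzip-compressed tar file while skipping named top-level sub-folders. The function name and its parameters are hypothetical. It assumes the server is stopped and that the archive is written outside the folder being archived.

```python
import tarfile
from pathlib import Path


def archive_dir(source_dir, archive_path, skip_top_level=("swap",)):
    """Archive source_dir as .tar.gz, skipping the named top-level entries."""
    source = Path(source_dir)
    skipped = set(skip_top_level)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        for entry in sorted(source.iterdir()):
            if entry.name in skipped:
                continue  # e.g. the JCR swap folder holds only temporary files
            # Store paths relative to source_dir so the archive is relocatable.
            tar.add(str(entry), arcname=entry.name)
    return archive_path
```

Storing entries relative to the source folder keeps the archive independent of the absolute path on the server, which makes a restore to a different location simpler.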

Below is an example of how a Platform backup process can be organized. The example introduces the basic principles and will help you create your own backup implementation.

Environment

Naming and Rotation

A common approach is to organize backups in a two-cycle rotation: daily backup files are kept for the last seven days, older data is kept on a weekly basis, and we plan to keep three years of history in total.

To implement this approach, we run daily backups (at night, when our site is not in use) and store the resulting files (database and JCR files) on network storage in the following structure:

Files are named in the following format (using the ISO 8601 date format):

  • yyyy-MM-dd_mysql_jcrdb.tar.gz - for JCR database backup

  • yyyy-MM-dd_mysql_idmdb.tar.gz - for Organization service database backup

  • yyyy-MM-dd_jcr_values.tar.gz - for JCR value storage files backup

  • yyyy-MM-dd_jcr_index.tar.gz - for JCR index files backup

  • yyyy-MM-dd_jta.tar.gz - for Transaction service files backup
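The naming scheme above can be sketched as a small helper. The function and constant names are hypothetical; the name suffixes are taken from the list.

```python
from datetime import date

# Suffixes from the list above, in the same order.
ARCHIVE_KINDS = ("mysql_jcrdb", "mysql_idmdb", "jcr_values", "jcr_index", "jta")


def archive_names(day):
    """Return the five archive file names for the given date (ISO 8601 prefix)."""
    prefix = day.isoformat()  # yyyy-MM-dd
    return [f"{prefix}_{kind}.tar.gz" for kind in ARCHIVE_KINDS]
```

For example, archive_names(date(2011, 1, 9)) starts with 2011-01-09_mysql_jcrdb.tar.gz. Because the date prefix is in ISO 8601 order, sorting the file names alphabetically also sorts them chronologically.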

For the file backup we have a shell script running on the Platform server. The script performs the following steps:

  • Stops the Platform server (ensuring a full stop by checking the log)

  • Runs the database backup tool against jcrdb and stores the result in the archive /mnt/backupfs/my_plf_backup/current/yyyy-MM-dd_mysql_jcrdb.tar.gz

  • Runs the database backup tool against idmdb and stores the result in the archive /mnt/backupfs/my_plf_backup/current/yyyy-MM-dd_mysql_idmdb.tar.gz

  • Copies JCR value files to archive /mnt/backupfs/my_plf_backup/current/yyyy-MM-dd_jcr_values.tar.gz

  • Copies JCR index files to archive /mnt/backupfs/my_plf_backup/current/yyyy-MM-dd_jcr_index.tar.gz

  • Copies Transaction service files to archive /mnt/backupfs/my_plf_backup/current/yyyy-MM-dd_jta.tar.gz

  • On each Sunday, copies all 7-day-old archive files to a week folder, e.g. /my_plf_backup/current/weeks/02 for the backup on January 9, 2011.

  • Otherwise, deletes files older than 7 days from /my_plf_backup/current/

  • If it is the first week of a new year, creates a folder for the previous year in /my_plf_backup/, e.g. /my_plf_backup/2010, and moves the content of /my_plf_backup/current/weeks there.

  • Starts the Platform server

  • Sends mail to Admin in case of error on any step.
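The rotation decisions in the steps above can be sketched as a pure function that only computes what should happen on a given day, without touching the file system. This is an illustration under stated assumptions, not the script itself: the names are hypothetical, the week number uses the Sunday-based %U convention (which yields 02 for January 9, 2011, matching the example), and "first week of a new year" is taken to mean January 1 to 7.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from pathlib import PurePosixPath
from typing import Optional


@dataclass
class RotationPlan:
    weekly_folder: Optional[str]   # where Sunday's weekly copy goes
    delete_before: Optional[date]  # on other days: delete archives older than this
    year_folder: Optional[str]     # previous-year folder during the first week


def plan_rotation(today, root="/mnt/backupfs/my_plf_backup"):
    """Decide the rotation actions for one backup run."""
    base = PurePosixPath(root)
    weekly_folder = None
    delete_before = None
    if today.weekday() == 6:  # Sunday
        # Assumption: Sunday-based week number, zero-padded (e.g. "02").
        weekly_folder = str(base / "current" / "weeks" / today.strftime("%U"))
    else:
        delete_before = today - timedelta(days=7)
    year_folder = None
    if today.month == 1 and today.day <= 7:  # assumed meaning of "first week"
        year_folder = str(base / str(today.year - 1))
    return RotationPlan(weekly_folder, delete_before, year_folder)
```

Keeping the decision separate from the copy, move and delete operations makes the rotation rules easy to check against known dates before they run on real backup data. A real script would also need to make the yearly move idempotent, for example by skipping it when the year folder already exists.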

The restore procedure is planned as a manual operation.

Note

Example script implementation is outside of the scope of this guide.

The steps described above are based on a full backup of the data. There is also an incremental backup approach: an incremental backup preserves data by creating successive copies, each of which contains only the portion that has changed since the preceding copy was created.

Taking the steps described above into account, it is also possible to implement an incremental backup of Platform data.

Users of Unix platforms can use the rsync tool for file synchronization and implement an incremental backup of the JCR value and index files. Microsoft Windows users can use the Backup utility (Ntbackup.exe).

These tools can be used in conjunction with the incremental backup feature of your RDBMS to implement a Platform incremental backup solution. All backup targets described above must still be covered.
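To make the idea of an incremental file backup concrete, here is a minimal sketch that selects only the files modified since the previous backup, based on their modification time. It is not how rsync works internally and it does not detect deleted files; the function name and parameters are hypothetical.

```python
import os
from pathlib import Path


def changed_since(root, last_backup_ts):
    """Return relative paths of files under root modified after last_backup_ts.

    last_backup_ts is a POSIX timestamp (seconds), e.g. saved by the previous run.
    """
    root = Path(root)
    changed = []
    for dirpath, _dirnames, filenames in os.walk(str(root)):
        for name in filenames:
            path = Path(dirpath) / name
            if path.stat().st_mtime > last_backup_ts:
                changed.append(path.relative_to(root).as_posix())
    return sorted(changed)
```

The returned list could then be passed to an archiving step so that the daily archive contains only the changed JCR value and index files, while the weekly full backup remains the base for a restore.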

In the case of the example, it is possible to run a full backup weekly (every Sunday) and incremental backups on the other days of the week. An incremental backup is faster, so it reduces the daily maintenance time of your site.

Note

It is also possible to use a ready-made solution such as Bacula (bacula.org). Follow the product documentation for the implementation.