Creating Backups for ACM Services
=================================

The ACM systems are remarkably complex -- lots of moving parts everywhere, etc.
It is basically hopeless to stand the whole thing up again without guidance,
which is tragic, because in my (nwf) tenure here I have seen essentially three
complete reconstructions of the ACM computing environment.  Every time,
services get lost as institutional memory has graduated and gone off to the
Real World (TM).

So: the philosophy of the current round of systems architects, I think, is
that we are aiming for quick *restore*, rather than quick de novo
deployment.  Part of that effort is this very collection of documents that
you're now reading, and part of it is system-wide autonomic backups of
system configuration files.

Many of the more interesting machines back themselves up into AFS; run ``ls
/afs/acm.jhu.edu/service/*/backup.sh`` with your admin hat on to get an idea of
who's playing this game right now.  Typically, those scripts are simple rsyncs
into ``/afs/acm.jhu.edu/service/.../snapshot`` and invoked by the host's root
``crontab`` with ``timelimit`` and ``k5start``.  All of those volumes should be
replicated to archival partitions and then stashed into the archive. (See
:ref:`afs_long-term-archive`.)  So even if the world burns, most of the
configuration, even historical configuration, can be found in the archives.
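
For a scriptable version of that ``ls``, here is a minimal sketch that prints
the name of each service directory containing a ``backup.sh``.  The function
name ``list_backup_services`` and its optional root-directory argument are
illustrative assumptions, not part of any existing ACM tooling::

   #!/bin/sh
   # Hypothetical helper: print the name of every service directory under
   # the given root (default: the ACM service tree) that has a backup.sh.
   list_backup_services() {
       root=${1:-/afs/acm.jhu.edu/service}
       for script in "$root"/*/backup.sh; do
           # An unmatched glob stays literal, so skip anything that
           # is not actually a file.
           [ -f "$script" ] || continue
           basename "$(dirname "$script")"
       done
   }

Called with no argument, ``list_backup_services`` scans
``/afs/acm.jhu.edu/service``; as with the ``ls``, you will want your admin
hat on.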

Not everything is in the filesystem, of course (which is a bug in its own
right, but that's another battle); our databases, for example, are dumped
nightly to files, which are then replicated using AFS and, from there, dumped
into the archive.  It's a long path, but things get there!
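
This document does not say which database engines or dump tools are in use,
so the following is only a sketch of the "dump to a file" half of that path.
The helper name ``dump_to_file`` is hypothetical; it runs whatever dump
command you hand it and replaces the previous dump only if that command
succeeds, so a failed run never leaves a truncated file for rsync to copy::

   #!/bin/sh
   # Hypothetical helper: run a dump command and store its stdout in a
   # file, keeping the previous dump if the command fails.
   # usage: dump_to_file OUTPUT_FILE COMMAND [ARGS...]
   dump_to_file() {
       out=$1
       shift
       tmp="$out.tmp.$$"
       if "$@" > "$tmp"; then
           # Same directory, so the rename replaces the old dump in one step.
           mv -f "$tmp" "$out"
       else
           rm -f "$tmp"
           return 1
       fi
   }

The output file would then be listed among the paths that your ``backup.sh``
copies into the snapshot.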

.. warning:: PLEASE, *PLEASE*, **PLEASE**, if you have stood up a complex
   service, play this game!  Later sysadmins will love you for it.
   
Roughly, the steps necessary, while wearing your admin hat, are to:

* Generate a host keytab for your machine if it doesn't have one already, and
  place it in your machine's ``/etc/krb5.keytab``.

* Run ``/afs/acm.jhu.edu/group/admins.pub/scripts/new-afs-service-volume ${YOUR_SERVICE}``
  for your machine or service.  This script automagically prepares the volume
  for regular replication by the archive machine's automation!

* Grant your host's principal ``rlidw`` rights (read, lookup, insert, delete,
  write) on the service directory::

     fs sa /afs/acm.jhu.edu/service/${YOUR_SERVICE} rcmd.${YOUR_HOST} rlidw

  * Optionally, adjust the quota of your service volume.

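* Create a ``backup.sh`` file in your service directory along these lines
  (and mark it executable, since cron will invoke it directly)::

     #!/bin/sh

     # -r recurses, -l keeps symlinks as symlinks, -c compares by checksum;
     # --delete removes snapshot files once they vanish from a source
     # directory; --relative recreates the full source paths under snapshot.
     rsync -rl -c --delete --relative -vv \
             /etc/${IMPORTANT_FILE} \
             /var/${OTHER_IMPORTANT_DIRECTORY} \
             /afs/acm.jhu.edu/service/${YOUR_SERVICE}/snapshot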

* Add the backup to cron's automation, using ``crontab -e`` as root to insert
  ``@daily timelimit -q k5start -f /etc/krb5.keytab -U -t --
  /afs/acm.jhu.edu/service/${YOUR_SERVICE}/backup.sh``.

.. warning::

   Despite the possible presentation above, it is not possible to split
   commands across lines in crontab.  The whole thing should appear on one
   line!
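
To avoid retyping that long entry by hand, here is a small sketch that prints
it on a single line for a given service name.  The function name
``cron_entry`` is made up for illustration; the command it prints is the one
given in the step above::

   #!/bin/sh
   # Hypothetical helper: print the crontab entry for a service on ONE line.
   # usage: cron_entry SERVICE_NAME
   cron_entry() {
       service=$1
       printf '%s %s %s\n' \
           '@daily' \
           'timelimit -q k5start -f /etc/krb5.keytab -U -t --' \
           "/afs/acm.jhu.edu/service/$service/backup.sh"
   }

Running ``cron_entry ${YOUR_SERVICE}`` prints a line you can paste into
``crontab -e`` as is.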

If you need help or inspiration, Chicago and Magellan are likely the two
most elaborate setups to date.