Venti and AFS
=============

Overview
--------

On Chicago, we run a program called Venti for all our long-term archives.
It's a content-addressable block store, allowing for efficient storage of
slowly changing content such as home directories and service backups.  Or at
least, that's the theory.

In any case, the basic automation pipeline is something like this:

* On a relatively frequent schedule, AFS volumes are released to Chicago.

* On a less frequent schedule, these volumes are dumped (with ``vos dump``),
  processed, and archived.

The full procedure is in fact:

* For each volume on Chicago's archival partition, ``vos dump`` feeds data
  to a ``rabinsplit`` program, which generates a series of files that are
  then ``vac``-ed into Venti.

* The ``rabinsplit`` program helps us with deduplication by recovering from
  insertions into the dump files that are not a multiple of the block size
  (such an insertion would otherwise shift every later block).  See its
  source for details, but the net result is that we use it to produce a
  directory of files, ``x/00000000``, ``x/00000001``, etc., which, when
  concatenated in ascending name order, yield the stream produced by
  ``vos dump``.

* These directories are then fed to the Venti archival tool, ``vac``, in a
  way that causes it to produce 'archive files'.  These live in Chicago's
  ``/mnt/vicepa.dump`` directory and are simply pointers into the Venti
  store.

The automation is overseen by AFS BOS; Venti is overseen by runit so that
the entire AFS subsystem may be restarted without affecting Venti.

Looking at or restoring an archive file
---------------------------------------

The ``unvac`` tool can show us what's inside an archive file; find a
``.vac`` file and run
``/usr/local/plan9/bin/unvac -h "tcp\!localhost\!venti" -t $VACFILE``.
The contents will be files named ``YYYY/MMDD/NNNNNNNN``.  Pick the particular
``YYYY/MMDD`` you want and extract it with
``/usr/local/plan9/bin/unvac -h "tcp\!localhost\!venti" $VACFILE YYYY/MMDD``.
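
The restore step relies on the property mentioned in the overview: the
pieces, concatenated in name-sorted order, give back the original dump
stream.  To convince yourself of that without touching AFS or Venti, here is
a rough sketch.  It assumes GNU coreutils; ``split`` cuts at fixed offsets
and is only a stand-in for ``rabinsplit``, random bytes stand in for
``vos dump`` output, the date directory is arbitrary, and ``reassemble`` is a
name made up for this example ::

    #!/bin/sh
    # Sketch only: nothing here talks to AFS or Venti.
    set -eu

    # reassemble DIR: write the pieces under DIR to stdout in name order.
    # (Hypothetical helper; same find | sort | xargs shape as the restore.)
    reassemble() {
        find "$1" -type f | sort -n | xargs cat
    }

    work=$(mktemp -d)
    trap 'rm -rf "$work"' EXIT
    cd "$work"

    # Stand-in for a dump stream: 200 KiB of arbitrary bytes.
    head -c 204800 /dev/urandom > dump.orig

    # Cut it into zero-padded, numerically named 4 KiB pieces
    # (GNU split: -d numeric suffixes, -a 8 suffix width).
    mkdir -p 2024/0101
    split -b 4096 -d -a 8 dump.orig 2024/0101/

    # Reassemble as the restore would, minus the final "vos restore".
    reassemble 2024/0101 > dump.rebuilt

    # cmp exits non-zero if the two streams differ.
    cmp dump.orig dump.rebuilt && echo "round trip OK"

The zero-padded names are what make this work: every piece name has the
same width, so sorting the names also sorts the pieces into stream order.
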
Once the files are extracted, it's simply a matter of
``find YYYY/MMDD -type f | sort -n | xargs cat | vos restore ...`` to bring
things back.  (We use ``find | sort | xargs`` because an archive may contain
more files than fit within the maximum command line length; ``find`` is
handed the directory rather than a ``YYYY/MMDD/*`` glob for the same reason,
and ``xargs`` batches the names for ``cat``.)

Restoring from archive without nuking an existing volume
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If a user has asked for an emergency restore but does not want their home
directory clobbered, consider creating, mounting, and restoring a new volume
for them.  Something like ::

    vos create $SERVER $PARTITION recover.$USER
    fs mkm ~$USER/acmsys/recover recover.$USER.readonly
    find YYYY/MMDD -type f | sort -n | xargs cat | vos restore $SERVER $PARTITION recover.$USER -readonly

And then ``vos remove`` the recovery volume when the user has gotten their
files back.

Manually inserting a dump into the archive
------------------------------------------

The easiest thing to do is to ``vos release ${VOLUMENAME}``

.. todo:: And then what?