ZREP ZFS replication and failover

This is the home for "zrep", a robust yet easy-to-use, cross-platform, ZFS-based replication and failover solution. It can also serve as the conduit to create a simple backup hub.
It has been in development since 2012.


Downloads

Download zrep version 1.8.0, released in Nov 2019.
Alternatively, you can grab the latest development version directly from its GitHub project. (The 'latest' may or may not be a stable version, however!)

Note that the download is not an archive, but a single 2000+ line shellscript. Just chmod +x zrep, and you are ready to go!
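A minimal sketch of that (the install location here is just an example; any directory in your PATH will do):

	chmod +x zrep
	cp zrep /usr/local/bin/    # or anywhere in your PATH
	zrep -h                    # prints the usage summary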

Zrep has been reported to run on multiple OSes that provide ZFS, including Solaris, IllumOS, Linux, and BSD (including FreeNAS and Nas4Free).

Compatibility issues
I chose ksh as the interpreter because it has extra efficiencies with builtin functions, and also with user-written functions. Just be sure to run zrep with real ksh (ksh93), not an impostor such as pdksh, or it may not work properly.
Similarly, there may be a bug with Gentoo's "improved" ksh, which is a non-official patch. I have had a report that the standard 2012 ksh works, but the app-shells/ksh-93.20140625 Gentoo ksh may have a problem.
FreeNAS may work best using #!/usr/local/bin/ksh93 as the first line of zrep (or just symlink that binary to /bin/ksh!).
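One quick way to check whether your /bin/ksh is genuine ksh93: the ${.sh.version} variable only exists in real ksh93, so impostors will just complain about it. For example:

	ksh -c 'echo ${.sh.version}'
	# real ksh93 prints something like: Version AJM 93u+ 2012-08-01
	# pdksh/mksh will instead report a "bad substitution" style error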
If for some reason you want ksh source code, the best place seems to be
https://github.com/att/ast

License
The license for zrep is available here. The short summary is that you are free to use it as much as you like, as long as you don't sue me for anything that goes wrong :-)
If you are really bored, you may also read the CHANGELOG. For historians, some older versions are still available: zrep version 0.8.4 (Oct 17th, 2012) and zrep version 0.7 (June 29th, 2012).

What is zrep?

Zrep is an enterprise-grade, single-program solution for handling asynchronous, continuous replication of a ZFS filesystem to another filesystem. That destination filesystem can be on another machine, or on the same machine.

It also handles 'failover', as simply as "zrep failover datapool/yourfs". This will conveniently handle all the details of switching which side is the writable master and which is the read-only replica, so that replication can then continue in the opposite direction.

It also has user-friendly status outputs. For example:
# zrep status
scratch/datasrc                      synced as of Mon Mar 12 13:23 2012

Why use zrep over (z-other-thing)?

I originally looked around for a pre-existing program that safely handled replication and failover of a ZFS filesystem, but could not find one. Existing programs all seem to be some form of glorified rsync. They are not engineered to be run continuously, or safely. They often require a considerable amount of knowledge for a sysadmin to configure them well.

In contrast, zrep is designed to be:

  1. Safe (it has locking and other sanity checks)
  2. Easy to use
  3. Able to handle "failover" of the active filesystem
  4. Runnable frequently
zrep can safely be run in a loop, generating snap-and-replicate operations as frequently as 1-3 times a minute, and can keep doing so safely for many years.

Overview of use, aka Documentation

If you would like a detailed view of zrep use, I recommend that you read the zrep documentation page.

If you are in a hurry to just try zrep out, however, a super-trivialized version of how to use it would be:

	zrep init pool/fs desthost destpool/fs
	# (will create the destination fs!)
	# Initialize additional fs's with zrep if you wish. Then...
	while true; do zrep sync all; done
After the initial full sync, this will do incremental zfs sends, back to back, "forever". (or at least until you hit an error :)
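If back-to-back syncs are more frequent than you actually need, you can of course just add a pause in the loop (the 60-second interval below is arbitrary):

	while true; do
		zrep sync all
		sleep 60
	done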

For somewhat greater detail, please see the usage message, via "zrep -h".

The one "undocumented feature" you may care about, is that the property zrep:savecount controls the number of recent snapshots preserved. To change from the default (currently, 5), use

zfs set zrep:savecount=NEWVAL your/fs/here
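Since zrep:savecount is a normal ZFS user property, you can also inspect whatever value is currently set (if any) with:

	zfs get zrep:savecount your/fs/here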

There is also a separate troubleshooting page.

Caution

There is one thing to beware of: don't try to use zrep on nested filesystems without using the special recursive flag.
It's okay to use it on
/pool/fs/here
However, it is probably a bad idea to try it out of the box on BOTH of
/pool/fs/here
/pool/fs/here/too
If you really wish to sync a bunch of ZFS filesystems nested under a master filesystem, zrep does now support a recursive flag. See the documentation for more details.
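One easy way to spot whether a filesystem has child filesystems (and therefore needs the recursive handling) is a recursive listing:

	zfs list -r -t filesystem pool/fs/here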

Faster initialization and throughput

If you are planning to do a lot of "initial syncs", and your data is very large, you may be interested in looking at these patches to ssh. They give it larger TCP windows, and also allow you to disable encryption on the transfer, if that helps you:

http://www.psc.edu/networking/projects/hpn-ssh/

Some speed results, from local-host testing:
  using regular scp to regular sshd, got about 20MB/sec
  using regular scp to hpn-sshd, got about 30MB/sec
  using hpn-scp to hpn-sshd, got 150MB/sec

Additionally, zrep supports integrating with "mbuffer" or "bbcp", either of which can have a beneficial effect on transfer speeds.
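For context, the general shape of a buffered transfer looks something like the sketch below. (This is the generic zfs-send-through-mbuffer idiom, not necessarily the exact command line zrep constructs; the hostname and buffer sizes are just examples.)

	zfs send pool/fs@snap | mbuffer -s 128k -m 1G | \
	    ssh desthost 'mbuffer -s 128k -m 1G | zfs recv destpool/fs'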

It should be noted, however, that generating the data for a "zfs send" has speed limits of its own.
You may want to first time a plain "zfs send your@snapshot >/dev/null", to see whether optimizing your network throughput will actually make a significant difference.
Bottom line: unless you're sending from an SSD, and/or sending terabytes of data, it may be best to just stick with SSH.
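For example, to measure the raw send rate with the network taken out of the picture (the snapshot name is just a placeholder):

	# how fast can the pool generate send data, ignoring the network?
	time zfs send pool/fs@somesnapshot > /dev/null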

Source Code

zrep is a shellscript, so in essence, if you download it, you already have "the source". However, since it is a large script of 2000+ lines, I actually 'compile' it together from multiple modular files, in a manner similar to other programming languages. I even use 'make' to do so.

If you are a shellscript writer, this may interest you. Feel free to browse around the source directory, which I have now moved to a GitHub repository for zrep.


Written by: Philip Brown