Friday, March 13, 2009

Cygwin Rsync over SSH Hang - Solution

It is well known that running rsync over SSH to pull data from a Windows Cygwin host hangs, and that the only known workaround is to not use SSH pipes. That "no USE_PIPES" workaround is not a good one: it still fails sometimes, and its major drawback is transfer speed, which is about 30-40% lower than with a normal SSH pipe.

My solution addresses both security and transfer speed and there is no need to compile or modify any software whatsoever.

The idea is to use an SSH tunnel with port forwarding.
Let's say we have the following hosts:

Host_A: Windows Cygwin host, running an SSH server; it holds the data to be backed up
Host_B: Linux host; the backup server where the backup scripts run

We want to initiate a command on Host_B that will pull data from Host_A. Normally, you will do this by running:

Host_B# rsync --verbose --stats --progress --rsh=ssh Host_A: /backups/

That command will hang after a while. Here is how to do it with my workaround:

1) Install Rsync Daemon (as a Windows service) on Host_A:
Host_A# cygrunsrv.exe -I "Rsync" -p /cygdrive/c/cygwin/bin/rsync.exe -a "--config=/cygdrive/c/cygwin/etc/rsyncd.conf --daemon --no-detach" -f "Rsync daemon service"
Then start the service, since installing it does not start it:
Host_A# cygrunsrv.exe -S "Rsync"

2) Create /etc/rsyncd.conf on Host_A:
----------
use chroot = false
strict modes = false
address = 127.0.0.1

[data]
path = /cygdrive/c/
comment = data
----------
Note: I bind the daemon to 127.0.0.1 to make it more secure. No host other than Host_A itself can connect to the rsync server.
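
Since the setup relies on the daemon listening only on the loopback address, it can be worth confirming that on Host_A. The following is a minimal sketch and not part of the original procedure: loopback_only is a hypothetical helper name, and it assumes that netstat prints the local address as the first address:port column on its LISTEN lines, which holds for the common Windows and Linux outputs but should be checked against yours.

```shell
# Hypothetical helper: read netstat output on stdin and succeed only if
# the given TCP port is listening on 127.0.0.1 and on no other address.
loopback_only() {
    awk -v port="$1" '
        /LISTEN/ {
            for (i = 1; i <= NF; i++) {
                # Assumption: the first address:port field is the local address.
                if ($i ~ /:[0-9]+$/) {
                    if ($i == "127.0.0.1:" port) found = 1
                    else if ($i ~ (":" port "$")) exposed = 1
                    break
                }
            }
        }
        END { exit (found && !exposed) ? 0 : 1 }
    '
}
```

Usage on Host_A would look like: netstat -an | loopback_only 873 && echo "rsync daemon is loopback-only"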

3) Verify your rsync server
Host_A# rsync rsync://localhost/data/

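4) Let's create the SSH tunnel:
Host_B# ssh -L 1234:127.0.0.1:873 root@Host_A
It prompts for the Host_A password. Keep this session open and run the following commands from a second terminal on Host_B, because the tunnel exists only while the SSH session is connected.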

Check that port 1234 is bound on Host_B. If it is not, something went wrong with the previous command:

Host_B# netstat -lnp | grep 1234
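
If the tunnel is opened from a script rather than an interactive terminal, the port may not be bound the instant ssh returns, so the check above can be repeated for a few seconds. This is only a sketch under assumptions: retry and port_listening are hypothetical helper names, the one-second delay is arbitrary, and the ssh line in the comment uses the standard -f (go to background) and -N (no remote command) options, which the original steps do not use.

```shell
# Hypothetical helper: run a command up to N times, one second apart,
# and succeed as soon as the command succeeds.
retry() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep 1
    done
    return 1
}

# True if something on this host is listening on the given TCP port
# (same idea as the netstat | grep check in step 4).
port_listening() {
    netstat -ln 2>/dev/null | grep -Eq "[:.]$1[[:space:]].*LISTEN"
}

# Example, not executed here:
#   ssh -f -N -L 1234:127.0.0.1:873 root@Host_A
#   retry 10 port_listening 1234 || echo "tunnel did not come up" >&2
```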

5) Check if you can see the remote rsync server on local Host_B port 1234
Host_B# rsync rsync://localhost:1234/data/

6) Proceed with the backup command
Host_B# rsync --verbose --stats --progress --recursive rsync://localhost:1234/data/ /backups/

It will not hang and you will get all the data.

What's more, if you run rsync a second time on the same data, it will be much faster than running it the usual way with the --rsh=ssh option.
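
For a scheduled backup, steps 4 to 6 could be wrapped in one function that opens the tunnel, pulls the data, and closes the tunnel again. The sketch below is one possible shape, not the script I actually run: pull_backup, run, DRY_RUN, LOCAL_PORT, MODULE and the control-socket path are all illustrative names, and it assumes key-based SSH authentication is already set up so that no password prompt appears. The ssh options used (-f, -N, -M, -S, -O exit, ExitOnForwardFailure) are standard OpenSSH options.

```shell
#!/bin/sh
# Illustrative defaults matching the port and module used in this post.
LOCAL_PORT=${LOCAL_PORT:-1234}
MODULE=${MODULE:-data}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# pull_backup <user@host> <destination directory>
pull_backup() {
    remote=$1
    dest=$2
    sock="/tmp/rsync-tunnel-$$.sock"   # hypothetical control-socket path

    # Open the tunnel in the background as a master connection, so it
    # can be closed cleanly later through the control socket.
    run ssh -f -N -M -S "$sock" -o ExitOnForwardFailure=yes \
        -L "$LOCAL_PORT:127.0.0.1:873" "$remote" || return 1

    # Same rsync command as in step 6, talking to the forwarded port.
    run rsync --verbose --stats --progress --recursive \
        "rsync://localhost:$LOCAL_PORT/$MODULE/" "$dest"
    rc=$?

    # Ask the master connection to exit, which removes the forwarded port.
    run ssh -S "$sock" -O exit "$remote"
    return "$rc"
}
```

Running DRY_RUN=1 pull_backup root@Host_A /backups/ prints the three commands without touching the network, which is a cheap way to review them before the first real run.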

Still have problems? I don't think you will, but if so, please let me know.