rsync without retransmitting moved files

i use rsync a lot, both at work [ backups, replication of content to various servers, ad-hoc copying ] and privately. it’s smart enough to avoid re-sending a whole file if it has only grown a bit [ as logs tend to do ] or if only a few bytes have changed at the source or the destination.

out-of-the-box rsync is not smart enough to detect that a given file was moved around within a directory structure. that happens a lot when i’m re-organizing my collection of photos:


root@srcserver:/tmp/test# ls -la
total 5420
drwxr-xr-x  2 root root    4096 Mar  2 09:30 .
drwxrwxrwt 16 root root    4096 Mar  2 09:30 ..
-rw-r--r--  1 root root  966246 Sep  9  2010 IMG_3185.jpg
-rw-r--r--  1 root root 1191165 Sep  9  2010 IMG_3318.jpg
-rw-r--r--  1 root root 2090196 Sep  9  2010 IMG_3343.jpg
-rw-r--r--  1 root root 1287526 Sep  9  2010 IMG_3369.jpg

root@dstserver:/tmp# rsync -av --progress root@srcserver:/tmp/test ./
receiving incremental file list
test/
test/IMG_3185.jpg
        966,246 100%    4.54MB/s    0:00:00 (xfr#1, to-chk=3/5)
test/IMG_3318.jpg
      1,191,165 100%    4.44MB/s    0:00:00 (xfr#2, to-chk=2/5)
test/IMG_3343.jpg
      2,090,196 100%    6.11MB/s    0:00:00 (xfr#3, to-chk=1/5)
test/IMG_3369.jpg
      1,287,526 100%    3.30MB/s    0:00:00 (xfr#4, to-chk=0/5)

sent 104 bytes  received 5,536,784 bytes  3,691,258.67 bytes/sec
total size is 5,535,133  speedup is 1.00

root@srcserver:/tmp/test# mkdir somedir
root@srcserver:/tmp/test# mv IMG_3369.jpg somedir/

root@dstserver:/tmp# rsync --delete-after -av --progress root@srcserver:/tmp/test ./
receiving file list ... 6 files to consider
test/
test/somedir/
test/somedir/IMG_3369.jpg
      1,287,526 100%    5.39MB/s    0:00:00 (xfr#1, to-chk=0/6)
 0 files...
deleting test/IMG_3369.jpg

sent 48 bytes  received 1,288,047 bytes  515,238.00 bytes/sec
total size is 5,535,133  speedup is 4.30

that’s not good – we had to re-transmit 1.2MB of content that was already present at the destination.
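
to put a number on that waste without reading the progress output, rsync's --stats summary can be parsed. below is a minimal sketch, not something from my original test: transferred_bytes is a hypothetical helper name, and it syncs the contents of one directory into another [ note the trailing slashes, unlike the transcript above ]. it assumes the summary contains a "Total transferred file size" line, which is what the rsync 3.x versions i know print.

```shell
#!/bin/sh
# Hypothetical helper: print how many bytes of file data rsync transferred
# while syncing the contents of SRC_DIR into DST_DIR.
transferred_bytes() (
    src=$1
    dst=$2
    # --stats prints a summary including "Total transferred file size: N bytes".
    # Newer rsync versions add thousands separators, so strip all non-digits.
    rsync -a --delete-after --stats "$src/" "$dst/" |
        awk -F': ' '/^Total transferred file size/ { gsub(/[^0-9]/, "", $2); print $2 }'
)
```

calling transferred_bytes /tmp/test /some/copy before and after moving a file shows whether the move caused the file to be sent again.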

rsync has a --fuzzy parameter, but it does not solve this issue, because it only looks in the same directory as the destination file: "This option tells rsync that it should look for a basis file for any destination file that is missing. The current algorithm looks in the same directory as the destination file for either a file that has an identical size and modified-time, or a similarly-named file. If found, rsync uses the fuzzy basis file to try to speed up the transfer." [ Debian’s man page for rsync 3.2.7 ].

this post explains in more detail how it’s implemented.

but.. don’t give up, there are a few options:

i could not wrap my head around the first two, but the last did the trick and worked with the latest rsync.


wget https://www.samba.org/ftp/rsync/rsync-3.4.1.tar.gz https://www.samba.org/ftp/rsync/rsync-patches-3.4.1.tar.gz
tar -xvf rsync-3.4.1.tar.gz
tar -xvf rsync-patches-3.4.1.tar.gz

cd rsync-3.4.1

patch -p1 -N < patches/detect-renamed.diff

patch -p1 -N < patches/detect-renamed-lax.diff

./configure && make

as a result we get an rsync with an extra flag, --detect-moved, which notices that a file already present at the destination has merely been moved, and does not retransmit it from the source if the size and file name [ but not the file location ] match.
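
to make the matching rule concrete, here is a rough illustration of the idea in plain shell. it is not the patch's code: find_moved_candidate is a made-up helper, and it only checks the two criteria mentioned above [ file name and size ]; the real patch may well compare more than that.

```shell
#!/bin/sh
# Illustration only: look under DST_ROOT for a file whose basename and size
# match SRC_FILE, i.e. a candidate that was probably just moved.
find_moved_candidate() (
    src_file=$1
    dst_root=$2
    name=$(basename "$src_file")
    # wc -c may pad its output with spaces; arithmetic expansion normalizes it.
    size=$(( $(wc -c < "$src_file") ))
    # -size Nc matches files of exactly N bytes. -name treats glob characters
    # in the name as a pattern, which is good enough for a sketch.
    find "$dst_root" -type f -name "$name" -size "${size}c" | head -n 1
)
```

if the helper prints a path, that file could be reused locally instead of being fetched again; if it prints nothing, a normal transfer is needed.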

let’s try the earlier scenario with the patched rsync:

root@srcserver:/tmp/test# mv IMG_3318.jpg  somedir/

root@dstserver:/tmp# /tmp/src/rsync-3.4.1/rsync --delete-after --detect-moved -av --progress root@srcserver:/tmp/test ./
receiving file list ... 6 files to consider
test/
test/somedir/
test/somedir/IMG_3318.jpg
 0 files...
deleting test/IMG_3318.jpg

sent 46 bytes  received 172 bytes  145.33 bytes/sec
total size is 5,535,133  speedup is 25,390.52

this time rsync sent much less data: 172 bytes received instead of over 1.2MB. also, it was enough to have the patched rsync on the destination server, the machine i was pulling the data to; srcserver had Debian’s standard rsync.
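
since only one side needs the patched binary, it helps to be able to tell which rsync on a machine actually has the flag. a small sketch, assuming the patched build lists the option in its --help output [ i have not verified that for every version ]; rsync_has_option is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given rsync binary mentions OPTION
# in its --help text.
rsync_has_option() (
    bin=$1
    opt=$2
    # -F matches the option literally; -e keeps a leading "--" from being
    # parsed as a grep option.
    "$bin" --help 2>&1 | grep -qF -e "$opt"
)
```

for example: rsync_has_option /tmp/src/rsync-3.4.1/rsync --detect-moved && echo "patched build"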
