{"id":3710,"date":"2025-03-02T10:54:58","date_gmt":"2025-03-02T09:54:58","guid":{"rendered":"https:\/\/kudzia.eu\/b\/?p=3710"},"modified":"2025-03-02T10:55:49","modified_gmt":"2025-03-02T09:55:49","slug":"rsync-without-retransmitting-moved-files","status":"publish","type":"post","link":"https:\/\/kudzia.eu\/b\/2025\/03\/rsync-without-retransmitting-moved-files\/","title":{"rendered":"rsync without retransmitting moved files"},"content":{"rendered":"\n<p>i&#8217;m using <em>rsync<\/em> a lot, both at work [ backups, replication of content to various servers, ad-hoc copying ] and privately. it&#8217;s smart enough to avoid re-sending the whole file if it has grown a bit [ like logs like to do ] or only a few bytes changed in source or destination.<\/p>\n\n\n\n<p>out-of-the-box rsync is not smart enough to detect that a given file was moved around in a directory structure. that happens a lot when i&#8217;m re-organizing my collection of photos:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\nroot@srcserver:\/tmp\/test# ls -la\ntotal 5420\ndrwxr-xr-x  2 root root    4096 Mar  2 09:30 .\ndrwxrwxrwt 16 root root    4096 Mar  2 09:30 ..\n-rw-r--r--  1 root root  966246 Sep  9  2010 IMG_3185.jpg\n-rw-r--r--  1 root root 1191165 Sep  9  2010 IMG_3318.jpg\n-rw-r--r--  1 root root 2090196 Sep  9  2010 IMG_3343.jpg\n-rw-r--r--  1 root root 1287526 Sep  9  2010 IMG_3369.jpg\n\nroot@dstserver:\/tmp# rsync -av --progress root@srcserver:\/tmp\/test .\/\nreceiving incremental file list\ntest\/\ntest\/IMG_3185.jpg\n        966,246 100%    4.54MB\/s    0:00:00 (xfr#1, to-chk=3\/5)\ntest\/IMG_3318.jpg\n      1,191,165 100%    4.44MB\/s    0:00:00 (xfr#2, to-chk=2\/5)\ntest\/IMG_3343.jpg\n      2,090,196 100%    6.11MB\/s    0:00:00 (xfr#3, to-chk=1\/5)\ntest\/IMG_3369.jpg\n      1,287,526 100%    3.30MB\/s    0:00:00 (xfr#4, to-chk=0\/5)\n\nsent 104 bytes  received 5,536,784 bytes  3,691,258.67 bytes\/sec\ntotal size is 5,535,133  speedup is 1.00\n\nroot@srcserver:\/tmp\/test# mkdir 
somedir\nroot@srcserver:\/tmp\/test# mv IMG_3369.jpg somedir\/\n\nroot@dstserver:\/tmp# rsync --delete-after -av --progress root@srcserver:\/tmp\/test .\/\nreceiving file list ... 6 files to consider\ntest\/\ntest\/somedir\/\ntest\/somedir\/IMG_3369.jpg\n      1,287,526 100%    5.39MB\/s    0:00:00 (xfr#1, to-chk=0\/6)\n 0 files...\ndeleting test\/IMG_3369.jpg\n\nsent 48 bytes  received 1,288,047 bytes  515,238.00 bytes\/sec\ntotal size is 5,535,133  speedup is 4.30<\/code><\/pre>\n\n\n\n<p>that&#8217;s not good &#8211; we had to re-transmit 1.2MB of content that was already present at the destination.<\/p>\n\n\n\n<p>rsync has a <em>--fuzzy<\/em> parameter, but it does not solve this issue: <em>This option tells rsync that it should look for a basis file for any destination file that is missing. The current algorithm looks in the same directory as the destination file for either a file that has an identical size and modified-time, or a similarly-named file. If found, rsync uses the fuzzy basis file to try to speed up the transfer. <\/em>[ Debian&#8217;s man page for rsync 3.2.7 ]. since it only looks in the same directory as the destination file, it cannot find a file that was moved to another directory.<\/p>\n\n\n\n<p><a href=\"https:\/\/cybso.de\/blog\/2013-12\/rsyncs-fuzzy-parameter-what-similar-named-file\/\">this<\/a> post explains in more detail how it&#8217;s implemented.<\/p>\n\n\n\n<p>but.. 
don&#8217;t give up &#8211; there are a few options:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/github.com\/m-manu\/rsync-sidekick\">https:\/\/github.com\/m-manu\/rsync-sidekick<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/dparoli\/hrsync\">https:\/\/github.com\/dparoli\/hrsync<\/a> based on <a href=\"https:\/\/lincolnloop.com\/insights\/detecting-file-moves-renames-rsync\/\">https:\/\/lincolnloop.com\/insights\/detecting-file-moves-renames-rsync\/<\/a><\/li>\n\n\n\n<li>related discussions: <a href=\"https:\/\/serverfault.com\/questions\/489289\/handling-renamed-files-or-directories-in-rsync\">https:\/\/serverfault.com\/questions\/489289\/handling-renamed-files-or-directories-in-rsync<\/a><\/li>\n\n\n\n<li>detect-renamed + detect-renamed-lax patches<\/li>\n<\/ul>\n\n\n\n<p>i could not wrap my head around the first two, but the last did the trick and worked with the latest rsync.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><br>wget https:\/\/www.samba.org\/ftp\/rsync\/rsync-3.4.1.tar.gz https:\/\/www.samba.org\/ftp\/rsync\/rsync-patches-3.4.1.tar.gz<br>tar -xvf rsync-3.4.1.tar.gz <br>tar -xvf rsync-patches-3.4.1.tar.gz<br><br>cd rsync-3.4.1<br><br>patch -p1 -N &lt; patches\/detect-renamed.diff<br><br>patch -p1 -N &lt; patches\/detect-renamed-lax.diff<br><br>.\/configure &amp;&amp; make<\/code><\/pre>\n\n\n\n<p>in turn we get <em>rsync<\/em> with an extra flag &#8211; <em>--detect-moved<\/em> &#8211; which detects a moved file at the destination and does not retransmit it from the source if size, modification time and file name [ but not file location ] match. <\/p>\n\n\n\n<p>let&#8217;s try the earlier scenario with the patched rsync:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>root@srcserver:\/tmp\/test# mv IMG_3318.jpg  somedir\/\n\nroot@dstserver:\/tmp# \/tmp\/src\/rsync-3.4.1\/rsync --delete-after --detect-moved -av --progress root@srcserver:\/tmp\/test .\/\nreceiving file list ... 
6 files to consider\ntest\/\ntest\/somedir\/\ntest\/somedir\/IMG_3318.jpg\n 0 files...\ndeleting test\/IMG_3318.jpg\n\nsent 46 bytes  received 172 bytes  145.33 bytes\/sec\ntotal size is 5,535,133  speedup is 25,390.52\n<\/code><\/pre>\n\n\n\n<p>this time rsync has sent much less data! also &#8211; it was enough to have patched rsync at the destination server &#8211; machine where i was pulling the data; srcserver had Debian&#8217;s standard rsync.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>i&#8217;m using rsync a lot; both at work [ backups, replication of content to various servers, ad-hoc copying ] and privately. it&#8217;s smart enough to avoid re-sending the whole file if it has grown a bit [ like logs like to do ] or only few bytes changed in source or destination. out-of-the-box rsync is [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[17],"tags":[56],"class_list":["post-3710","post","type-post","status-publish","format-standard","hentry","category-tech","tag-rsync"],"_links":{"self":[{"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/posts\/3710","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/comments?post=3710"}],"version-history":[{"count":2,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/posts\/3710\/revisions"}],"predecessor-version":[{"id":3712,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/posts\/3710\/revisions\/3712"}],"wp:attachment":[{"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/media?parent=3710"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kudzia.eu\/b\/wp-js
on\/wp\/v2\/categories?post=3710"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/tags?post=3710"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}