{"id":3352,"date":"2022-05-22T16:03:19","date_gmt":"2022-05-22T15:03:19","guid":{"rendered":"https:\/\/kudzia.eu\/b\/?p=3352"},"modified":"2022-05-22T16:59:00","modified_gmt":"2022-05-22T15:59:00","slug":"mysqldump-headaches","status":"publish","type":"post","link":"https:\/\/kudzia.eu\/b\/2022\/05\/mysqldump-headaches\/","title":{"rendered":"mysqldump headaches"},"content":{"rendered":"\n<p>over the years i&#8217;ve set up multiple backup workflows. one of them is described <a href=\"https:\/\/kudzia.eu\/b\/2013\/05\/somewhat-byzantine-backup-setup\/\">here<\/a>. part of each cycle involves taking database backups. for MySQL i&#8217;m mostly using mysqldump. it comes with some headaches:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>it&#8217;s single-threaded by nature &#8211; both for backup and restore [ unless you chop the dump file into pieces and try to feed it into mysql in parallel ]. i&#8217;m getting backup speeds of ~50-100MB\/s and restore speeds of 5-20MB\/s. this becomes a problem if you need to restore a database server with more than a few GBs of data,<\/li><li>it leads to random lock-ups of update statements, despite using <em>--skip-lock-tables --single-transaction<\/em> switches. i did not find good documentation of that behavior, but i&#8217;ve read that others have experienced a similar problem.<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">how about physical backup?<\/h2>\n\n\n\n<p>moving from logical backups to physical backups is one option. 
i could use:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><a href=\"https:\/\/www.percona.com\/software\/mysql-database\/percona-xtrabackup\">xtrabackup<\/a> from Percona &#8211; for both Oracle&#8217;s MySQLs and MariaDBs,<\/li><li><a href=\"https:\/\/mariadb.com\/kb\/en\/mariabackup-overview\/\">mariabackup<\/a> &#8211; for MariaDB,<\/li><li>file-system-level snapshots taken on replication slaves; something i&#8217;m already doing for the most important databases, using BTRFS and <a href=\"https:\/\/github.com\/jf647\/btrfs-snap\">btrfs-snap<\/a>.<\/li><\/ul>\n\n\n\n<p>but there&#8217;s value in logical backups &#8211; partial recovery of a specific table or row is usually quite easy and can be done with cat + grep, maybe awk. such a backup is usually not too fussy about restoring to a slightly different server version; if traditional data types are in use and there are no fancy indices, it&#8217;s possible to restore from MariaDB to Oracle&#8217;s MySQL and vice versa.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">mydumper<\/h2>\n\n\n\n<p><a href=\"https:\/\/github.com\/mydumper\/mydumper\">mydumper<\/a> sounds interesting; while it&#8217;s an independent open source project, Percona&#8217;s long-term employees are behind it &#8211; a good sign.<\/p>\n\n\n\n<p>i gave it a try using version 0.12.3-2 and was surprised that streaming backups &#8211; officially released 4 months earlier &#8211; were throwing a lot of errors:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nroot@szctsql0.i:\/tmp\/b# mydumper --defaults-file=\/etc\/mysql\/debian.cnf --database mysql --tables-list mysql.user --rows 1000000  --trx-consistency-only  --threads 12 --compress --stream &gt; \/dev\/null\n\n** (mydumper:1349422): WARNING **: 20:01:47.789: loading mydumper: Key file does not have group \u201cmydumper\u201d\n\n** (mydumper:1349422): WARNING **: 20:01:47.789: Using 
trx_consistency_only, binlog coordinates will not be accurate if you are writing to non transactional tables.\n\n** (mydumper:1349422): CRITICAL **: 20:01:47.802: Data transmited for export-20220518-200147\/mysql-schema-create.sql.zst doesn&#039;t match. File size: 0 Transmited: 101\n\n** (mydumper:1349422): CRITICAL **: 20:01:47.804: Data transmited for export-20220518-200147\/mysql.user-schema.sql.zst doesn&#039;t match. File size: 0 Transmited: 745\n\n** (mydumper:1349422): CRITICAL **: 20:01:47.804: Data transmited for export-20220518-200147\/mysql.user.00000.sql.zst doesn&#039;t match. File size: 0 Transmited: 577\n\n** (mydumper:1349422): CRITICAL **: 20:01:47.806: Data transmited for export-20220518-200147\/metadata doesn&#039;t match. File size: 0 Transmited: 75\n\n<\/pre><\/div>\n\n\n<p>&#8220;critical&#8221; errors are not something you want to see in your backup routine. it looks like this issue was fixed in <a href=\"https:\/\/github.com\/mydumper\/mydumper\/releases\/tag\/v0.12.3-3\">v0.12.3-3<\/a>, but i&#8217;m still pondering how to migrate from mysqldump to mydumper without losing sleep wondering whether my backups are complete and uncorrupted.<\/p>\n\n\n\n<p>i had a chance to chat with a Percona consultant and asked him for generic backup advice; he was, understandably, very much on the safe side. 
his recommendations:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>do both logical and physical backups &#8211; the first to allow selective recoveries, the second to be able to restore quickly,<\/li><li>have an automated backup verification pipeline [ for now i cannot justify putting so many resources into it, so this remains a pipe dream ]:<ol><li>set up binlogs on the database server you want to back up, so it can act as a replication primary &#8211; even if you don&#8217;t need any replication,<\/li><li>take a consistent backup using a tool of your choice,<\/li><li>restore the backup to another server,<\/li><li>point this server at server #1 and start replication,<\/li><li>let it catch up to server #1 by replaying binary logs,<\/li><li>use <a rel=\"noreferrer noopener\" href=\"https:\/\/www.percona.com\/doc\/percona-toolkit\/LATEST\/pt-table-checksum.html\" target=\"_blank\">pt-table-checksum<\/a> &#8211; a tool to check replication consistency &#8211; to confirm that the replica and primary servers have identical data.<\/li><\/ol><\/li><\/ul>\n","protected":false},"excerpt":{"rendered":"<p>over the years i&#8217;ve set up multiple backup workflows. one of them is described here. part of each cycle involves taking database backups. for MySQL i&#8217;m mostly using mysqldump. 
it comes with some headaches: it&#8217;s single-threaded by nature &#8211; both for backup, and restore [ unless you chop the dump file into pieces and try [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-3352","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/posts\/3352","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/comments?post=3352"}],"version-history":[{"count":3,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/posts\/3352\/revisions"}],"predecessor-version":[{"id":3359,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/posts\/3352\/revisions\/3359"}],"wp:attachment":[{"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/media?parent=3352"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/categories?post=3352"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kudzia.eu\/b\/wp-json\/wp\/v2\/tags?post=3352"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}