qcow2 snapshots leading to data corruption in KVM + libvirt coming from Debian Trixie?

i don’t have a reproducible scenario, but i want to put this one out in the open. maybe someone else is experiencing a similar issue?

after upgrading both the OS on the physical servers and the guest OS in the VMs to Debian Trixie, i’ve experienced 4 database corruptions, under both MySQL 8.4 and PostgreSQL 17, on different physical servers. it’s something i have not seen in the past at this scale, in such a short period of time [ ~4 months ].

what i’ve seen in MySQL:

2025-12-14T20:31:28.026983Z 82383692 [ERROR] [MY-013183] [InnoDB] Assertion failure: btr0pcur.cc:339:cur_page == prev_of_next thread 140504756078272
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/8.4/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
2025-12-14T20:31:28Z UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
BuildID[sha1]=a63fbd129f41632f3d0a68a68702da454aaf928d
Thread pointer: 0x7fc0f4056250
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fc9d0133b70 thread_stack 0x100000
 #0 0x55db410d8ce1 <unknown>
 #1 0x55db410d928e <unknown>
 #2 0x55db4223d889 <unknown>
 #3 0x55db4249e884 <unknown>
 #4 0x55db424e7fe7 <unknown>
 #5 0x55db42411d77 <unknown>
 #6 0x55db4241b65e <unknown>
 #7 0x55db422afb4f <unknown>
 #8 0x55db411fa476 <unknown>
 #9 0x55db4134e0b0 <unknown>
 #10 0x55db4147a96c <unknown>
 #11 0x55db40fc6c5e <unknown>
 #12 0x55db40f621a0 <unknown>
 #13 0x55db40f65c2c <unknown>
 #14 0x55db40f6854d <unknown>
 #15 0x55db40f69125 <unknown>
 #16 0x55db410c8fb7 <unknown>
 #17 0x55db4298a833 <unknown>
 #18 0x7fcf2a46fb7a <unknown>
 #19 0x7fcf2a4ed7b7 <unknown>
 #20 0xffffffffffffffff <unknown>

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fc0f44a6ec0): /* ApplicationName=DBeaver 25.2.4 - SQLEditor <DBNAME.TABLENAME> */ delete  FROM DBNAME.TABLENAME where EntityUpdateType  = 3
Connection ID (thread ID): 82383692
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
2025-12-14T20:31:29.025290Z 0 [System] [MY-015015] [Server] MySQL Server - start.

another case in MySQL:

2026-01-05T10:36:56.139551Z 260899 [ERROR] [MY-012153] [InnoDB] Trying to access page number 4264257694 in space 248603, space name DBNAME/TABLENAME, which is outside the tablespace bounds. Byte offset 0, len 16384, i/o type read. If you get this error at mysqld startup, please check that your my.cnf matches the ibdata files that you have in the MySQL server.
2026-01-05T10:36:56.139623Z 260899 [ERROR] [MY-012154] [InnoDB] Server exits.
2026-01-05T10:36:56.139633Z 260899 [ERROR] [MY-013183] [InnoDB] Assertion failure: fil0fil.cc:7541 thread 140658412197568
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/8.4/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
2026-01-05T10:36:56Z UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
BuildID[sha1]=a63fbd129f41632f3d0a68a68702da454aaf928d
Thread pointer: 0x7fe65c018480
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fed96b18b70 thread_stack 0x100000
 #0 0x55d918950ce1 <unknown>
 #1 0x55d91895128e <unknown>
 #2 0x55d919ab5889 <unknown>
 #3 0x55d919d16884 <unknown>
 #4 0x55d919e7acdd <unknown>
 #5 0x55d919e98975 <unknown>
 #6 0x55d919e989fd <unknown>
 #7 0x55d919db983b <unknown>
 #8 0x55d919db9c35 <unknown>
 #9 0x55d919d707dd <unknown>
 #10 0x55d919d7ced7 <unknown>
 #11 0x55d919d7cfc0 <unknown>
 #12 0x55d919d7de51 <unknown>
 #13 0x55d919f00732 <unknown>
 #14 0x55d919f010c3 <unknown>
 #15 0x55d919b81290 <unknown>
 #16 0x55d919c8c10e <unknown>
 #17 0x55d919c8cb3f <unknown>
 #18 0x55d919c95c37 <unknown>
 #19 0x55d919b27b4f <unknown>
 #20 0x55d918a72361 <unknown>
 #21 0x55d918bc60b0 <unknown>
 #22 0x55d9188b4baf <unknown>
 #23 0x55d9188b4f3b <unknown>
 #24 0x55d918833dba <unknown>
 #25 0x55d91883ec5e <unknown>
 #26 0x55d9187da1a0 <unknown>
 #27 0x55d9187ddc2c <unknown>
 #28 0x55d9187e054d <unknown>
 #29 0x55d9187e1125 <unknown>
 #30 0x55d918940fb7 <unknown>
 #31 0x55d91a202833 <unknown>
 #32 0x7ff05629cb7a start_thread at ./nptl/pthread_create.c:448
 #33 0x7ff05631a7b7 __clone3 at sysdeps/unix/sysv/linux/x86_64/clone3.S:78
 #34 0xffffffffffffffff <unknown>

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fe65c3985b0): SELECT /*!40001 SQL_NO_CACHE */ * FROM `TABLENAME`
Connection ID (thread ID): 260899
Status: NOT_KILLED

in PostgreSQL:

2025-12-14 22:56:43.054 UTC [1441945] postgres@application_cache ERROR:  unexpected chunk number 2 (expected 0) for toast value 294783547 in pg_toast_82860713
2025-12-14 22:56:43.054 UTC [1441945] postgres@application_cache STATEMENT:  COPY public.generic_storage (id, content_type, model_type, content_updated_date, stored_date, content_data, data_source) TO stdout;
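
a side note, in case it helps anyone hitting the same TOAST error: the owning table can be looked up from the system catalogs. the query below is a generic catalog lookup (database name taken from the log above), not anything specific to my setup:

  # map the corrupted TOAST table back to the table that owns it;
  # pg_toast_<oid> is normally named after the owning table's OID,
  # but the catalog lookup works regardless of the naming
  psql -d application_cache -c "
    SELECT c.oid::regclass AS owning_table
    FROM   pg_class c
    WHERE  c.reltoastrelid = 'pg_toast.pg_toast_82860713'::regclass;"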

what do both servers have in common? both are running as KVM VMs, and i’m taking VM-level snapshots of both by running the following on the virtualization host (a consolidated sketch of the whole cycle follows the list):

  • virsh snapshot-create-as --domain VMNAME --name "kvmsnap-VMNAME" --no-metadata --atomic --disk-only --diskspec vdX,snapshot=external
  • running rsync to copy the now-static base vdX.qcow2 to another server
  • virsh blockcommit VMNAME vdX --active --pivot --verbose
  • deleting the leftover snapshot overlay files from the directories where the qcow2 files for the VMs are kept
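
roughly, the whole cycle looks like the sketch below. VMNAME, vdX, the image directory and the backup host are placeholders, and i’ve added an explicit file= to the diskspec just to make the overlay name deterministic; it’s the shape of what i run, not the exact script:

  #!/usr/bin/env bash
  # rough sketch of the VM-level backup cycle described above;
  # VMNAME, vdX and all paths/hosts are placeholders, not real values
  set -euo pipefail

  VM="VMNAME"
  DISK="vdX"
  IMG_DIR="/var/lib/libvirt/images"             # where the base qcow2 lives
  SNAP="kvmsnap-${VM}"
  OVERLAY="${IMG_DIR}/${DISK}.${SNAP}.qcow2"    # explicit overlay path
  BACKUP_HOST="backup.example.com"              # hypothetical destination

  # 1. external, disk-only snapshot: guest writes are redirected to the overlay
  virsh snapshot-create-as --domain "$VM" --name "$SNAP" \
      --no-metadata --atomic --disk-only \
      --diskspec "${DISK},snapshot=external,file=${OVERLAY}"

  # 2. copy the now-quiescent base image to another server
  rsync -a "${IMG_DIR}/${DISK}.qcow2" "${BACKUP_HOST}:/backups/${VM}/"

  # 3. merge the overlay back into the base image and pivot the VM onto it
  virsh blockcommit "$VM" "$DISK" --active --pivot --verbose

  # 4. remove the leftover overlay file
  rm -f "$OVERLAY"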

all the corruptions i’ve experienced happened soon after a VM-level backup was executed; during the backups the VMs were under heavy read & write load.
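
one thing i’ve started doing after the incidents (purely diagnostic, paths as in the sketch above) is checking the qcow2 metadata and the backing chain right after the blockcommit/pivot completes:

  # verify qcow2 metadata after the pivot; "errors" reported here would point
  # at the image itself rather than at the guest filesystem or the database.
  # NOTE: an image attached to a running VM must be opened with --force-share
  # (-U), and any reported leaks are only advisory while the guest is writing
  qemu-img check -U /var/lib/libvirt/images/vdX.qcow2

  # confirm the chain collapsed back to a single image after the blockcommit
  qemu-img info -U --backing-chain /var/lib/libvirt/images/vdX.qcow2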

i’ve searched for similar cases but could not find anything relevant. the closest i could find is a mention of a bug fix present in libvirt 11:

qemu: set detect_zeroes for all backing chain layers

Some block jobs (snapshots, block commit) could modify the backing chain in a way where detect_zeroes would no longer be honoured. We now set it for all images in the backing chain, so that it will behave correctly even after those operations
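
for reference, this is roughly how to check what you’re running and whether detect_zeroes is configured on the disk driver at all; the driver line in the comment is only an example value, libvirt does not add it by default:

  # check the libvirt / qemu versions in use (the fix above is mentioned for libvirt 11)
  virsh version

  # see whether detect_zeroes is set on the disk driver of a given VM;
  # a configured driver element would look roughly like:
  #   <driver name='qemu' type='qcow2' discard='unmap' detect_zeroes='unmap'/>
  virsh dumpxml VMNAME | grep -i detect_zeroes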

if you’ve experienced a similar issue – how did you solve it?

for now, i’ve moved the snapshotting of the busiest servers to less loaded database replicas.
