Add option to skip zero pages during VM migration #117

Draft

arctic-alpaca wants to merge 3 commits into cyberus-technology:gardenlinux from arctic-alpaca:omit-zero-pages-2

Conversation

@arctic-alpaca commented Mar 23, 2026

Important

Drafted for now until we have a clearer picture of how to incorporate this into the stats tracking.

Improvement of #112.

A VM may have previously unused memory that is still zeroed (or memory zeroed by the guest, though that is less likely). This memory does not need to be transferred during a migration, as the migration destination provides zeroed memory to the VM anyway. This PR adds an option to skip zero pages during migration.

Zero-page skipping now scales with the number of connections because it is done in the connection threads. I think that is an intuitive way to scale without adding an extra parameter to configure the number of zero-page-scanning threads.
By moving the work into the connection threads, we also no longer require all memory to be scanned for zero pages before the first byte of memory is sent to the destination. Memory is scanned in chunks as it is passed to the connection threads.
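As a rough illustration of the per-chunk scan (not the PR's actual code; `PAGE_SIZE`, the function names, and the `(offset, length)` representation are assumptions), detecting and coalescing non-zero pages within a chunk could look like:

```rust
// Illustrative sketch only, assuming 4 KiB pages.
const PAGE_SIZE: usize = 4096;

/// Returns true if the page is entirely zero-filled.
fn page_is_zero(page: &[u8]) -> bool {
    page.iter().all(|&b| b == 0)
}

/// Splits `chunk` into the `(offset, length)` byte ranges that actually need
/// to be transferred, skipping zero pages and merging adjacent non-zero pages.
fn non_zero_ranges(chunk: &[u8]) -> Vec<(usize, usize)> {
    let mut ranges: Vec<(usize, usize)> = Vec::new();
    for (i, page) in chunk.chunks(PAGE_SIZE).enumerate() {
        if !page_is_zero(page) {
            let start = i * PAGE_SIZE;
            match ranges.last_mut() {
                // Extend the previous run if it ends where this page starts.
                Some((s, l)) if *s + *l == start => *l += page.len(),
                _ => ranges.push((start, page.len())),
            }
        }
    }
    ranges
}
```

Each connection thread can run such a scan on the chunk it is about to send, which is how the skipping naturally parallelizes with the connection count.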

All benchmarks were run between livemig-dellemc-2tb-1 and livemig-dellemc-2tb-2, and each setup was run only once, so expect some variation between runs.

| Setup | Migration time | % of no skip |
| --- | --- | --- |
| 32 GiB, 4 vCPU, no memtouch, no skip, 1 connection | 18666 ms | 100% |
| 32 GiB, 4 vCPU, no memtouch, with skip, 1 connection | 2297 ms | 12% |
| 1 TiB, 32 vCPU, no memtouch, no skip, 8 connections | 90462 ms | 100% |
| 1 TiB, 32 vCPU, no memtouch, with skip, 8 connections | 7855 ms | 9% |
| 32 GiB, 4 vCPU, with 50% memtouch, no skip, 1 connection | 16700 ms | 100% |
| 32 GiB, 4 vCPU, with 50% memtouch, with skip, 1 connection | 10991 ms | 66% |
| 1 TiB, 32 vCPU, with 50% memtouch, no skip, 8 connections | 95710 ms | 100% |
| 1 TiB, 32 vCPU, with 50% memtouch, with skip, 8 connections | 56691 ms | 59% |
| 32 GiB, 4 vCPU, with 100% memtouch, no skip, 1 connection | 18350 ms | 100% |
| 32 GiB, 4 vCPU, with 100% memtouch, with skip, 1 connection | 19221 ms | 105% |
| 1 TiB, 32 vCPU, with 100% memtouch, no skip, 8 connections | 114490 ms | 100% |
| 1 TiB, 32 vCPU, with 100% memtouch, with skip, 8 connections | 108835 ms | 95% |
| 32 GiB, 4 vCPU, with 100% memtouch, no skip, 2 connections | 11604 ms | 100% |
| 32 GiB, 4 vCPU, with 100% memtouch, with skip, 2 connections | 9708 ms | 84% |
Benchmark commands
  • 32GiB, 4 vCPU, no memtouch, no skip, 1 connection

    > cargo run --release --bin cloud-hypervisor -- --api-socket /tmp/jschindel_chv1.sock --kernel result/linux_6_19.bzImage --cmdline "console=ttyS0" --serial tty --console off --initramfs result/initrd_default --seccomp log -vv --memory size=32G --cpus boot=4
    > cargo run --release --bin ch-remote -- --api-socket /tmp/jschindel_chv1.sock send-migration tcp:192.168.123.2:7868 --downtime 200 --migration-timeout 12000
  • 32GiB, 4 vCPU, no memtouch, with skip, 1 connection

    > cargo run --release --bin cloud-hypervisor -- --api-socket /tmp/jschindel_chv1.sock --kernel result/linux_6_19.bzImage --cmdline "console=ttyS0" --serial tty --console off --initramfs result/initrd_default --seccomp log -vv --memory size=32G --cpus boot=4
    > cargo run --release --bin ch-remote -- --api-socket /tmp/jschindel_chv1.sock send-migration tcp:192.168.123.2:7868 --downtime 200 --migration-timeout 12000 --skip-zero-pages
  • 1TiB, 32vCPU, no memtouch, no skip, 8 connections

    > cargo run --release --bin cloud-hypervisor -- --api-socket /tmp/jschindel_chv1.sock --kernel result/linux_6_19.bzImage --cmdline "console=ttyS0" --serial tty --console off --initramfs result/initrd_default --seccomp log -vv --memory size=1024G --cpus boot=32
    > cargo run --release --bin ch-remote -- --api-socket /tmp/jschindel_chv1.sock send-migration tcp:192.168.123.2:7868 --downtime 200 --migration-timeout 12000 --connections 8
  • 1TiB, 32vCPU, no memtouch, with skip, 8 connections

    > cargo run --release --bin cloud-hypervisor -- --api-socket /tmp/jschindel_chv1.sock --kernel result/linux_6_19.bzImage --cmdline "console=ttyS0" --serial tty --console off --initramfs result/initrd_default --seccomp log -vv --memory size=1024G --cpus boot=32
    > cargo run --release --bin ch-remote -- --api-socket /tmp/jschindel_chv1.sock send-migration tcp:192.168.123.2:7868 --downtime 200 --migration-timeout 12000 --connections 8 --skip-zero-pages
  • 32GiB, 4 vCPU, with 50% memtouch, no skip, 1 connection

    > cargo run --release --bin cloud-hypervisor -- --api-socket /tmp/jschindel_chv1.sock --kernel result/linux_6_19.bzImage --cmdline "console=ttyS0" --serial tty --console off --initramfs result/initrd_default --seccomp log -vv --memory size=32G --cpus boot=4
    > cargo run --release --bin ch-remote -- --api-socket /tmp/jschindel_chv1.sock send-migration tcp:192.168.123.2:7868 --downtime 200 --migration-timeout 12000
    > memtouch --rw_ratio 100 --thread_mem 4096 --num_threads 4 --once
  • 32GiB, 4 vCPU, with 50% memtouch, with skip, 1 connection

    > cargo run --release --bin cloud-hypervisor -- --api-socket /tmp/jschindel_chv1.sock --kernel result/linux_6_19.bzImage --cmdline "console=ttyS0" --serial tty --console off --initramfs result/initrd_default --seccomp log -vv --memory size=32G --cpus boot=4
    > cargo run --release --bin ch-remote -- --api-socket /tmp/jschindel_chv1.sock send-migration tcp:192.168.123.2:7868 --downtime 200 --migration-timeout 12000 --skip-zero-pages
    > memtouch --rw_ratio 100 --thread_mem 4096 --num_threads 4 --once
  • 1TiB, 32vCPU, with 50% memtouch, no skip, 8 connections

    > cargo run --release --bin cloud-hypervisor -- --api-socket /tmp/jschindel_chv1.sock --kernel result/linux_6_19.bzImage --cmdline "console=ttyS0" --serial tty --console off --initramfs result/initrd_default --seccomp log -vv --memory size=1024G --cpus boot=32
    > cargo run --release --bin ch-remote -- --api-socket /tmp/jschindel_chv1.sock send-migration tcp:192.168.123.2:7868 --downtime 200 --migration-timeout 12000 --connections 8
    > memtouch --rw_ratio 100 --thread_mem 16192 --num_threads 32 --once
  • 1TiB, 32vCPU, with 50% memtouch, with skip, 8 connections

    > cargo run --release --bin cloud-hypervisor -- --api-socket /tmp/jschindel_chv1.sock --kernel result/linux_6_19.bzImage --cmdline "console=ttyS0" --serial tty --console off --initramfs result/initrd_default --seccomp log -vv --memory size=1024G --cpus boot=32
    > cargo run --release --bin ch-remote -- --api-socket /tmp/jschindel_chv1.sock send-migration tcp:192.168.123.2:7868 --downtime 200 --migration-timeout 12000 --connections 8 --skip-zero-pages
    > memtouch --rw_ratio 100 --thread_mem 16192 --num_threads 32 --once
  • 32GiB, 4 vCPU, with 100% memtouch, no skip, 1 connection

    > cargo run --release --bin cloud-hypervisor -- --api-socket /tmp/jschindel_chv1.sock --kernel result/linux_6_19.bzImage --cmdline "console=ttyS0" --serial tty --console off --initramfs result/initrd_default --seccomp log -vv --memory size=32G --cpus boot=4
    > cargo run --release --bin ch-remote -- --api-socket /tmp/jschindel_chv1.sock send-migration tcp:192.168.123.2:7868 --downtime 200 --migration-timeout 12000
    > memtouch --rw_ratio 100 --thread_mem 8128 --num_threads 4 --once
  • 32GiB, 4 vCPU, with 100% memtouch, with skip, 1 connection

    > cargo run --release --bin cloud-hypervisor -- --api-socket /tmp/jschindel_chv1.sock --kernel result/linux_6_19.bzImage --cmdline "console=ttyS0" --serial tty --console off --initramfs result/initrd_default --seccomp log -vv --memory size=32G --cpus boot=4
    > cargo run --release --bin ch-remote -- --api-socket /tmp/jschindel_chv1.sock send-migration tcp:192.168.123.2:7868 --downtime 200 --migration-timeout 12000 --skip-zero-pages
    > memtouch --rw_ratio 100 --thread_mem 8128 --num_threads 4 --once
  • 1TiB, 32vCPU, with 100% memtouch, no skip, 8 connections

    > cargo run --release --bin cloud-hypervisor -- --api-socket /tmp/jschindel_chv1.sock --kernel result/linux_6_19.bzImage --cmdline "console=ttyS0" --serial tty --console off --initramfs result/initrd_default --seccomp log -vv --memory size=1024G --cpus boot=32
    > cargo run --release --bin ch-remote -- --api-socket /tmp/jschindel_chv1.sock send-migration tcp:192.168.123.2:7868 --downtime 200 --migration-timeout 12000 --connections 8
    > memtouch --rw_ratio 100 --thread_mem 32512 --num_threads 32 --once
  • 1TiB, 32vCPU, with 100% memtouch, with skip, 8 connections

    > cargo run --release --bin cloud-hypervisor -- --api-socket /tmp/jschindel_chv1.sock --kernel result/linux_6_19.bzImage --cmdline "console=ttyS0" --serial tty --console off --initramfs result/initrd_default --seccomp log -vv --memory size=1024G --cpus boot=32
    > cargo run --release --bin ch-remote -- --api-socket /tmp/jschindel_chv1.sock send-migration tcp:192.168.123.2:7868 --downtime 200 --migration-timeout 12000 --connections 8 --skip-zero-pages
    > memtouch --rw_ratio 100 --thread_mem 32512 --num_threads 32 --once
  • 32GiB, 4 vCPU, 100% memtouch, no skip, 2 connections

    > cargo run --release --bin cloud-hypervisor -- --api-socket /tmp/jschindel_chv1.sock --kernel result/linux_6_19.bzImage --cmdline "console=ttyS0" --serial tty --console off --initramfs result/initrd_default --seccomp log -vv --memory size=32G --cpus boot=4
    > cargo run --release --bin ch-remote -- --api-socket /tmp/jschindel_chv1.sock send-migration tcp:192.168.123.2:7868 --downtime 200 --migration-timeout 12000 --connections 2
    > memtouch --rw_ratio 100 --thread_mem 8128 --num_threads 4 --once
  • 32GiB, 4 vCPU, 100% memtouch, with skip, 2 connections

    > cargo run --release --bin cloud-hypervisor -- --api-socket /tmp/jschindel_chv1.sock --kernel result/linux_6_19.bzImage --cmdline "console=ttyS0" --serial tty --console off --initramfs result/initrd_default --seccomp log -vv --memory size=32G --cpus boot=4
    > cargo run --release --bin ch-remote -- --api-socket /tmp/jschindel_chv1.sock send-migration tcp:192.168.123.2:7868 --downtime 200 --migration-timeout 12000 --connections 2 --skip-zero-pages
    > memtouch --rw_ratio 100 --thread_mem 8128 --num_threads 4 --once


@amphi left a comment


Did you check whether this still works with the live-migration statistics? `bytes_to_transmit` is calculated from the `iteration_table` before the zero pages are removed.

I think this could even break the downtime cutoff, because sending a lot of zero pages could lead to a very high `bytes_per_sec` value.

We should talk to @phip1611 about this.
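To make the worry concrete, here is a toy calculation (illustrative arithmetic only, not code from the PR): if the counter still includes skipped zero pages, the derived rate is inflated by the zero-page ratio.

```rust
// If `bytes_to_transmit` counts pages that were actually skipped, dividing it
// by wall-clock time yields an inflated rate.
fn naive_bytes_per_sec(bytes_counted: u64, elapsed_secs: f64) -> f64 {
    bytes_counted as f64 / elapsed_secs
}

// The rate of bytes that really crossed the wire.
fn actual_bytes_per_sec(bytes_sent: u64, elapsed_secs: f64) -> f64 {
    bytes_sent as f64 / elapsed_secs
}
```

For example, counting 32 GiB while only 4 GiB cross a 1 GiB/s link in 4 s yields a naive 8 GiB/s against an actual 1 GiB/s, so a downtime cutoff based on the naive value would fire far too early.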

Comment on lines +541 to +551

```rust
// Amount of bytes by which the gpa undershoots the page boundary.
let gpa_page_undershoot = {
    // Amount of bytes by which the gpa overshoots the page boundary.
    let offset = memory_range.gpa % page_size_u64;
    if offset > 0 {
        page_size_u64 - offset
    } else {
        0
    }
};
```
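A worked version of that computation (extracted into a free function for illustration; only the arithmetic is from the diff, the function signature is not), assuming a 4 KiB page size:

```rust
// Bytes by which `gpa` undershoots the next page boundary, i.e. how many
// bytes are missing until the range becomes page aligned.
fn gpa_page_undershoot(gpa: u64, page_size: u64) -> u64 {
    // Bytes by which `gpa` overshoots the previous page boundary.
    let offset = gpa % page_size;
    if offset > 0 {
        page_size - offset
    } else {
        0
    }
}
```

So a range starting at gpa 0x800 undershoots the 0x1000 boundary by 0x800 bytes, while a range starting exactly at 0x1000 has an undershoot of zero.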


Did you check whether this really happens? To me it would be very odd if those ranges were not page aligned.

Author

@arctic-alpaca commented Mar 24, 2026


The obvious example is the `MemoryRangeTableIterator`, which will create `MemoryRange`s that aren't page aligned if the chunk size isn't a multiple of the page size (which isn't enforced).

In general, I don't want to introduce a panic for something we can handle instead. Adding a warn-level log message might be a good idea, though.

Member


Without knowing all the details right now: how about a `debug_assert!`? This way, we catch issues in the libvirt-tests, see https://github.com/cyberus-technology/libvirt-tests/blob/059637c2128db9e4d0a37cf2c34f810ad6ce959b/flake.nix#L70

We don't ship Cloud Hypervisor with debug assertions to the customer, so it should be fine.

Author


How about just not checking the overshoot and undershoot for being zero-filled? We can simply cut them off and create a `MemoryRange` from them. This way, the zero-page checking is less complex and we don't panic.
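A sketch of that idea under illustrative names (not from the PR): split a range into an unaligned head, a page-aligned middle, and an unaligned tail, where only the middle would be zero-checked and the head and tail are emitted unconditionally.

```rust
// Hypothetical helper; name and signature are illustrative, not from the PR.
// Returns `(head_len, aligned_len, tail_len)` with
// `head_len + aligned_len + tail_len == length`.
fn split_range(gpa: u64, length: u64, page_size: u64) -> (u64, u64, u64) {
    // Bytes before the first page boundary inside the range.
    let head = ((page_size - gpa % page_size) % page_size).min(length);
    let rest = length - head;
    // Bytes after the last page boundary inside the range.
    let tail = rest % page_size;
    let aligned = rest - tail;
    (head, aligned, tail)
}
```

The head and tail are at most one page each, so the extra data sent is bounded while the zero-check logic only ever sees page-aligned, page-sized memory.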

Comment on lines +568 to +596

```rust
if gpa_page_undershoot != 0
    && !guest_memory_is_equal(
        current_gpa,
        &ZERO_PAGE[..gpa_page_undershoot as usize],
        guest_memory,
    )?
{
    current_length += gpa_page_undershoot;
}

for page_start in (0..page_amount)
    .map(|page_index| page_index * page_size_u64 + first_page_boundary)
{
    // If the current page is zero, we push all previous non-zero pages to
    // `processed_data` and set `current_gpa` to the end of the zero page while
    // resetting the length.
    if guest_memory_is_equal(page_start, &ZERO_PAGE, guest_memory)? {
        if current_length != 0 {
            processed_data.push(MemoryRange {
                gpa: current_gpa,
                length: current_length,
            });
        }
        current_gpa += current_length + page_size_u64;
        current_length = 0;
    } else {
        current_length += page_size_u64;
    }
}
```

Let's assume we have a range with:

  • 2kB zeroes,
  • 4kB dirty,
  • 4kB zeroes,

with `current_gpa = 0x800` (2048) and `current_length = 0`.

| Step | `current_gpa` | `current_length` |
| --- | --- | --- |
| 1 | 0x800 | 0 |
| 2 | 0x800 | 4096 |
| 3 | 0x1800 | 0 |

Where

  • step 1: Handling of unaligned memory
  • step 2: Handling of first aligned page
  • step 3: Handling of second aligned page

After step three, you push `gpa: 0x800` and `length: 4096` into `processed_data`, which is wrong, because you should push `gpa: 0x1000` and `length: 4096`.

Or am I missing something here?

I think your tests do not cover this case.
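For reference, a simplified, hypothetical reformulation (names and signature are not from the PR) that records the start of each non-zero run explicitly avoids deriving the push position from `current_gpa` arithmetic; for the aligned part of the scenario above it yields the expected `(0x1000, 4096)`.

```rust
// `pages[i] == true` means page i is zero-filled; pages start at
// `first_page_gpa` and are `page_size` bytes each. Returns `(gpa, length)`
// pairs covering the non-zero runs.
fn non_zero_page_ranges(first_page_gpa: u64, pages: &[bool], page_size: u64) -> Vec<(u64, u64)> {
    let mut out = Vec::new();
    let mut run_start: Option<u64> = None;
    for (i, &is_zero) in pages.iter().enumerate() {
        let gpa = first_page_gpa + i as u64 * page_size;
        if is_zero {
            // Flush the current non-zero run, if any.
            if let Some(start) = run_start.take() {
                out.push((start, gpa - start));
            }
        } else if run_start.is_none() {
            run_start = Some(gpa);
        }
    }
    // Flush a run that extends to the end of the region.
    if let Some(start) = run_start {
        out.push((start, first_page_gpa + pages.len() as u64 * page_size - start));
    }
    out
}
```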

Author


You're right, thanks for catching that! Will fix and add test(s).

Comment on lines +1335 to +1336

```yaml
skip-zero-pages:
  type: boolean
```

I think this should be `skip_zero_pages` to make it match the actual implementation.

Comment on lines +313 to +314

```rust
/// Skip zero-filled pages when sending VM memory to the receiver.
pub skip_zero_pages: bool,
```

Does that need some `#[serde(default)]`? I am not sure what happens when you use HTTP instead of ch-remote and the field has no `#[serde(default)]`.

Author


I think having `#[serde(default)]` is the correct approach 👍
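A minimal sketch of the annotated field (only `skip_zero_pages` and its doc comment are from the diff; the struct name is illustrative), assuming the surrounding request type derives `Deserialize`:

```rust
#[derive(serde::Deserialize)]
struct SendMigrationData {
    // ... other fields ...

    /// Skip zero-filled pages when sending VM memory to the receiver.
    /// `#[serde(default)]` makes the field optional on the wire: a request
    /// body that omits it deserializes to `false` instead of erroring.
    #[serde(default)]
    skip_zero_pages: bool,
}
```

Without the attribute, serde treats a missing non-`Option` field as a hard deserialization error, which is exactly the raw-HTTP case raised above.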

In the `vm_send_memory` call, we're now skipping all pages
completely filled with zeroes. This reduces the memory that needs to be
transferred during migration if the VM has zero pages in its memory.

On-behalf-of: SAP julian.schindel@sap.com
Signed-off-by: Julian Schindel <julian.schindel@cyberus-technology.de>

# Conflicts:
#	vmm/src/lib.rs
@arctic-alpaca arctic-alpaca marked this pull request as draft March 26, 2026 15:04