
Create dm-cache test #4093

Merged
LiliDeng merged 24 commits into microsoft:main from rlmenge:rlmenge/dmcache-rebase on Nov 11, 2025

@rlmenge rlmenge commented Nov 6, 2025

Description

This PR adds a test suite for dm-cache functionality on Azure VMs, along with 11 new LISA tools for managing LVM and device mapper resources. This is a setup and integration test: it verifies that dm-cache can be configured correctly and is operational, but it does not benchmark performance or stress-test caching behavior.

The test first checks that the dm-cache module can be loaded, then verifies functionality by:

  1. Creating loopback devices to simulate slow origin and fast cache disks
  2. Setting up LVM physical volumes and volume groups
  3. Creating logical volumes for origin and cache pool
  4. Attaching cache pool to origin LV to enable caching
  5. Formatting and mounting the cached logical volume
  6. Verifying the dm-cache setup is working correctly
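
The steps above can be sketched as a plain command sequence. This is a minimal illustration, not the code in dmcache.py: the image paths, loop device names, sizes, LV names (`origin`, `cachepool`), volume group name, ext4 filesystem, and mount point are all assumptions chosen for the example. The function only builds the argv lists, so it can be inspected without root access.

```python
from typing import List


def build_dmcache_setup_commands(
    origin_img: str = "/tmp/origin.img",   # hypothetical backing file for the "slow" disk
    cache_img: str = "/tmp/cache.img",     # hypothetical backing file for the "fast" disk
    origin_dev: str = "/dev/loop0",        # assumed free loop device
    cache_dev: str = "/dev/loop1",         # assumed free loop device
    vg: str = "dmcache_vg",                # hypothetical volume group name
    mount_point: str = "/mnt/dmcache",     # hypothetical mount point
) -> List[List[str]]:
    """Return the shell commands for a dm-cache setup, in execution order."""
    origin_lv = f"{vg}/origin"
    pool_lv = f"{vg}/cachepool"
    return [
        # 1. Backing files and loop devices simulating origin and cache disks
        ["truncate", "-s", "1G", origin_img],
        ["truncate", "-s", "256M", cache_img],
        ["losetup", origin_dev, origin_img],
        ["losetup", cache_dev, cache_img],
        # 2. Physical volumes and a volume group spanning both devices
        ["pvcreate", origin_dev, cache_dev],
        ["vgcreate", vg, origin_dev, cache_dev],
        # 3. Origin LV on the slow device, cache pool on the fast device
        ["lvcreate", "-n", "origin", "-L", "512M", vg, origin_dev],
        ["lvcreate", "--type", "cache-pool", "-n", "cachepool",
         "-L", "128M", vg, cache_dev],
        # 4. Attach the cache pool to the origin LV
        ["lvconvert", "-y", "--type", "cache", "--cachepool", pool_lv, origin_lv],
        # 5. Format and mount the cached LV
        ["mkfs.ext4", f"/dev/{origin_lv}"],
        ["mkdir", "-p", mount_point],
        ["mount", f"/dev/{origin_lv}", mount_point],
        # 6. Query the resulting layout for verification
        ["lvs", "-o", "lv_name,segtype,cache_policy", vg],
    ]


if __name__ == "__main__":
    for cmd in build_dmcache_setup_commands():
        print(" ".join(cmd))
```

In practice `losetup -f --show` is a safer way to pick a free loop device than hard-coding `/dev/loop0`.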

The test validates:
✅ Kernel support: dm-cache module loads
✅ Correct architecture: Cache on fast device, origin on slow device
✅ Cache attachment: LV has cache layout, not just plain linear
✅ Policy configuration: A valid cache policy (smq/mq/cleaner) is active
✅ Functional I/O: Can format, mount, and use the cached volume
✅ Device mapper integration: dmsetup can query cache status and configuration
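
As an illustration of the cache attachment and policy checks, the sketch below parses comma-separated output such as `lvs --noheadings --separator , -o lv_name,segtype,cache_policy <vg>` might produce. The function name, the chosen column order, and the sample output in the usage lines are assumptions for this example; the lvs.py tool in this PR may query and parse the fields differently.

```python
from typing import Tuple

# Policies named in the PR description as acceptable.
VALID_POLICIES = {"smq", "mq", "cleaner"}


def check_cache_layout(lvs_output: str, lv_name: str) -> Tuple[str, str]:
    """Return (segtype, policy) for lv_name, or raise if it is not a valid cache LV.

    Expects one LV per line in the form "name,segtype,policy".
    """
    for raw in lvs_output.splitlines():
        parts = [p.strip() for p in raw.split(",")]
        if len(parts) != 3 or parts[0] != lv_name:
            continue
        _, segtype, policy = parts
        if segtype != "cache":
            raise ValueError(f"{lv_name} has segtype '{segtype}', expected 'cache'")
        if policy not in VALID_POLICIES:
            raise ValueError(f"{lv_name} has unexpected cache policy '{policy}'")
        return segtype, policy
    raise LookupError(f"{lv_name} not found in lvs output")


if __name__ == "__main__":
    sample = "  origin,cache,smq\n"  # invented sample, not captured from a real run
    print(check_cache_layout(sample, "origin"))
```

A plain linear LV reports an empty policy column, so the segtype check fires first and the failure message points at the missing cache attachment rather than at the policy.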

What It's NOT Testing
❌ Performance: Doesn't measure if cache actually improves speed
❌ Cache hits/misses: Doesn't verify cache is being used for reads/writes
❌ Data integrity: Doesn't write/read specific data patterns
❌ Different cache modes: Only tests default mode (not explicitly testing writethrough vs writeback)

Test Suite
[dmcache.py]

  • Validates dm-cache setup and basic functionality on CBL-Mariner/Azure Linux VMs
  • Creates loopback devices to simulate slow origin and fast cache disks
  • Sets up complete LVM stack (PVs, VGs, LVs) with cache-enabled logical volumes
  • Verifies cache policy configuration (smq/mq/cleaner)
  • Tests filesystem operations on cached volumes (format, mount, verify)
  • Includes comprehensive cleanup in finally block
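
The cleanup-in-finally pattern can be sketched as follows. This is an assumed structure, not the code from dmcache.py: each setup command is paired with its teardown, and the `finally` block undoes only the steps that actually completed, in reverse order. `run_with_cleanup` and the `run` callback are hypothetical names.

```python
from typing import Callable, List, Sequence, Tuple

Command = Sequence[str]


def run_with_cleanup(
    steps: List[Tuple[Command, Command]],
    run: Callable[[Command], None],
) -> None:
    """Run each (setup, teardown) pair's setup; always tear down completed steps."""
    completed: List[Command] = []
    try:
        for setup, teardown in steps:
            run(setup)
            # Record the teardown only after its setup succeeded.
            completed.append(teardown)
    finally:
        # Undo in reverse order: LVs before VGs before PVs before loop devices.
        for teardown in reversed(completed):
            try:
                run(teardown)
            except Exception:
                # Keep going so one failed teardown does not leak the rest.
                pass
```

A usage example with a fake runner that fails at `lvcreate`:

```python
calls = []

def fake_run(cmd):
    calls.append(cmd[0])
    if cmd[0] == "lvcreate":
        raise RuntimeError("simulated failure")

steps = [
    (["losetup", "/dev/loop0", "/tmp/origin.img"], ["losetup", "-d", "/dev/loop0"]),
    (["pvcreate", "/dev/loop0"], ["pvremove", "-f", "/dev/loop0"]),
    (["vgcreate", "vg", "/dev/loop0"], ["vgremove", "-f", "vg"]),
    (["lvcreate", "-n", "origin", "-L", "512M", "vg"], ["lvremove", "-f", "vg/origin"]),
]
try:
    run_with_cleanup(steps, fake_run)
except RuntimeError:
    pass
print(calls)
```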

New Tools Added
LVM Physical Volume Management

  • [losetup.py] - Loopback device management
  • [pvcreate.py] - Create LVM physical volumes
  • [pvremove.py] - Remove LVM physical volumes

LVM Volume Group Management

  • [vgcreate.py] - Create LVM volume groups
  • [vgremove.py] - Remove LVM volume groups
  • [vgs.py] - Query volume group information and status

LVM Logical Volume Management

  • [lvcreate.py] - Create logical volumes with support for cache pools
  • [lvremove.py] - Remove logical volumes
  • [lvconvert.py] - Attach cache pools to origin LVs
  • [lvs.py] - Query logical volume information and cache layout

Device Mapper Management

  • [dmsetup.py] - Query device mapper devices
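
To illustrate what querying a cache target through device mapper can yield, the sketch below extracts the feature flags and policy name from a `dmsetup status` line. It assumes the field order given in the kernel's dm-cache documentation (eleven fixed fields, then counted feature and core-argument lists, then the policy name). The sample line is invented, and `parse_cache_status` is a hypothetical helper, not part of dmsetup.py.

```python
from typing import Dict, List, Union

# Number of fixed fields after the "cache" keyword:
# metadata block size, metadata used/total, cache block size, cache used/total,
# read hits, read misses, write hits, write misses, demotions, promotions, dirty.
FIXED_FIELDS = 11


def parse_cache_status(line: str) -> Dict[str, Union[str, List[str]]]:
    """Parse one 'dmsetup status' line for a cache target."""
    tokens = line.split()
    if len(tokens) < 3 or tokens[2] != "cache":
        raise ValueError("not a dm-cache status line")
    fields = tokens[3:]

    idx = FIXED_FIELDS
    feature_count = int(fields[idx])
    features = fields[idx + 1 : idx + 1 + feature_count]
    idx += 1 + feature_count

    # Core arguments are counted in tokens (a key and its value count as two).
    core_arg_count = int(fields[idx])
    idx += 1 + core_arg_count

    return {
        "cache_blocks": fields[3],  # "<used>/<total>" cache blocks
        "features": features,
        "policy": fields[idx],
    }


if __name__ == "__main__":
    sample = (
        "0 1048576 cache 8 27/2048 128 0/2048 0 0 0 0 0 0 0 "
        "1 writethrough 2 migration_threshold 2048 smq 0 rw -"
    )
    print(parse_cache_status(sample))
```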

Testing

Tested on a local AzL3 VM running on Hyper-V.
The test will PASS if the module is available and either loads successfully or is already loaded.
The test will SKIP if the module is unavailable.
The test will FAIL if the logical volumes cannot be created correctly or the cache policy is not correct.
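
The outcome rules above can be written as a small decision function. This is a sketch of the logic only: `Outcome` and `classify_outcome` are hypothetical names, and LISA has its own mechanisms for reporting skipped and failed tests that are not shown here.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "PASS"
    SKIP = "SKIP"
    FAIL = "FAIL"


# Policies named in the PR description as acceptable.
VALID_POLICIES = {"smq", "mq", "cleaner"}


def classify_outcome(module_available: bool, lvs_created: bool, policy: str) -> Outcome:
    """Map the three checks described above to a test outcome."""
    if not module_available:
        # dm-cache cannot be loaded on this kernel: nothing to test.
        return Outcome.SKIP
    if not lvs_created or policy not in VALID_POLICIES:
        return Outcome.FAIL
    return Outcome.PASS


if __name__ == "__main__":
    print(classify_outcome(True, True, "smq"))
    print(classify_outcome(False, False, ""))
```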

Dependencies

  • LVM2 tools (pvcreate, vgcreate, lvcreate, lvconvert, lvs, vgs, lvremove, vgremove, pvremove)
  • device-mapper utilities (dmsetup)
  • util-linux (losetup)

@rlmenge rlmenge marked this pull request as ready for review November 6, 2025 18:32
@rlmenge rlmenge requested a review from LiliDeng as a code owner November 6, 2025 18:32
@rlmenge rlmenge force-pushed the rlmenge/dmcache-rebase branch from bb462fc to 5b23a7b Compare November 6, 2025 22:37
@squirrelsc
Member

@LiliDeng LGTM

@LiliDeng LiliDeng merged commit 850ede1 into microsoft:main Nov 11, 2025
27 checks passed
Copilot AI pushed a commit that referenced this pull request Nov 14, 2025
LiliDeng added a commit that referenced this pull request Nov 18, 2025
…#4058)

* Initial plan

* Add unified message support for iperf3 TCP and UDP performance metrics

Co-authored-by: squirrelsc <27178119+squirrelsc@users.noreply.github.com>

* Refactor unified message methods to use parsed fields and add connections_num as parameter

Co-authored-by: squirrelsc <27178119+squirrelsc@users.noreply.github.com>

* Merge main and use Parameter relativity for connections_num and buffer_size metrics

Co-authored-by: squirrelsc <27178119+squirrelsc@users.noreply.github.com>

* Remove conn_suffix from metric names

Co-authored-by: squirrelsc <27178119+squirrelsc@users.noreply.github.com>

* Revert "Move examples and microsoft directories into the Python package (#4023)" (#4063)

This reverts commit 89e7b53.

* Reapply "Move examples and microsoft directories into the Python package (#4023)" (#4063)

This reverts commit efe1cd3.

* runbook: fix path for legacy layout

* Add UnifiedMessage support for NetworkLatencyPerformanceMessage

* kdump: Replace CvmDisabled with before_case SecurityProfile check (#4032)

* kdump: Replace CvmDisabled with before_case SecurityProfile check

* kdump: Fix SecurityProfile check to skip only CVM and Stateless VMs

- Remove empty simple_requirement() calls (unnecessary)

- Optimize f-string usage (only use f-prefix where needed)

- Remove unused simple_requirement import

* Add detailed panic categorization and error code extraction

* enrich SerialConsole.check_panic() to return detailed panic

* Added tests for network related components (#4009)

* notifier: remove pytest-html dependency

Replace pytest-html dependency with custom HTML
report generator using string.Template. This
change provides better control over report
formatting and reduces external dependencies.

* runbook: fix microsoft package name for new paths.

The new path is still able to be written like
"microsoft/testsuites", so that it needs to use
"microsoft" instead of "testsuites" as the package
name.

* Remove watchdog pattern from serial console panic detection (#4075)

* fix verify_cpu_count and improve PowerShell

- Implement calculate_vcpu_count() method in
  WindowsLscpu class to fix verify_cpu_count test
  failure on Windows
- Add null check for stderr in
  PowerShell.wait_result() to prevent errors when
  PowerShell is used to run cmd commands with no
  stderr output

* iDRAC: Handle  HTTP 500 internal errors with service reset

* Fix Hyper-V Stop-VM to use TurnOff on timeout/failure

* Remove overly broad stall regex pattern causing false positive panic detections (#4082)

* Initial plan

* Remove overly broad stall regex pattern to prevent false alarms

Co-authored-by: lesscodingmorehappiness <81588170+lesscodingmorehappiness@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lesscodingmorehappiness <81588170+lesscodingmorehappiness@users.noreply.github.com>

* Revert "skip test if hv_netvsc driver is not used"

This reverts commit f6fdcf7.

* change kselftest required /tmp/ size to 1GB for Overlake SoC limited space

* Add enabled switch for environments and nodes

This change introduces an `enabled` boolean field
at both the environment and node levels, allowing
selective loading of configurations through
runbook variables.

Example:
  environment:
    - name: my_env
      enabled: $(use_first_env)  # Variable-controlled
      nodes:
        - type: local
          name: node1
          enabled: true
        - type: local
          name: node2
          enabled: false  # Skip this node

* Process: Raise exception on timeout. (#4077)

* Skip tests on L1VH Nodes (#4078)

* mshv: skip checking logfile size on l1vh

L1VH parents by default don't have any entries in mshvlog file. Skip
checking logfile size on these nodes.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>

* mshv: skip mshvtrace test on l1vh Nodes

L1VH nodes cannot collect performance traces. Skip the related test
on the L1VH nodes.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>

---------

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>

* Set minimum TLS setting 1.2 for storage accounts

Support for TLS 1.0 and 1.1 will be discontinued for all Azure Storage
accounts. The guidance is to migrate to minimum TLS version 1.2.

https://learn.microsoft.com/en-us/azure/storage/common/transport-layer-security-configure-migrate-to-tls2#why-use-tls-12

* Fix IPTable Test (#4088)

* Add virtualization feature

* doc: fix doc path after test code moved.

* doc: fix some build warnings.

* doc: allow duplicate test case names in different test suites.

* Fix VHD schema documentation to show nested hyperv_generation field (#4100)

* changes to install xxhash tool before building kernel

* Modrpobe command update for verbose is false

* Document resource_group_tags parameter for Azure runbook (#4101)

* Add Host version tracking for baremetal and HyperV platforms

* Convert GPU Driver installation to Tool, Add amd-smi (#4080)

* ch perf: Implement comprehensive performance stabilization framework

* Classify /bin/true redirections in kernel modules as not loaded

Previously, `is_module_loaded` returned True (loaded) when `modprobe -nv`
produced a blacklist directive like 'install /bin/true', causing test
cases like verify_floppy_module_is_blacklisted although module was not
actually loaded.

Added a minimal check for the install /bin/true pattern and now treat it
as not loaded, returning False.

* Kdump: Enhnace error log for incomplete dump file

* Update Nested Feature Supported list in Azure

* Create dm-cache test (#4093)

* Fix nvme device path fetch logic

* DPDK: add netvsc rescind tests (#4076)

* Remove squirrelsc from CODEOWNERS file

Co-authored-by: squirrelsc <27178119+squirrelsc@users.noreply.github.com>

* UnifiedPerfMessage: add metric_str_value to store string value (#4107)

* UnifiedPerfMessage: add str_value to store string value

* Rename str_value to metric_str_value in UnifiedPerfMessage (#4108)

* Initial plan

* Rename str_value to metric_str_value for consistency

Co-authored-by: squirrelsc <27178119+squirrelsc@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: squirrelsc <27178119+squirrelsc@users.noreply.github.com>

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: squirrelsc <27178119+squirrelsc@users.noreply.github.com>

* Pass through MIGRATABLE_VERSION from pipeline environment

* Add UnifiedMessage support for NetworkPPSPerformanceMessage (#4057)

* Initial plan

* Rebase on latest main branch

* Initial plan

* Initial plan

* Rebase on latest main branch

* Sync latest code from main branch

* Clean commit history - single commit for PR changes

* Add connections_num and buffer_size to metric names as suffix

- Remove separate connections_num and buffer_size_bytes metrics
- Add suffix format: _conn_{connections_num}_buffer_{buffer_size}
- Apply suffix to all TCP metrics: rx/tx_throughput_in_gbps, congestion_windowsize_kb, retransmitted_segments
- Apply suffix to all UDP metrics: rx/tx_throughput_in_gbps, data_loss
- This allows distinguishing results by connection count and buffer size

Co-authored-by: LiliDeng <10083705+LiliDeng@users.noreply.github.com>

* Fix flake8 errors: remove trailing whitespace from blank lines

- Remove trailing whitespace from line 492 in send_iperf3_tcp_unified_perf_messages
- Remove trailing whitespace from line 534 in send_iperf3_udp_unified_perf_messages
- Fixes W293 flake8 warnings and BLK100 black formatting issue

Co-authored-by: LiliDeng <10083705+LiliDeng@users.noreply.github.com>

---------

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: squirrelsc <27178119+squirrelsc@users.noreply.github.com>
Co-authored-by: LiliDeng <lildeng@microsoft.com>
Co-authored-by: Chi Song (from Dev Box) <chisong@microsoft.com>
Co-authored-by: Vivek Yadav <vyadav@microsoft.com>
Co-authored-by: Balashivaram Ganesan <71939272+Balashivaram@users.noreply.github.com>
Co-authored-by: lesscodingmorehappiness <81588170+lesscodingmorehappiness@users.noreply.github.com>
Co-authored-by: Panfeng Xue <paxue@microsoft.com>
Co-authored-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Co-authored-by: Sebastian Heid <8442432+s4heid@users.noreply.github.com>
Co-authored-by: Umang Francis <umfranci@microsoft.com>
Co-authored-by: rabdulfaizy <rabdulfaizy@microsoft.com>
Co-authored-by: Aditya Nagesh <adityanagesh@microsoft.com>
Co-authored-by: Rachel Menge <rachelmenge@microsoft.com>
Co-authored-by: Kanchan Sen Laskar <kasenlaskar@microsoft.com>
Co-authored-by: mcgov <mamcgove@microsoft.com>
Co-authored-by: LiliDeng <10083705+LiliDeng@users.noreply.github.com>