Create dm-cache test by rlmenge · Pull Request #4093 · microsoft/lisa

rlmenge · 2025-11-06T01:50:08Z

Description

This PR adds a comprehensive test suite for dm-cache functionality on Azure VMs, along with 11 new LISA tools for managing LVM and device mapper resources. It is a setup and integration test: it verifies that dm-cache can be configured correctly and is operational, but it does not benchmark performance or stress-test the caching behavior.

The test first checks that the dm-cache module is loadable, then exercises its functionality by:

Creating loopback devices to simulate slow origin and fast cache disks
Setting up LVM physical volumes and volume groups
Creating logical volumes for origin and cache pool
Attaching cache pool to origin LV to enable caching
Formatting and mounting the cached logical volume
Verifying the dm-cache setup is working correctly
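
The six steps above map onto a short sequence of standard util-linux, LVM2 and device-mapper commands. The following is a minimal sketch of that sequence, not the PR's implementation (which drives the same utilities through the new LISA tool classes). The file paths, loop device names, volume group name, LV names and sizes are all hypothetical placeholders.

```python
from typing import List


def build_dm_cache_setup_commands(
    origin_img: str = "/tmp/origin.img",   # hypothetical backing file ("slow" disk)
    cache_img: str = "/tmp/cache.img",     # hypothetical backing file ("fast" disk)
    origin_loop: str = "/dev/loop0",       # assumed free loop device
    cache_loop: str = "/dev/loop1",        # assumed free loop device
    vg: str = "vg_dmcache",                # hypothetical volume group name
    mount_point: str = "/mnt/dmcache",     # hypothetical mount point
) -> List[str]:
    """Return, in order, shell commands covering the six setup steps."""
    return [
        # 1. Backing files and loop devices standing in for the two disks.
        f"truncate -s 1G {origin_img}",
        f"truncate -s 256M {cache_img}",
        f"losetup {origin_loop} {origin_img}",
        f"losetup {cache_loop} {cache_img}",
        # 2. Physical volumes and one volume group spanning both devices.
        f"pvcreate {origin_loop} {cache_loop}",
        f"vgcreate {vg} {origin_loop} {cache_loop}",
        # 3. Origin LV pinned to the "slow" PV, cache pool pinned to the "fast" PV.
        f"lvcreate -n origin -L 512M {vg} {origin_loop}",
        f"lvcreate --type cache-pool -n cachepool -L 128M {vg} {cache_loop}",
        # 4. Attach the cache pool to the origin LV.
        f"lvconvert -y --type cache --cachepool {vg}/cachepool {vg}/origin",
        # 5. Format and mount the cached LV.
        f"mkfs.ext4 /dev/{vg}/origin",
        f"mkdir -p {mount_point}",
        f"mount /dev/{vg}/origin {mount_point}",
        # 6. Inspect the result through LVM and device mapper.
        f"lvs -o lv_name,lv_layout,cache_policy {vg}",
        f"dmsetup status {vg}-origin",
    ]


if __name__ == "__main__":
    # Print only; running these requires root and free loop devices.
    for command in build_dm_cache_setup_commands():
        print(command)
```

The function only builds strings, so it can be inspected safely on any machine. Teardown runs in the reverse direction (umount, lvremove, vgremove, pvremove, losetup -d).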

The test validates:
✅ Kernel support: dm-cache module loads
✅ Correct architecture: Cache on fast device, origin on slow device
✅ Cache attachment: LV has cache layout, not just plain linear
✅ Policy configuration: A valid cache policy (smq/mq/cleaner) is active
✅ Functional I/O: Can format, mount, and use the cached volume
✅ Device mapper integration: dmsetup can query cache status and configuration
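
The cache attachment and policy checks can be illustrated with a small parser. This is a sketch under the assumption that the input is a single row produced by `lvs --noheadings -o lv_layout,cache_policy <vg>/<lv>`; the function name and the exact way the PR inspects this output are not taken from the source.

```python
from typing import List

# Policies the PR description lists as valid.
VALID_POLICIES = frozenset({"smq", "mq", "cleaner"})


def check_cached_lv(lvs_row: str) -> List[str]:
    """Return a list of problems found in one `lvs` row (empty list means OK).

    Assumed row format: "<lv_layout> <cache_policy>", whitespace separated,
    where lv_layout is a comma-separated list such as "cache" or "linear".
    """
    problems: List[str] = []
    fields = lvs_row.split()
    layout = fields[0].split(",") if fields else []
    # A plain linear LV reports no cache policy, so the second field may be absent.
    policy = fields[1] if len(fields) > 1 else ""

    if "cache" not in layout:
        problems.append(f"LV layout is {layout or 'empty'}, expected 'cache'")
    if policy not in VALID_POLICIES:
        problems.append(f"cache policy '{policy}' is not one of {sorted(VALID_POLICIES)}")
    return problems
```

For example, `check_cached_lv("  cache smq")` returns an empty list, while a plain linear LV (`"  linear"`) yields both a layout problem and a policy problem.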

What It's NOT Testing
❌ Performance: Doesn't measure if cache actually improves speed
❌ Cache hits/misses: Doesn't verify cache is being used for reads/writes
❌ Data integrity: Doesn't write/read specific data patterns
❌ Different cache modes: Only tests default mode (not explicitly testing writethrough vs writeback)

Test Suite
[dmcache.py]

Validates dm-cache setup and basic functionality on CBL-Mariner/Azure Linux VMs
Creates loopback devices to simulate slow origin and fast cache disks
Sets up complete LVM stack (PVs, VGs, LVs) with cache-enabled logical volumes
Verifies cache policy configuration (smq/mq/cleaner)
Tests filesystem operations on cached volumes (format, mount, verify)
Includes comprehensive cleanup in finally block

New Tools Added
LVM Physical Volume Management

[losetup.py] - Loopback device management
[pvcreate.py] - Create LVM physical volumes
[pvremove.py] - Remove LVM physical volumes

LVM Volume Group Management

[vgcreate.py] - Create LVM volume groups
[vgremove.py] - Remove LVM volume groups
[vgs.py] - Query volume group information and status

LVM Logical Volume Management

[lvcreate.py] - Create logical volumes with support for cache pools
[lvremove.py] - Remove logical volumes
[lvconvert.py] - Attach cache pools to origin LVs
[lvs.py] - Query logical volume information and cache layout

Device Mapper Management

[dmsetup.py] - Query device mapper devices

Testing

Tested on a local AzL3 VM running on Hyper-V
Test will PASS if the module is available and can be loaded, or is already loaded
Test will SKIP if the module is unavailable
Test will FAIL if the logical volumes cannot be created correctly or the cache policy is not correct
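
The pass/skip/fail rules above can be read as a small decision function. The sketch below is illustrative only: the function name and its three inputs are hypothetical, and the real test reports these outcomes through LISA's own mechanisms rather than by returning strings.

```python
from typing import Optional

# Policies the PR description lists as valid.
VALID_POLICIES = frozenset({"smq", "mq", "cleaner"})


def classify_outcome(
    module_available: bool,      # dm-cache can be loaded or is already loaded
    volumes_created: bool,       # the cached LV stack was built without errors
    policy: Optional[str],       # cache policy reported for the cached LV
) -> str:
    """Map the observed state to PASS, SKIP or FAIL."""
    if not module_available:
        # Nothing to test on this kernel, so this is not a failure.
        return "SKIP"
    if not volumes_created:
        return "FAIL"
    if policy not in VALID_POLICIES:
        return "FAIL"
    return "PASS"
```

Checking module availability first keeps an unsupported kernel from being reported as a failure.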

Dependencies

LVM2 tools (pvcreate, vgcreate, lvcreate, lvconvert, lvs, vgs, lvremove, vgremove, pvremove)
device-mapper utilities (dmsetup)
util-linux (losetup)
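
Before running the test on a new image, it can help to confirm that these binaries are present. The following is a minimal local sketch using only the Python standard library; it is an assumption about how one might pre-check a machine, not how LISA resolves or installs tool dependencies.

```python
import shutil
from typing import Iterable, List

# Binaries named in the dependency list above.
REQUIRED_TOOLS = (
    "pvcreate", "vgcreate", "lvcreate", "lvconvert", "lvs", "vgs",
    "lvremove", "vgremove", "pvremove",  # LVM2
    "dmsetup",                           # device-mapper
    "losetup",                           # util-linux
)


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("missing:", ", ".join(missing) if missing else "none")
```

These utilities usually live in /usr/sbin, which may not be on an unprivileged user's PATH, so a tool reported as missing is not necessarily uninstalled.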

lisa/tools/dmsetup.py

lisa/tools/lvconvert.py

lisa/microsoft/testsuites/storage/dmcache.py

squirrelsc · 2025-11-10T18:29:15Z

@LiliDeng LGTM


rlmenge added 16 commits November 5, 2025 16:02

WIP hold dmcache files

3e1bfe0

wip: update the dmcache to skip the writing to file for now

8fff02d

Introduce new tools

1add2a7

Clean up files

c5771eb

Update the tools

2b0cfda

Update to use the new tools

007f74b

use modprobe tool

dcb3415

Use the lvcreate command

fccdfa1

Add clean up tools

ce26433

Add lvconvert

ef059a1

Use mkdir

1404b30

Clean up output

c4cc884

Add lvs

3a97740

Add dmsetup

187e301

Add vgs

785f28c

Formatting fixes

f385a35

squirrelsc reviewed Nov 6, 2025

lisa/tools/dmsetup.py (outdated)

squirrelsc reviewed Nov 6, 2025

lisa/tools/dmsetup.py (outdated)

squirrelsc reviewed Nov 6, 2025

lisa/tools/dmsetup.py (outdated)

squirrelsc reviewed Nov 6, 2025

lisa/tools/lvconvert.py (outdated)

rlmenge added 6 commits November 6, 2025 08:59

Update the _init_.py

facf8b3

Update parameters to always use sudo if always needed

dd53480

Remove the unused functions in dmsetup

99d7277

Reduce info log level output

e231876

Update the lvconvert tool removing unneded params and functions

4caed3c

Ensure tools are used on Linux instances

3c85bce

rlmenge marked this pull request as ready for review November 6, 2025 18:32

rlmenge requested a review from LiliDeng as a code owner November 6, 2025 18:32

squirrelsc reviewed Nov 6, 2025

lisa/microsoft/testsuites/storage/dmcache.py

squirrelsc reviewed Nov 6, 2025

lisa/microsoft/testsuites/storage/dmcache.py

squirrelsc reviewed Nov 6, 2025

lisa/microsoft/testsuites/storage/dmcache.py (outdated)

Remove redundant check

5b23a7b

rlmenge force-pushed the rlmenge/dmcache-rebase branch from bb462fc to 5b23a7b Compare November 6, 2025 22:37

LiliDeng reviewed Nov 7, 2025

lisa/microsoft/testsuites/storage/dmcache.py (outdated)

Update comments to remove unsused code and add clarity for numbers

7160d38

LiliDeng approved these changes Nov 11, 2025


LiliDeng merged commit 850ede1 into microsoft:main Nov 11, 2025
27 checks passed

rlmenge mentioned this pull request Nov 11, 2025

Enable dm-cache module microsoft/azurelinux#14661

Merged

12 tasks

Copilot AI pushed a commit that referenced this pull request Nov 14, 2025

Create dm-cache test (#4093)

add7955

LiliDeng merged 24 commits into microsoft:main from rlmenge:rlmenge/dmcache-rebase


3 participants
