Move OSX test-only CI jobs to Linux runners #124930
davidwrighton wants to merge 5 commits into dotnet:main
The job only downloads pre-built artifacts and submits tests to Helix, so it does not need a native macOS Azure DevOps runner. Use the existing hostedOs mechanism to run the orchestrating agent on Linux while keeping tests on macOS ARM64 Helix queues.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add host+packs to upstream CoreCLR_Libraries builds so installer test artifacts are produced during the initial build phase
- Upload installer test artifacts (artifacts/tests/) from upstream builds
- Rename Installer_Build_And_Test to Installer_Test and set skipBuild to skip the local build step since all artifacts are pre-built
- Split OSX platforms (osx_x64, osx_arm64) into separate job definitions with hostedOs: linux to run the AzDO agent on Linux
- Add skipBuild parameter to global-build-job.yml to support test-only jobs that download artifacts and submit to Helix without building
- Guard macOS-specific install-dependencies and disk-usage steps in global-build-job.yml against hostedOs override
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Tagging subscribers to this area: @dotnet/runtime-infrastructure
Pull request overview
This PR updates the dotnet/runtime Azure DevOps pipeline templates to run several test-submission-only jobs on Linux agents (instead of macOS agents) while still executing the actual tests on macOS Helix queues.
Changes:
- Add a hostedOs override to move certain macOS test submission jobs onto Linux runners.
- Add a skipBuild parameter to support “test-only” jobs that only download artifacts and submit to Helix.
- Extend upstream CoreCLR_Libraries builds to also produce and upload installer test artifacts for downstream installer Helix submissions.
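As a rough sketch of the pattern this overview describes: the job keeps its macOS platform, so tests still target macOS Helix queues, while hostedOs moves only the orchestrating agent. The fragment is modeled on the job definitions quoted later in this review. It is not a complete job, and the set of parameters a real job needs is an assumption.

```yaml
# Illustrative fragment only; a real job needs more parameters than shown here.
- template: /eng/pipelines/common/platform-matrix.yml
  parameters:
    jobTemplate: /eng/pipelines/common/global-build-job.yml
    platforms:
    - osx_arm64            # target platform: tests are still sent to macOS Helix queues
    helixQueuesTemplate: /eng/pipelines/installer/helix-queues-setup.yml
    jobParameters:
      nameSuffix: Installer_Test
      skipBuild: true      # download pre-built artifacts instead of building locally
      hostedOs: linux      # run the Azure DevOps agent itself on Linux
```

The point of the split is that `platforms` describes what is being tested, while `hostedOs` describes where the submission agent runs.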
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| eng/pipelines/runtime.yml | Adds installer test-only jobs, artifact upload/download wiring, and hostedOs: linux overrides for macOS Helix submission jobs. |
| eng/pipelines/common/global-build-job.yml | Introduces skipBuild and guards macOS-only dependency installation when the agent OS is overridden. |
| eng/pipelines/libraries/run-test-job.yml | Adds hostedOs and guards macOS-only dependency installation for overridden agent OS scenarios. |
eng/pipelines/runtime.yml
Outdated
| jobTemplate: /eng/pipelines/common/global-build-job.yml | ||
| buildConfig: ${{ variables.debugOnPrReleaseOnRolling }} | ||
| platforms: | ||
| - windows_x86 | ||
| helixQueuesTemplate: /eng/pipelines/installer/helix-queues-setup.yml | ||
| jobParameters: | ||
| nameSuffix: Installer_Test | ||
| skipBuild: true | ||
| dependsOnGlobalBuilds: | ||
| - nameSuffix: CoreCLR_Libraries | ||
| buildConfig: release | ||
| preBuildSteps: | ||
| - template: /eng/pipelines/common/download-artifact-step.yml | ||
| parameters: | ||
| artifactName: CoreCLR_Libraries_BuildArtifacts_$(osGroup)$(osSubgroup)_$(archType)_Release | ||
| artifactFileName: CoreCLR_Libraries_BuildArtifacts_$(osGroup)$(osSubgroup)_$(archType)_Release$(archiveExtension) | ||
| unpackFolder: $(Build.SourcesDirectory)/artifacts/bin | ||
| displayName: 'unified artifacts' | ||
| - template: /eng/pipelines/common/download-artifact-step.yml | ||
| parameters: | ||
| artifactName: Installer_TestArtifacts_$(osGroup)$(osSubgroup)_$(archType)_Release | ||
| artifactFileName: Installer_TestArtifacts_$(osGroup)$(osSubgroup)_$(archType)_Release$(archiveExtension) | ||
| unpackFolder: $(Build.SourcesDirectory)/artifacts/tests | ||
| displayName: 'installer test artifacts' |
Installer_Test (windows_x86) is configured with buildConfig ${{ variables.debugOnPrReleaseOnRolling }} but it downloads ..._Release artifacts and the Helix submission uses /p:Configuration=$(_BuildConfig) (from that buildConfig). On PRs this will likely submit with Configuration=Debug while using Release test assets, which can break helixpublish.proj expectations (payload paths include $(Configuration)). Make the job's buildConfig consistent with the artifacts it downloads (e.g., set buildConfig to release here, or switch downloads/dependsOn to use $(debugOnPrReleaseOnRolling)).
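One way to apply the first option suggested above is to pin the job's buildConfig to release so that it agrees with the `_Release` artifact names. This is a sketch of that alignment only, assembled from the snippet under review; whether release or `debugOnPrReleaseOnRolling` is the right choice for this job is not settled here.

```yaml
# Sketch: job configuration and downloaded artifacts use the same configuration.
- template: /eng/pipelines/common/platform-matrix.yml
  parameters:
    jobTemplate: /eng/pipelines/common/global-build-job.yml
    buildConfig: release   # now matches the _Release suffix on the artifacts below
    platforms:
    - windows_x86
    helixQueuesTemplate: /eng/pipelines/installer/helix-queues-setup.yml
    jobParameters:
      nameSuffix: Installer_Test
      skipBuild: true
      dependsOnGlobalBuilds:
      - nameSuffix: CoreCLR_Libraries
        buildConfig: release
      preBuildSteps:
      - template: /eng/pipelines/common/download-artifact-step.yml
        parameters:
          artifactName: Installer_TestArtifacts_$(osGroup)$(osSubgroup)_$(archType)_Release
          artifactFileName: Installer_TestArtifacts_$(osGroup)$(osSubgroup)_$(archType)_Release$(archiveExtension)
          unpackFolder: $(Build.SourcesDirectory)/artifacts/tests
          displayName: 'installer test artifacts'
```

With this shape, `/p:Configuration=$(_BuildConfig)` and the unpacked payload paths would both refer to Release.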
eng/pipelines/runtime.yml
Outdated
| parameters: | ||
| artifactName: Installer_TestArtifacts_$(osGroup)$(osSubgroup)_$(archType)_$(debugOnPrReleaseOnRolling) | ||
| artifactFileName: Installer_TestArtifacts_$(osGroup)$(osSubgroup)_$(archType)_$(debugOnPrReleaseOnRolling)$(archiveExtension) | ||
| unpackFolder: $(Build.SourcesDirectory)/artifacts/tests |
Installer_Test (windows_x64/linux_x64) sets buildConfig: release but downloads artifacts using $(debugOnPrReleaseOnRolling) and depends on the corresponding CoreCLR_Libraries build config. Since Helix submission passes /p:Configuration=$(_BuildConfig), this mismatch will submit the wrong Configuration on PRs (Release) while using Debug artifacts (or vice versa), likely causing missing payload directories. Align this job's buildConfig with the artifact config being downloaded (or adjust the artifact names to always use Release).
eng/pipelines/runtime.yml
Outdated
| - template: /eng/pipelines/common/platform-matrix.yml | ||
| parameters: | ||
| jobTemplate: /eng/pipelines/common/global-build-job.yml | ||
| buildConfig: release | ||
| platforms: | ||
| - osx_arm64 | ||
| helixQueuesTemplate: /eng/pipelines/installer/helix-queues-setup.yml | ||
| jobParameters: | ||
| nameSuffix: Installer_Test | ||
| skipBuild: true | ||
| hostedOs: linux | ||
| dependsOnGlobalBuilds: | ||
| - nameSuffix: CoreCLR_Libraries | ||
| buildConfig: ${{ variables.debugOnPrReleaseOnRolling }} | ||
| preBuildSteps: | ||
| - template: /eng/pipelines/common/download-artifact-step.yml | ||
| parameters: | ||
| artifactName: CoreCLR_Libraries_BuildArtifacts_$(osGroup)$(osSubgroup)_$(archType)_$(debugOnPrReleaseOnRolling) | ||
| artifactFileName: CoreCLR_Libraries_BuildArtifacts_$(osGroup)$(osSubgroup)_$(archType)_$(debugOnPrReleaseOnRolling)$(archiveExtension) | ||
| unpackFolder: $(Build.SourcesDirectory)/artifacts/bin | ||
| displayName: 'unified artifacts' | ||
| - template: /eng/pipelines/common/download-artifact-step.yml | ||
| parameters: | ||
| artifactName: Installer_TestArtifacts_$(osGroup)$(osSubgroup)_$(archType)_$(debugOnPrReleaseOnRolling) | ||
| artifactFileName: Installer_TestArtifacts_$(osGroup)$(osSubgroup)_$(archType)_$(debugOnPrReleaseOnRolling)$(archiveExtension) | ||
| unpackFolder: $(Build.SourcesDirectory)/artifacts/tests | ||
| displayName: 'installer test artifacts' |
Installer_Test (osx_arm64) uses hostedOs: linux and downloads artifacts suffixed with $(debugOnPrReleaseOnRolling), but the matrix buildConfig is hard-coded to release. Because /p:Configuration=$(_BuildConfig) is derived from buildConfig, this can make Helix look for artifacts/tests/host/<os>.<arch>.Release while the downloaded payload is ...Debug on PRs. Make buildConfig match the downloaded artifact configuration.
| - osx_x64 | ||
| helixQueuesTemplate: /eng/pipelines/installer/helix-queues-setup.yml | ||
| jobParameters: | ||
| nameSuffix: Installer_Build_And_Test | ||
| buildArgs: -s host+packs -c $(_BuildConfig) -lc Release -rc Release | ||
| nameSuffix: Installer_Test | ||
| skipBuild: true | ||
| hostedOs: linux |
Setting hostedOs: linux for an osGroup: osx job currently causes conflicting pool definitions in eng/pipelines/common/xplat-setup.yml: the Linux pool block triggers when hostedOs == 'linux' (xplat-setup.yml:168-176) while the macOS vmImage block still triggers based on osGroup alone (xplat-setup.yml:178-186). This will emit both name/demands and vmImage under pool, which Azure Pipelines rejects. To actually move this job to Linux runners, update xplat-setup.yml to gate the macOS pool blocks on hostedOs (or pass an explicit pool: override here instead of hostedOs).
| jobParameters: | ||
| nameSuffix: Installer_Test | ||
| skipBuild: true | ||
| hostedOs: linux |
hostedOs: linux is used for an osx Helix submission job, but the current pool selection logic in eng/pipelines/common/xplat-setup.yml will still also select the macOS pool based on osGroup: osx (in addition to the Linux pool selected by hostedOs). This results in an invalid pool object (both vmImage and name/demands). Adjust xplat-setup.yml to make hostedOs override mutually exclusive with osGroup-based macOS pool selection, or set pool: explicitly for this job.
| hostedOs: linux |
| jobParameters: | ||
| testScope: innerloop | ||
| liveRuntimeBuildConfig: checked | ||
| hostedOs: linux |
hostedOs: linux is intended to move this osx test-submission job to Linux agents, but eng/pipelines/common/xplat-setup.yml currently selects a macOS pool whenever osGroup is osx regardless of hostedOs. With hostedOs=linux, both the Linux and macOS pool blocks match, producing an invalid pool definition. Update xplat-setup.yml to honor hostedOs as an override (or pass an explicit Linux pool here).
| hostedOs: linux |
The host+packs build uses different -c/-lc/-rc configs than the main clr+libs build. Build host+packs as a separate postBuildStep in each upstream CoreCLR_Libraries job to preserve the exact old behavior:
- Release upstream (win_x86, osx_x64): host+packs at debugOnPrReleaseOnRolling config with -lc Release -rc Release (matches old Job 1)
- debugOnPrReleaseOnRolling upstream (multi-platform): host+packs at Release config with -lc debugOnPrReleaseOnRolling -rc Release (matches old Job 2)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| buildArgs: -s host+packs -c $(_BuildConfig) -lc Release -rc Release | ||
| nameSuffix: Installer_Test | ||
| skipBuild: true | ||
| hostedOs: linux |
The hostedOs: linux parameter is used here with osx_x64 and osx_arm64 platforms (similarly at lines 1607 and 1998), but eng/pipelines/common/xplat-setup.yml does not guard the macOS pool selection conditions against a hostedOs override. Specifically, lines 179–186 of xplat-setup.yml emit a macOS pool entry whenever osGroup is osx (or other Apple platforms), with no check for hostedOs. Lines 168–176 of that same file correctly check for hostedOs == 'linux' for the Linux pool. When both conditions fire simultaneously, the resulting pool: object contains both name/demands (Linux) and vmImage/os (macOS) fields, which is an invalid Azure Pipelines pool configuration and will likely cause a pipeline failure or still route the agent to a macOS host — defeating the PR's intent. The macOS pool conditions in xplat-setup.yml need to add not(eq(parameters.jobParameters.hostedOs, 'linux')) guards before this configuration can work correctly.
| hostedOs: linux |
| - ${{ if in(parameters.osGroup, 'osx', 'ios', 'tvos', 'android') }}: | ||
| - script: | | ||
| du -sh $(Build.SourcesDirectory)/* | ||
| df -h | ||
| displayName: Disk Usage after Build | ||
| condition: always() | ||
| # If intended to send extra steps after regular build add them here. |
The comment # If intended to send extra steps after regular build add them here. on line 242 is indented inside the ${{ if ne(parameters.skipBuild, true) }}: block, but it refers to the postBuildSteps section that follows at the outer indentation level (line 243). The comment should be moved outside the skipBuild block (to the same indentation level as line 243) to accurately describe the code that follows.
| - ${{ if in(parameters.osGroup, 'osx', 'ios', 'tvos', 'android') }}: | |
| - script: | | |
| du -sh $(Build.SourcesDirectory)/* | |
| df -h | |
| displayName: Disk Usage after Build | |
| condition: always() | |
| # If intended to send extra steps after regular build add them here. | |
| - ${{ if in(parameters.osGroup, 'osx', 'tvos', 'android') }}: | |
| - script: | | |
| du -sh $(Build.SourcesDirectory)/* | |
| df -h | |
| displayName: Disk Usage after Build | |
| condition: always() | |
| # If intended to send extra steps after regular build add them here. |
eng/pipelines/runtime.yml
Outdated
| condition: | ||
| or( | ||
| eq(stageDependencies.EvaluatePaths.evaluate_paths.outputs['SetPathVars_non_mono_and_wasm.containsChange'], true), | ||
| eq(variables['isRollingBuild'], true)) |
All four new condition: blocks (at lines 1518, 1555, 1592, 1629) omit the >- block scalar indicator that every other multi-line condition in this file uses (e.g. lines 1460, 1478, 1657, etc.). While YAML will still parse these as strings, the style is inconsistent with the rest of the file. These should use condition: >- to match the established pattern throughout runtime.yml.
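For reference, the condition quoted above would look roughly like this with the `>-` indicator applied. In YAML, `>` folds the following lines into a single string and `-` strips the trailing line break. The indentation is illustrative and not copied from runtime.yml.

```yaml
# Same expression as in the snippet under review, written as a folded block scalar.
condition: >-
  or(
    eq(stageDependencies.EvaluatePaths.evaluate_paths.outputs['SetPathVars_non_mono_and_wasm.containsChange'], true),
    eq(variables['isRollingBuild'], true))
```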
When hostedOs is set to 'linux' or 'windows', the macOS pool conditions must not match, otherwise they override the Linux/Windows pool selected by the hostedOs check. Add not(in(hostedOs, 'linux', 'windows')) guard to both public and internal macOS pool conditions.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
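The guard described in this commit can be sketched as two mutually exclusive template conditions. This is a hypothetical simplification, not the actual contents of xplat-setup.yml: the real conditions cover more OS groups and also distinguish public from internal projects, and the pool name and image names below are placeholders.

```yaml
# Hypothetical sketch of mutually exclusive pool selection.
pool:
  # Linux pool: used for Linux targets, or when hostedOs redirects the agent to Linux.
  ${{ if or(eq(parameters.osGroup, 'linux'), eq(parameters.jobParameters.hostedOs, 'linux')) }}:
    name: $(examplePoolName)                              # placeholder variable name
    demands: ImageOverride -equals example-linux-image    # placeholder image name
  # macOS pool: skipped whenever hostedOs sends the agent to another OS.
  ${{ if and(eq(parameters.osGroup, 'osx'), not(in(parameters.jobParameters.hostedOs, 'linux', 'windows'))) }}:
    vmImage: example-macos-image                          # placeholder image name
```

When hostedOs is unset, the `in(...)` check is false, so a regular osx job still gets the macOS pool. With `hostedOs: linux`, only the first block matches, so the pool object never carries both name/demands and vmImage.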
@agocke @jkoritzinsky I've made this change, and I believe it is actually working now. The goal is to reduce the load on the Microsoft-hosted OSX pools by moving some operations onto the Linux queues. I don't have as much visibility into the engineering systems as you do, so I can't say with confidence that this change is a good thing. I certainly don't like that it needs more lines of YAML. Let me know if you think I should fix the minor Copilot nits on the latest iteration and push towards merging this.
eng/pipelines/runtime.yml
Outdated
| - script: $(Build.SourcesDirectory)$(dir)build$(scriptExt) -ci $(_archParameter) $(_osParameter) $(crossArg) -s host+packs -c Release -lc $(_BuildConfig) -rc Release $(_officialBuildParameter) | ||
| displayName: Build host+packs |
If we're going to run the installer tests in a job different than where we build them, we should do this build as one step (build clr+libs+libs.tests+host+packs instead of two separate steps).
This is intentionally a separate step as the host+packs are not currently built with the same debug/release flags as the rest of the build.
I would prefer if we used the -hc flag for the installer builds instead of a separate command. It is significantly more difficult for CI investigations and binlog investigations when we have multiple commands (determining which binlog corresponds to which step, accidentally not emitting a binlog or overwriting a binlog) and makes our build scripts much more complicated.
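A single-invocation build along the lines requested here might look like the fragment below. The subset list comes from the earlier comment in this thread and `-hc` is the flag named above. The specific configuration values are assumptions for illustration and do not reproduce what the PR finally used.

```yaml
# Sketch: one build command, with the host configuration chosen separately via -hc.
jobParameters:
  nameSuffix: CoreCLR_Libraries
  # Assumed values: the main build follows $(_BuildConfig), host+packs is built as Release.
  buildArgs: -s clr+libs+libs.tests+host+packs -c $(_BuildConfig) -hc Release
```

A single command yields one binlog per job, which is the investigation benefit the comment describes.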
| buildConfig: '' | ||
| nameSuffix: '' | ||
| buildArgs: '' | ||
| skipBuild: false |
Let's introduce a run-test-job.yml file for running the installer tests on helix instead of adding a skipBuild parameter here.
…ob.yml
- Replace separate 'Build host+packs' script steps with -hc flag in CoreCLR_Libraries buildArgs to produce a single build invocation
- Create eng/pipelines/installer/run-test-job.yml as a dedicated test-only job template for installer Helix submissions
- Remove skipBuild parameter from global-build-job.yml, restoring its original build-focused structure
- Update Installer_Test jobs to use the new run-test-job.yml template
- Fix condition blocks to use >- block scalar indicators consistently
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| - ${{ if in(parameters.osGroup, 'osx', 'ios', 'tvos', 'android') }}: | ||
| - script: | | ||
| du -sh $(Build.SourcesDirectory)/* | ||
| df -h | ||
| displayName: Disk Usage after Build | ||
| condition: always() |
The "Disk Usage after Build" step on line 233 is not guarded against hostedOs: linux or hostedOs: windows overrides, while the "Disk Usage before Build" block was updated to include such a guard (line 177). When hostedOs: linux is used with an osx osGroup, the "after build" disk usage step would still execute on a Linux agent. While du and df do work on Linux so this won't cause job failures, it is inconsistent with the stated intent of guarding macOS-specific steps against hostedOs overrides, and slightly misleading in pipeline logs. Consider adding not(in(parameters.hostedOs, 'linux', 'windows')) to the "Disk Usage after Build" condition as well, similar to the guard applied to the "before build" block.
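The suggested guard, applied to the step quoted above, would look roughly like this. It combines the existing osGroup check with the `not(in(parameters.hostedOs, 'linux', 'windows'))` condition from the comment; the indentation is illustrative.

```yaml
# Sketch: run the macOS-oriented disk usage step only when the agent OS is not overridden.
- ${{ if and(in(parameters.osGroup, 'osx', 'ios', 'tvos', 'android'), not(in(parameters.hostedOs, 'linux', 'windows'))) }}:
  - script: |
      du -sh $(Build.SourcesDirectory)/*
      df -h
    displayName: Disk Usage after Build
    condition: always()
```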
jkoritzinsky left a comment
Just one nit on job naming. Otherwise LGTM
| artifactName: CoreCLR_Libraries_TestArtifacts_$(osGroup)$(osSubgroup)_$(archType)_$(_BuildConfig) | ||
| - template: /eng/pipelines/common/upload-artifact-step.yml | ||
| parameters: | ||
| rootFolder: $(Build.SourcesDirectory)/artifacts/tests |
(Not in this PR) At some point we should unify between artifacts/tests and artifacts/helix.
| @@ -250,7 +250,7 @@ extends: | |||
| - windows_x64 | |||
| jobParameters: | |||
| nameSuffix: CoreCLR_Libraries | |||
Now that we're building all subsets in these jobs, we should rename these jobs to AllSubsets_CoreCLR to be consistent with the other jobs where we build all subsets.
Summary
Move several test-submission-only Azure DevOps jobs off macOS runners onto Linux runners. These jobs don't do any native compilation — they download pre-built artifacts and submit tests to Helix. Tests continue running on macOS Helix queues.
Changes
Libraries Test Run (checked coreclr osx arm64)
run-test-job.yml against hostedOs override
Installer_Build_And_Test → Installer_Test
Files Changed