Skip to content

Easier cross-compiling for level 4? #5

@stuarteberg

Description

@stuarteberg

Comment:

The conda-forge docs for the microarch-optimized builds have an example that uses microarch_level: 4. But the README for this feedstock contains the following caveat:

When building packages on CI, level=4 will not be guaranteed, so you can only use level<=3 to build.

Indeed, when I tried to use level 4, I saw failures (in my case, it was on osx).

Nonetheless, I'd like to produce optimized builds for machines that support AVX-512 (level 4). This was possible by explicitly adding the necessary build flag in build.sh and then explicitly listing the appropriate run dependency:

# conda_build_config.yaml
microarch_level:
  - 1
  - 3  # [unix and x86_64]
  - 4  # [unix and x86_64]
# build.sh
if [[ "${microarch_level}" == "4" ]]; then
    CXXFLAGS="${CXXFLAGS} -march=x86-64-v4"
fi
# meta.yaml
requirements:
  run:
    - _x86_64-microarch-level 4  # [unix and x86_64 and microarch_level == 4]

Using that workaround, we were able to produce optimized binaries (including march=x86-64-v4 in the graph-tool feedstock (conda-forge/graph-tool-feedstock#140).


Would it be possible to make that easier for feedstock maintainers, perhaps by having the microarch-level-feedstock produce yet another output?

Right now this feedstock produces two packages for each arch, such as:

  1. x86_64-microarch-level
    a. Introduces the -march=x86-64-v${level} flag in CFLAGS etc.
    b. Introduces a run_export to _x86_64-microarch-level
  2. _x86_64-microarch-level
    a. Introduces a run dependency to the appropriate __archspec virtual package.

...but it seems like cross-compilation would be easier if we were to split up the functionality from 1.a and 1.b. into two separate packages, so we could easily obtain the correct CFLAGS without pulling in the __archspec dependency. Perhaps we could offer two variants of the package: one that provides both 1.a and 1.b, and another variant that only provides 1.a. (I'm just splitballing here...)

Alternatively, we could just drop the run_exports from the {{ family }}-microarch-level recipe. In that case, feedstock maintainers could build level-4 packages without needing to add the compiler flag explicitly, but they would be forced to explicitly list the appropriate runtime dependency in their recipe, which could be annoying:

requirements:
  build:
    - x86_64-microarch-level {{ microarch_level }}  # [unix and x86_64]
    - ppc64le-microarch-level {{ microarch_level }}  # [unix and ppc64le]
  run:
    - _x86_64-microarch-level >={{ microarch_level }} # [unix and x86_64]
    - _ppc64le-microarch-level >={{ microarch_level }} # [unix and ppc64le]

Metadata

Metadata

Assignees

No one assigned

    Labels

    questionFurther information is requested

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions