Check duplicate issues.
Description
On a large node (127 cores, 128 GB), I ran:
1. ctest -j 32
2. ctest --rerun-failed
3. ctest -j 32
After 1., many tests failed due to lack of resources (running out of threads, see #16552):
47:PyMVA-Keras-Classification
348:PyMVA-Keras-Regression
349:PyMVA-Keras-Multiclass
350:gtest-tmva-pymva-test-TestRModelParserKeras
984:tutorial-tmva-TMVA_SOFIE_GNN_Application
985:tutorial-tmva-TMVA_SOFIE_Keras
986:tutorial-tmva-TMVA_SOFIE_Keras_HiggsModel
988:tutorial-tmva-TMVA_SOFIE_RDataFrame
990:tutorial-tmva-TMVA_SOFIE_RSofieReader
1238:tutorial-tmva-RBatchGenerator_PyTorch-py
1239:tutorial-tmva-RBatchGenerator_TensorFlow-py
1246:tutorial-tmva-TMVA_SOFIE_Models-py
1247:tutorial-tmva-TMVA_SOFIE_RDataFrame-py
1252:tutorial-tmva-keras-GenerateModel-py
1253:tutorial-tmva-keras-MulticlassKeras-py
However, in 2., several tests still failed (even though resources were no longer an issue):
350:gtest-tmva-pymva-test-TestRModelParserKeras
984:tutorial-tmva-TMVA_SOFIE_GNN_Application
986:tutorial-tmva-TMVA_SOFIE_Keras_HiggsModel
988:tutorial-tmva-TMVA_SOFIE_RDataFrame
990:tutorial-tmva-TMVA_SOFIE_RSofieReader
1247:tutorial-tmva-TMVA_SOFIE_RDataFrame-py
The errors listed there included:
IncrementalExecutor::executeFunction: symbol 'saxpy_' unresolved while linking [cling interface function]!
IncrementalExecutor::executeFunction: symbol 'sgemm_' unresolved while linking [cling interface function]!
tutorials/tmva/TMVA_SOFIE_RDataFrame.C:29:10: fatal error: 'Higgs_trained_model.hxx' file not found
/tutorials/tmva/TMVA_SOFIE_GNN_Application.C:10:10: fatal error: 'encoder.hxx' file not found
From this I conclude that those tests (in particular TMVA_SOFIE_RDataFrame.C and tutorials/tmva/TMVA_SOFIE_GNN_Application.C) are missing dependencies on tests that failed in the first run.
Note that tutorial-tmva-TMVA_SOFIE_Keras_HiggsModel and tutorial-tmva-TMVA_SOFIE_RDataFrame-py do need TMVA_Higgs_Classification.C to run first (it says so in the output! :) ).
tutorial-tmva-TMVA_SOFIE_RSofieReader is asking for Higgs_trained_model.h5
gtest-tmva-pymva-test-TestRModelParserKeras is missing the symbol sgemm_ (see below)
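One way to test the missing-dependency hypothesis is to check whether the generated files named in the errors and notes above are present where the tests run. The following is only a sketch: the function name `missing_artifacts` is hypothetical, and which directory to inspect is an assumption (it depends on where the tutorials write their output).

```shell
# Hypothetical helper: print which of the generated files mentioned in the
# errors above are absent from the given directory (default: current directory).
missing_artifacts() {
    dir=${1:-.}
    for f in Higgs_trained_model.hxx encoder.hxx Higgs_trained_model.h5; do
        # Report the file name only when it does not exist in $dir.
        [ -e "$dir/$f" ] || printf '%s\n' "$f"
    done
}

# Assumed usage, from the directory where the tutorial tests are executed:
#   missing_artifacts .
```

If a file is still reported as missing after the test that should produce it has passed, the ordering between the two tests is probably not declared.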
However, when rerunning the whole suite (step 3), where this time there were somehow no resource-related failures, I still got several failures:
346:gtest-tmva-pymva-test-TestRModelParserPyTorch
350:gtest-tmva-pymva-test-TestRModelParserKeras
984:tutorial-tmva-TMVA_SOFIE_GNN_Application
988:tutorial-tmva-TMVA_SOFIE_RDataFrame
990:tutorial-tmva-TMVA_SOFIE_RSofieReader
all due to:
IncrementalExecutor::executeFunction: symbol 'sgemm_' unresolved while linking [cling interface function]!
or both
IncrementalExecutor::executeFunction: symbol 'saxpy_' unresolved while linking [cling interface function]!
IncrementalExecutor::executeFunction: symbol 'sgemm_' unresolved while linking [cling interface function]!
Could this be due either to a badly formed result left over from the failing run (1) or to an external package that does not have the correct version number?
Reproducer
ctest -j 32 # and get lots of out of resource failures
ctest --rerun-failed
ctest -j 32
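To see which failures persist from one step to the next, the failure lists can be compared by test name. The sketch below assumes each list has one `number:name` entry per line, as in the lists above, and that ctest leaves such a list in `Testing/Temporary/LastTestsFailed.log` in the build directory. The function name `persistent_failures` and the `failed_*.log` file names are hypothetical.

```shell
# Hypothetical helper: print the test names present in both failure lists.
# Each input line is expected to look like "988:tutorial-tmva-TMVA_SOFIE_RDataFrame".
persistent_failures() {
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || return 1
    # Drop the numeric prefix so that only the test names are compared.
    cut -d: -f2- "$1" | sort -u > "$tmp_a"
    cut -d: -f2- "$2" | sort -u > "$tmp_b"
    # comm -12 keeps only the lines common to both sorted files.
    comm -12 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}

# Assumed usage, from the build directory:
#   ctest -j 32;          cp Testing/Temporary/LastTestsFailed.log failed_1.log
#   ctest --rerun-failed; cp Testing/Temporary/LastTestsFailed.log failed_2.log
#   persistent_failures failed_1.log failed_2.log
```

Comparing by name rather than by number keeps the comparison valid even if the test numbering differs between lists.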
ROOT version
master
Installation method
hand build
Operating system
Alma9
Additional context
jupyter-pcanal-rootdevel:quick-devel pcanal$ bin/root-config --features
cxx17 asimage builtin_clang builtin_cling builtin_gtest builtin_llvm builtin_lz4 builtin_lzma builtin_nlohmannjson builtin_openui5 builtin_tbb builtin_vdt builtin_xxhash builtin_zlib builtin_zstd clad dataframe davix gdml http imt pyroot roofit root7 rpath runtime_cxxmodules shared sqlite ssl tmva tmva-pymva tpython spectrum vdt x11 xml xrootd