Fixes for clang16, gcc14.2, HIP/AMD #1006

Merged
merged 29 commits into from
Sep 19, 2024

Commits on Sep 17, 2024

  1. [amd] in gg_tt.mad and CODEGEN, fix cudacpp.mk to find the correct path to libamdhip64 madgraph5#998
    
    Also fix the LUMI setup to solve a second issue (move from 23.09 to 24.03)
      module load LUMI/24.03 partition/G
      module load cpeGNU/24.03
      export CC="cc --cray-bypass-pkgconfig -craype-verbose"
      export CXX="CC --cray-bypass-pkgconfig -craype-verbose"
      export FC="ftn --cray-bypass-pkgconfig -craype-verbose -ffixed-line-length-132"
    
    (I checked that gg_tt.mad is regenerated as expected)
    valassi committed Sep 17, 2024
    8562165
  2. 64249d2
  3. [amd] in tput/allTees.sh clarify that -cpponly and -nocuda exist while -hip is no longer available

    valassi committed Sep 17, 2024
    98dfbad
  4. [amd] in tput/allTees.sh, on second thought add back -hip, but make this identical to -nocuda for the moment (common random)

    valassi committed Sep 17, 2024
    c9a9ad9
  5. [amd] rerun 96 tput tests on LUMI - many issues at build time and at runtime
    
    (1) Build tests on login node (~2h)
    
    ./tput/allTees.sh -makeonly
    
    STARTED  AT Mon 16 Sep 2024 08:41:05 PM EEST
    ./tput/teeThroughputX.sh -mix -hrd -makej -eemumu -ggtt -ggttg -ggttgg -gqttq -ggttggg -makeclean  -makeonly
    ENDED(1) AT Mon 16 Sep 2024 09:17:11 PM EEST [Status=1]
    ./tput/teeThroughputX.sh -flt -hrd -makej -eemumu -ggtt -ggttgg -inlonly -makeclean  -makeonly
    ENDED(2) AT Mon 16 Sep 2024 09:30:48 PM EEST [Status=0]
    ./tput/teeThroughputX.sh -makej -eemumu -ggtt -ggttg -gqttq -ggttgg -ggttggg -flt -bridge -makeclean  -makeonly
    ENDED(3) AT Mon 16 Sep 2024 09:33:43 PM EEST [Status=1]
    ./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -rmbhst  -makeonly
    ENDED(4) AT Mon 16 Sep 2024 09:33:51 PM EEST [Status=0]
    ./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -curhst  -makeonly
    ENDED(5) AT Mon 16 Sep 2024 09:34:00 PM EEST [Status=0]
    ./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -common  -makeonly
    ENDED(6) AT Mon 16 Sep 2024 09:34:09 PM EEST [Status=0]
    ./tput/teeThroughputX.sh -mix -hrd -makej -susyggtt -susyggt1t1 -smeftggtttt -heftggbb -makeclean  -makeonly
    ENDED(7) AT Mon 16 Sep 2024 09:59:55 PM EEST [Status=0]
    
    (2) Run tests on worker nodes (~1h)
    
    ./tput/allTees.sh -hip
    
    STARTED  AT Tue 17 Sep 2024 08:35:08 AM EEST
    ./tput/teeThroughputX.sh -mix -hrd -makej -eemumu -ggtt -ggttg -ggttgg -gqttq -ggttggg -makeclean  -nocuda
    ENDED(1) AT Tue 17 Sep 2024 09:08:52 AM EEST [Status=2]
    ./tput/teeThroughputX.sh -flt -hrd -makej -eemumu -ggtt -ggttgg -inlonly -makeclean  -nocuda
    ENDED(2) AT Tue 17 Sep 2024 09:12:28 AM EEST [Status=2]
    ./tput/teeThroughputX.sh -makej -eemumu -ggtt -ggttg -gqttq -ggttgg -ggttggg -flt -bridge -makeclean  -nocuda
    ENDED(3) AT Tue 17 Sep 2024 09:18:56 AM EEST [Status=2]
    ./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -rmbhst  -nocuda
    ENDED(4) AT Tue 17 Sep 2024 09:19:30 AM EEST [Status=2]
    SKIP './tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -common  -nocuda'
    ENDED(5) AT Tue 17 Sep 2024 09:19:30 AM EEST [Status=0]
    ./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -common  -nocuda
    ENDED(6) AT Tue 17 Sep 2024 09:20:03 AM EEST [Status=2]
    ./tput/teeThroughputX.sh -mix -hrd -makej -susyggtt -susyggt1t1 -smeftggtttt -heftggbb -makeclean  -nocuda
    ENDED(7) AT Tue 17 Sep 2024 09:26:15 AM EEST [Status=2]
    valassi committed Sep 17, 2024
    d12d08a
  6. [amd] revert 96 tput logs on LUMI

    Revert "[amd] rerun 96 tput tests on LUMI - many issues at build time and at runtime"
    This reverts commit d12d08a.
    valassi committed Sep 17, 2024
    734af0c
  7. [amd] in tput/throughputX.sh expose FPE crash madgraph5#1003 on HIP and improve error handling

    valassi committed Sep 17, 2024
    32f1cb9
  8. [amd] in gg_tt.mad cudacpp.mk, try to work around the HIP crashes madgraph5#1003 by disabling SIMD in C++ objects for HIP builds - it does not help, will revert

    valassi committed Sep 17, 2024
    2f8e348
  9. [amd] in gg_tt.mad cudacpp.mk, revert the previous commit (1)

    Revert "[amd] in gg_tt.mad cudacpp.mk, try to work around the HIP crashes madgraph5#1003 by disabling SIMD in C++ objects for HIP builds - it does not help, will revert"
    This reverts commit 2fc102767ecc6ae2e95770f4cff18e5c08d31fc1.
    valassi committed Sep 17, 2024
    b39b0e4
  10. [amd] in gg_tt.mad cudacpp.mk, try to work around HIP crashes madgraph5#1003 by disabling SIMD in C++ objects built with hipcc - it also does not help, will revert

    valassi committed Sep 17, 2024
    f6cca64
  11. [amd] in gg_tt.mad cudacpp.mk, revert the previous commit (2)

    Revert "[amd] in gg_tt.mad cudacpp.mk, try to work around HIP crashes madgraph5#1003 by disabling SIMD in C++ objects built with hipcc - it also does not help, will revert"
    This reverts commit 1e225fd7068eb0c67377f55c7e910af945a4d963.
    valassi committed Sep 17, 2024
    4983634
  12. [amd] in gg_tt.mad EventStatistics.h, try to work around HIP crashes madgraph5#1003 by adding volatile - it does not work, will revert

    valassi committed Sep 17, 2024
    3c8f9ca
  13. [amd] in gg_tt.mad EventStatistics.h, revert the previous commit (1)

    Revert "[amd] in gg_tt.mad EventStatistics.h, try to work around HIP crashes madgraph5#1003 by adding volatile - it does not work, will revert"
    This reverts commit e2591da7b159b6d133a7cff7a4b583a8ad34d563.
    valassi committed Sep 17, 2024
    0a4e76d
  14. [amd] in gg_tt.mad EventStatistics.h, work around HIP crashes madgraph5#1003 by printing out sum.nevtOK() - this avoids the crash but is not practical, will revert

    valassi committed Sep 17, 2024
    f456668
  15. [amd] in gg_tt.mad EventStatistics.h, revert the previous commit (2)

    Revert "[amd] in gg_tt.mad EventStatistics.h, work around HIP crashes madgraph5#1003 by printing out sum.nevtOK() - this avoids the crash but is not practical, will revert"
    This reverts commit 725dae88d89a61d005a0031c9462fe95f4ec6728.
    valassi committed Sep 17, 2024
    a415cd7
  16. [amd] in gg_tt.mad and CODEGEN EventStatistics.h, work around FPE crash

    madgraph5#1003 on hipcc by disabling optimizations for operator+=
    valassi committed Sep 17, 2024
    055795d
  17. 11bb959
  18. d07dcbf
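
Commit 16 above works around the FPE crash by disabling optimizations for a single function, `operator+=`, rather than for the whole build. The following is only a minimal sketch of how that can be expressed with clang's `optnone` function attribute. The macro `SKETCH_OPTNONE` and the struct `StatsSketch` are hypothetical names invented for this illustration, and the preprocessor guard on `__clang__` is an assumption: the guard and the class layout in the real EventStatistics.h may differ.

```cpp
#include <cassert>

// SKETCH_OPTNONE is a hypothetical macro used only in this sketch.
// clang-based compilers define __clang__ and accept the optnone attribute;
// for any other compiler the macro expands to nothing.
#ifdef __clang__
#define SKETCH_OPTNONE __attribute__( ( optnone ) )
#else
#define SKETCH_OPTNONE
#endif

// Simplified stand-in for an event-statistics accumulator (not the real class)
struct StatsSketch
{
  int nevtALL = 0;  // all events seen
  int nevtABN = 0;  // events flagged as abnormal
  double sumOK = 0; // sum of the values from events that are not abnormal

  int nevtOK() const { return nevtALL - nevtABN; }

  // Guard against a division by zero when no good events were accumulated
  double meanOK() const { return nevtOK() > 0 ? sumOK / nevtOK() : 0; }

  // Merge another accumulator into this one.
  // On clang this one function is compiled without optimizations.
  SKETCH_OPTNONE StatsSketch& operator+=( const StatsSketch& other )
  {
    nevtALL += other.nevtALL;
    nevtABN += other.nevtABN;
    sumOK += other.sumOK;
    return *this;
  }
};
```

Restricting the attribute to one function keeps the rest of the translation unit fully optimized, which may be why this approach was kept while the build-wide SIMD changes in commits 8 and 10 were reverted. The commit messages do not state the root cause of the crash, so this sketch shows only the mechanism, not an explanation.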

Commits on Sep 18, 2024

  1. [gcc14] in gg_tt.mad and CODEGEN mgOnGpuVectors.h, distinguish between const and non-const operator[] in cxtype_v (fix build error madgraph5#1004 on gcc14.2)

    valassi committed Sep 18, 2024
    79f45b6
  2. [gcc14] in gg_tt.mad and CODEGEN mgOnGpuCxtypes.h, clarify that cxtype_ref is a const reference to two non-const fp variables

    valassi committed Sep 18, 2024
    70f40cc
  3. [clang] in gg_tt.mad and CODEGEN EventStatistics.h, work around FPE crash madgraph5#1005 on clang16 by disabling optimizations for operator+=
    
    This extends the previous workaround for madgraph5#1003, which had been defined only for HIP clang, to any clang
    valassi committed Sep 18, 2024
    a3b6ab4
  4. da50539
  5. [clang] rerun 102 tput tests on itscrd90 - all ok

    STARTED  AT Wed Sep 18 10:03:30 AM CEST 2024
    ./tput/teeThroughputX.sh -mix -hrd -makej -eemumu -ggtt -ggttg -ggttgg -gqttq -ggttggg -makeclean
    ENDED(1) AT Wed Sep 18 12:28:45 PM CEST 2024 [Status=0]
    ./tput/teeThroughputX.sh -flt -hrd -makej -eemumu -ggtt -ggttgg -inlonly -makeclean
    ENDED(2) AT Wed Sep 18 12:49:20 PM CEST 2024 [Status=0]
    ./tput/teeThroughputX.sh -makej -eemumu -ggtt -ggttg -gqttq -ggttgg -ggttggg -flt -bridge -makeclean
    ENDED(3) AT Wed Sep 18 12:58:32 PM CEST 2024 [Status=0]
    ./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -rmbhst
    ENDED(4) AT Wed Sep 18 01:01:21 PM CEST 2024 [Status=0]
    ./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -curhst
    ENDED(5) AT Wed Sep 18 01:04:08 PM CEST 2024 [Status=0]
    ./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -common
    ENDED(6) AT Wed Sep 18 01:07:00 PM CEST 2024 [Status=0]
    ./tput/teeThroughputX.sh -mix -hrd -makej -susyggtt -susyggt1t1 -smeftggtttt -heftggbb -makeclean
    ENDED(7) AT Wed Sep 18 01:38:02 PM CEST 2024 [Status=0]
    valassi committed Sep 18, 2024
    a726ec7
  6. [clang] ** COMPLETE CLANG ** rerun 30 tmad tests on itscrd90 - all as expected
    
    STARTED  AT Wed Sep 18 01:38:02 PM CEST 2024
    (SM tests)
    ENDED(1) AT Wed Sep 18 05:31:59 PM CEST 2024 [Status=0]
    (BSM tests)
    ENDED(1) AT Wed Sep 18 05:42:22 PM CEST 2024 [Status=0]
    
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_d_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_f_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_m_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_f_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_m_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_d_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_f_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_m_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_d_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_m_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_d_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_f_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_m_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_d_inl0_hrd0.txt
    1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_f_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_m_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_d_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_f_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_m_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_d_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_f_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_m_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_d_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_f_inl0_hrd0.txt
    24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_m_inl0_hrd0.txt
    valassi committed Sep 18, 2024
    dbbadab
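
The two gcc14 commits in the list above (1 and 2) concern element access in `cxtype_v` through `operator[]` and the `cxtype_ref` proxy it hands out. The sketch below illustrates the general pattern under stated assumptions; it is not the code from mgOnGpuVectors.h or mgOnGpuCxtypes.h. The names `CxRefSketch` and `CxVecSketch` are hypothetical, plain arrays stand in for compiler vector types, and returning a value from the const overload is an assumption, since the commit message says only that the const and non-const overloads are distinguished.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>

using fptype = double;
using cxtype = std::complex<fptype>;

// Proxy for one complex element stored as two separate floating-point slots.
// The pointers are const (the proxy always refers to the same two slots),
// while the pointed-to values are non-const and can be written through it.
class CxRefSketch
{
public:
  CxRefSketch( fptype& r, fptype& i ) : m_preal( &r ), m_pimag( &i ) {}
  // Assign a complex value into the two referenced slots
  CxRefSketch& operator=( const cxtype& c )
  {
    *m_preal = c.real();
    *m_pimag = c.imag();
    return *this;
  }
  // Read the two referenced slots back as a complex value
  operator cxtype() const { return cxtype( *m_preal, *m_pimag ); }
private:
  fptype* const m_preal; // const pointer to non-const real part
  fptype* const m_pimag; // const pointer to non-const imaginary part
};

// Structure-of-arrays complex vector with separate real and imaginary storage
class CxVecSketch
{
public:
  static constexpr std::size_t N = 4;
  // Non-const access: return a writable proxy to element i
  CxRefSketch operator[]( std::size_t i ) { return CxRefSketch( m_real[i], m_imag[i] ); }
  // Const access: return a copy, so a const vector cannot be modified
  cxtype operator[]( std::size_t i ) const { return cxtype( m_real[i], m_imag[i] ); }
private:
  fptype m_real[N] = {};
  fptype m_imag[N] = {};
};
```

With this split, `v[1] = cxtype( 2., 3. )` writes through the proxy on a non-const vector, while reading `cv[1]` on a const reference yields a plain `cxtype` and never exposes a non-const `fptype&` to const data.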

Commits on Sep 19, 2024

  1. [amd] rerun 96 tput builds and tests on LUMI worker node (small-g 72h) - all as expected
    
    STARTED  AT Wed 18 Sep 2024 03:07:46 PM EEST
    ./tput/teeThroughputX.sh -mix -hrd -makej -eemumu -ggtt -ggttg -ggttgg -gqttq -ggttggg -makeclean  -nocuda
    ENDED(1) AT Wed 18 Sep 2024 05:21:56 PM EEST [Status=2]
    ./tput/teeThroughputX.sh -flt -hrd -makej -eemumu -ggtt -ggttgg -inlonly -makeclean  -nocuda
    ENDED(2) AT Wed 18 Sep 2024 06:00:57 PM EEST [Status=0]
    ./tput/teeThroughputX.sh -makej -eemumu -ggtt -ggttg -gqttq -ggttgg -ggttggg -flt -bridge -makeclean  -nocuda
    ENDED(3) AT Wed 18 Sep 2024 06:09:10 PM EEST [Status=2]
    ./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -rmbhst  -nocuda
    ENDED(4) AT Wed 18 Sep 2024 06:11:01 PM EEST [Status=0]
    SKIP './tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -common  -nocuda'
    ENDED(5) AT Wed 18 Sep 2024 06:11:01 PM EEST [Status=0]
    ./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -common  -nocuda
    ENDED(6) AT Wed 18 Sep 2024 06:12:50 PM EEST [Status=0]
    ./tput/teeThroughputX.sh -mix -hrd -makej -susyggtt -susyggt1t1 -smeftggtttt -heftggbb -makeclean  -nocuda
    ENDED(7) AT Wed 18 Sep 2024 07:30:15 PM EEST [Status=0]
    
    ./tput/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0_bridge.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_d_inl0_hrd0/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0_bridge.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_d_inl0_hrd0/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0_bridge.txt:ERROR! C++ calculation (C++/GPU) failed
    ./tput/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_f_inl0_hrd0/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_f_inl0_hrd0/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0.txt:ERROR! C++ calculation (C++/GPU) failed
    ./tput/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0_bridge.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_f_inl0_hrd0/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0_bridge.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_f_inl0_hrd0/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0_bridge.txt:ERROR! C++ calculation (C++/GPU) failed
    ./tput/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd0.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_m_inl0_hrd0/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd0.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_m_inl0_hrd0/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd0.txt:ERROR! C++ calculation (C++/GPU) failed
    ./tput/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_d_inl0_hrd0/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_d_inl0_hrd0/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt:ERROR! C++ calculation (C++/GPU) failed
    ./tput/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd1.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_m_inl0_hrd1/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd1.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_m_inl0_hrd1/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd1.txt:ERROR! C++ calculation (C++/GPU) failed
    ./tput/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd1.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_d_inl0_hrd1/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd1.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_d_inl0_hrd1/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd1.txt:ERROR! C++ calculation (C++/GPU) failed
    ./tput/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd1.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_f_inl0_hrd1/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd1.txt:/users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gux_ttxux/build.hip_f_inl0_hrd1/check_hip.exe: Segmentation fault
    ./tput/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd1.txt:ERROR! C++ calculation (C++/GPU) failed
    valassi committed Sep 19, 2024
    0c947d1
  2. [amd] in gq_ttq.mad and CODEGEN cudacpp.mk add optional debug flags for rocgdb on HIP (to debug the memory fault madgraph5#806)

    valassi committed Sep 19, 2024
    1b29900
  3. 0ce5d06
  4. [amd] rerun 30 tmad tests on LUMI against AMD GPUs - all as expected (heft fail madgraph5#833, skip ggttggg madgraph5#933, gqttq crash madgraph5#806)
    
    STARTED  AT Wed 18 Sep 2024 09:02:01 PM EEST
    (SM tests)
    ENDED(1) AT Wed 18 Sep 2024 11:40:09 PM EEST [Status=0]
    (BSM tests)
    ENDED(1) AT Wed 18 Sep 2024 11:48:33 PM EEST [Status=0]
    
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_d_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_f_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_m_inl0_hrd0.txt
    12 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inl0_hrd0.txt
    12 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_f_inl0_hrd0.txt
    12 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_m_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_d_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_f_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_m_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_d_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_m_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_d_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_f_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_m_inl0_hrd0.txt
    12 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt
    12 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0.txt
    12 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_d_inl0_hrd0.txt
    1 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_f_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_m_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_d_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_f_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_m_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_d_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_f_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_m_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_d_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_f_inl0_hrd0.txt
    16 /users/valassia/GPU2024/madgraph4gpu/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_m_inl0_hrd0.txt
    valassi committed Sep 19, 2024
    458b834
  5. [amd] ** COMPLETE AMD ** revert to itscrd90 logs for tput/tmad tests

    Revert "[amd] rerun 30 tmad tests on LUMI against AMD GPUs - all as expected (heft fail madgraph5#833, skip ggttggg madgraph5#933, gqttq crash madgraph5#806)"
    This reverts commit 458b834.
    
    Revert "[amd] rerun 96 tput builds and tests on LUMI worker node (small-g 72h) - all as expected"
    This reverts commit 0c947d1.
    valassi committed Sep 19, 2024
    74608a4