Add HIP configuration for CSCS CI #846

Merged
merged 7 commits into from
Jan 9, 2024

@aurianer aurianer commented Nov 3, 2023

Partially fixes #32.

Clang segfaulted when building the dependencies with [email protected], so I have switched to [email protected].
https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/479009878135925/5304355110917878/-/jobs/5651543294

Table from #938 summarizing the results with [email protected], apart from [email protected], which conflicts with gcc@12 (spack/spack#42064):

| HIP version \ stdexec version | 5e378418 | [email protected] |
|---|---|---|
| 5.2 | ✔️ logs | one test timeout |
| 5.3.3 | error logs | error logs |
| 5.5 | ✔️ logs | ✔️ logs |
| 5.6 | ✔️ logs | ✔️ logs |

@aurianer aurianer added the category: CI Continuous Integration label Nov 3, 2023
@aurianer aurianer added this to the 0.21.0 milestone Nov 3, 2023
@aurianer aurianer self-assigned this Nov 3, 2023
@aurianer aurianer force-pushed the add_hip_configuration_in_cscsci branch 2 times, most recently from 0d7709f to 2164a42 Compare November 3, 2023 21:50
@aurianer aurianer force-pushed the add_hip_configuration_in_cscsci branch from 2164a42 to 07b0ad8 Compare November 15, 2023 21:28
@aurianer

cscs-ci run

@pika-bot

Performance test report

pika Performance

Comparison

| BENCHMARK | NO-EXECUTOR |
|---|---|
| Future Overhead - Create Thread Hierarchical - Latch | - |

Info

| Property | Before | After |
|---|---|---|
| pika Commit | 02f9de8 | 29a4939 |
| pika Datetime | 2023-08-21T11:44:55+00:00 | 2023-11-15T21:48:22+00:00 |
| Clustername | daint | daint |
| Envfile | | |
| Hostname | nid01181 | nid01193 |
| Datetime | 2023-08-21T13:50:51.685166+02:00 | 2023-11-15T22:54:36.237400+01:00 |
| Compiler | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 |

Explanation of Symbols

| Symbol | MEANING |
|---|---|
| = | No performance change (confidence interval within ±1%) |
| (=) | Probably no performance change (confidence interval within ±2%) |
| (+)/(-) | Very small performance improvement/degradation (≤1%) |
| +/- | Small performance improvement/degradation (>10%) |
| ++/-- | Large performance improvement/degradation (>10%) |
| +++/--- | Very large performance improvement/degradation (>10%) |
| ? | Probably no change, but quite large uncertainty (confidence interval with ±5%) |
| ?? | Unclear result, very large uncertainty (±10%) |
| ??? | Something unexpected… |

@aurianer aurianer force-pushed the add_hip_configuration_in_cscsci branch 4 times, most recently from 0df7aba to c053ac7 Compare November 19, 2023 20:38
@aurianer

cscs-ci run

@pika-bot

Performance test report

pika Performance

Comparison

| BENCHMARK | NO-EXECUTOR |
|---|---|
| Future Overhead - Create Thread Hierarchical - Latch | - |

Info

| Property | Before | After |
|---|---|---|
| pika Datetime | 2023-08-21T11:44:55+00:00 | 2023-11-19T20:38:13+00:00 |
| pika Commit | 02f9de8 | 22a4f81 |
| Envfile | | |
| Compiler | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 |
| Hostname | nid01181 | nid01180 |
| Clustername | daint | daint |
| Datetime | 2023-08-21T13:50:51.685166+02:00 | 2023-11-19T21:45:28.389789+01:00 |

@aurianer aurianer force-pushed the add_hip_configuration_in_cscsci branch from c053ac7 to d5de6da Compare November 20, 2023 10:52
@aurianer

cscs-ci run

@pika-bot

Performance test report

pika Performance

Comparison

| BENCHMARK | NO-EXECUTOR |
|---|---|
| Future Overhead - Create Thread Hierarchical - Latch | - |

Info

| Property | Before | After |
|---|---|---|
| pika Commit | 02f9de8 | a52d000 |
| pika Datetime | 2023-08-21T11:44:55+00:00 | 2023-11-20T11:45:13+00:00 |
| Envfile | | |
| Hostname | nid01181 | nid01260 |
| Clustername | daint | daint |
| Compiler | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 |
| Datetime | 2023-08-21T13:50:51.685166+02:00 | 2023-11-20T12:51:12.542937+01:00 |

@aurianer aurianer force-pushed the add_hip_configuration_in_cscsci branch from 05c7d3b to a93586c Compare November 21, 2023 07:47
@aurianer

The 2-hour timeout was not enough for the step that installs the spack dependencies, so I increased it to 4 hours.
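
For readers unfamiliar with the mechanism, GitLab CI sets a per-job limit with the `timeout` keyword. The following is only a minimal sketch of such a change, not the actual content of `.gitlab/includes/hip_pipeline.yml`: the job name and the script line are hypothetical placeholders, and the CSCS CI setup may configure the limit differently.

```yaml
# Hypothetical job: the name and script are placeholders, not taken from this PR.
# Only the `timeout` keyword illustrates the change described above.
hip_spack_dependencies:
  timeout: 4 hours   # raised from 2 hours
  script:
    - echo "install the spack dependencies here"   # placeholder command
```

Note that a job-level `timeout` cannot exceed the maximum timeout configured on the runner, so a longer value only takes effect if the runner allows it.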

@aurianer

cscs-ci run

@aurianer aurianer force-pushed the add_hip_configuration_in_cscsci branch 2 times, most recently from 53e7255 to db3a415 Compare November 30, 2023 09:47
@pika-bot

Performance test report

pika Performance

Comparison

| BENCHMARK | NO-EXECUTOR |
|---|---|
| Future Overhead - Create Thread Hierarchical - Latch | - |

Info

| Property | Before | After |
|---|---|---|
| pika Datetime | 2023-08-21T11:44:55+00:00 | 2023-11-30T09:47:59+00:00 |
| pika Commit | 02f9de8 | 5396e8d |
| Clustername | daint | daint |
| Compiler | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 |
| Envfile | | |
| Hostname | nid01181 | nid01260 |
| Datetime | 2023-08-21T13:50:51.685166+02:00 | 2023-11-30T11:00:06.160544+01:00 |

@aurianer aurianer force-pushed the add_hip_configuration_in_cscsci branch 2 times, most recently from 1e2d48c to 8072868 Compare December 1, 2023 13:20

aurianer commented Dec 1, 2023

cscs-ci run

pika-bot commented Dec 1, 2023

Performance test report

pika Performance

Comparison

| BENCHMARK | NO-EXECUTOR |
|---|---|
| Future Overhead - Create Thread Hierarchical - Latch | - |

Info

| Property | Before | After |
|---|---|---|
| pika Commit | 02f9de8 | 6fa0cdb |
| pika Datetime | 2023-08-21T11:44:55+00:00 | 2023-12-01T13:20:28+00:00 |
| Datetime | 2023-08-21T13:50:51.685166+02:00 | 2023-12-01T14:25:57.152233+01:00 |
| Envfile | | |
| Compiler | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 |
| Hostname | nid01181 | nid00025 |
| Clustername | daint | daint |

aurianer commented Jan 5, 2024

cscs-ci run

@msimberg msimberg left a comment

Just a request to check for consistency with other configurations, otherwise this looks good. Could you also slightly clean up the git history, at least the Tmp! commits? And we obviously need to wait for #899 to be merged first.

.gitlab/includes/hip_pipeline.yml: 4 outdated review threads (resolved)

msimberg commented Jan 8, 2024

Would you mind summarizing what configurations you've tried and what worked/didn't work? Is it the stdexec update that fixed compilation or the HIP version change, or a combination?

@aurianer aurianer force-pushed the add_hip_configuration_in_cscsci branch from a7e71a8 to 36c41f6 Compare January 8, 2024 21:38

pika-bot commented Jan 8, 2024

Performance test report

pika Performance

Comparison

| BENCHMARK | RESULT |
|---|---|
| Task Overhead - Create Thread Hierarchical - Latch | - |

Info

| Property | Before | After |
|---|---|---|
| pika Commit | 02f9de8 | 5fc5ae8 |
| pika Datetime | 2023-08-21T11:44:55+00:00 | 2024-01-08T21:38:34+00:00 |
| Envfile | | |
| Datetime | 2023-08-21T13:50:51.685166+02:00 | 2024-01-08T22:48:03.262462+01:00 |
| Compiler | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 |
| Clustername | daint | daint |
| Hostname | nid01181 | nid00025 |

aurianer commented Jan 8, 2024

> Would you mind summarizing what configurations you've tried and what worked/didn't work? Is it the stdexec update that fixed compilation or the HIP version change, or a combination?

HIP 5.5 made things work. I only updated to the new nvhpc-23.09.rc4 tag because we said we would use this one from now on. Yes, summarizing what I tried is a good idea. I will do that, I just need to find the error messages in CI again. I tested HIP 5.3.3 and HIP 5.6. Would you like me to try other versions? I can also try the nvhpc tag of stdexec with the failing HIP versions, because I don't remember trying that out.

@aurianer aurianer force-pushed the add_hip_configuration_in_cscsci branch from 36c41f6 to 9ab105e Compare January 9, 2024 09:11

msimberg commented Jan 9, 2024

> HIP 5.5 made things work. I only updated to the new nvhpc-23.09.rc4 tag because we said we would use this one from now on. Yes, summarizing what I tried is a good idea. I will do that, I just need to find the error messages in CI again. I tested HIP 5.3.3 and HIP 5.6. Would you like me to try other versions? I can also try the nvhpc tag of stdexec with the failing HIP versions, because I don't remember trying that out.

Thanks, that's already sufficient for this PR. I was mainly curious to know if the old or new version of stdexec made a difference, but it sounds like that's not the case. Further version testing can be done outside of this PR.

@msimberg msimberg left a comment

Thanks!

msimberg commented Jan 9, 2024

cscs-ci run

aurianer commented Jan 9, 2024

All of these versions of HIP were tested with [email protected]

| HIP version | Status |
|---|---|
| 5.2 | |
| 5.3.3 | |
| 5.5 | ✔️ |
| 5.6 | |

@aurianer aurianer force-pushed the add_hip_configuration_in_cscsci branch from 9ab105e to 9e49ebc Compare January 9, 2024 13:04

aurianer commented Jan 9, 2024

cscs-ci run

@aurianer aurianer force-pushed the add_hip_configuration_in_cscsci branch from 9e49ebc to 4ffbe04 Compare January 9, 2024 17:02

aurianer commented Jan 9, 2024

cscs-ci run

pika-bot commented Jan 9, 2024

Performance test report

pika Performance

Comparison

| BENCHMARK | RESULT |
|---|---|
| Task Overhead - Create Thread Hierarchical - Latch | - |

Info

| Property | Before | After |
|---|---|---|
| pika Commit | 02f9de8 | 1b3e643 |
| pika Datetime | 2023-08-21T11:44:55+00:00 | 2024-01-09T17:02:12+00:00 |
| Compiler | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 | /apps/daint/SSL/pika/spack/lib/spack/env/clang/clang++ 11.0.1 |
| Datetime | 2023-08-21T13:50:51.685166+02:00 | 2024-01-09T18:10:41.506484+01:00 |
| Clustername | daint | daint |
| Envfile | | |
| Hostname | nid01181 | nid00456 |

@msimberg msimberg added this pull request to the merge queue Jan 9, 2024
Merged via the queue into pika-org:main with commit f15d6c1 Jan 9, 2024
64 of 66 checks passed
@msimberg msimberg added this to the 0.22.0 milestone Jan 19, 2024