Add install instructions for ALCF's Polaris (#4636)
* add polaris machine files
* add doc page for Polaris
1 parent 206b081 · commit 9d8ecf9
Showing 5 changed files with 398 additions and 0 deletions.
@@ -0,0 +1,187 @@
.. _building-polaris:

Polaris (ALCF)
==============

The `Polaris cluster <https://docs.alcf.anl.gov/polaris/getting-started/>`__ is located at ALCF.


Introduction
------------

If you are new to this system, **please see the following resources**:

* `ALCF user guide <https://docs.alcf.anl.gov/>`__
* Batch system: `PBS <https://docs.alcf.anl.gov/running-jobs/job-and-queue-scheduling/>`__
* `Filesystems <https://docs.alcf.anl.gov/data-management/filesystem-and-storage/file-systems/>`__


.. _building-polaris-preparation:

Preparation
-----------

Use the following commands to download the WarpX source code:

.. code-block:: bash

   git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
On Polaris, you can run either on GPU nodes with fast A100 GPUs (recommended) or on CPU nodes.

.. tab-set::

   .. tab-item:: A100 GPUs

      We use system software modules and add environment hints and further dependencies via the file ``$HOME/polaris_gpu_warpx.profile``.
      Create it now:

      .. code-block:: bash

         cp $HOME/src/warpx/Tools/machines/polaris-alcf/polaris_gpu_warpx.profile.example $HOME/polaris_gpu_warpx.profile

      .. dropdown:: Script Details
         :color: light
         :icon: info
         :animate: fade-in-slide-down

         .. literalinclude:: ../../../../Tools/machines/polaris-alcf/polaris_gpu_warpx.profile.example
            :language: bash

      Edit the 2nd line of this script, which sets the ``export proj=""`` variable.
      For example, if you are a member of the project ``proj_name``, then run ``nano $HOME/polaris_gpu_warpx.profile`` and edit line 2 to read:

      .. code-block:: bash

         export proj="proj_name"

      Exit the ``nano`` editor with ``Ctrl`` + ``O`` (save) and then ``Ctrl`` + ``X`` (exit).
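
      Alternatively, the same edit can be scripted with ``sed`` (a minimal sketch; ``proj_name`` is a placeholder for your actual project name):

      .. code-block:: bash

         # non-interactive alternative to nano: fill in the empty proj="" assignment
         sed -i 's/export proj=""/export proj="proj_name"/' $HOME/polaris_gpu_warpx.profile
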
      .. important::

         Now, and as the first step on future logins to Polaris, activate these environment settings:

         .. code-block:: bash

            source $HOME/polaris_gpu_warpx.profile

      Finally, since Polaris does not yet provide software modules for some of our dependencies, install them once:

      .. code-block:: bash

         bash $HOME/src/warpx/Tools/machines/polaris-alcf/install_gpu_dependencies.sh
         source $HOME/sw/polaris/gpu/venvs/warpx/bin/activate

      .. dropdown:: Script Details
         :color: light
         :icon: info
         :animate: fade-in-slide-down

         .. literalinclude:: ../../../../Tools/machines/polaris-alcf/install_gpu_dependencies.sh
            :language: bash
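
      After the install script completes and the virtual environment is activated, a quick sanity check (a sketch based on the ``SW_DIR`` used in the script above) is to confirm that the venv's Python is found first:

      .. code-block:: bash

         # should print /home/<username>/sw/polaris/gpu/venvs/warpx/bin/python3
         which python3
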

   .. tab-item:: CPU Nodes

      *Under construction*


.. _building-polaris-compilation:

Compilation
-----------

Use the following :ref:`cmake commands <building-cmake>` to compile the application executable:

.. tab-set::

   .. tab-item:: A100 GPUs

      .. code-block:: bash

         cd $HOME/src/warpx
         rm -rf build_pm_gpu

         cmake -S . -B build_pm_gpu -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
         cmake --build build_pm_gpu -j 16

      The WarpX application executables are now in ``$HOME/src/warpx/build_pm_gpu/bin/``.
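
      To confirm the build succeeded, you can list the freshly built binaries (a quick check; one executable is produced per entry in ``WarpX_DIMS``):

      .. code-block:: bash

         ls $HOME/src/warpx/build_pm_gpu/bin/
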

      Additionally, the following commands will install WarpX as a Python module:

      .. code-block:: bash

         cd $HOME/src/warpx
         rm -rf build_pm_gpu_py

         cmake -S . -B build_pm_gpu_py -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_APP=OFF -DWarpX_PYTHON=ON -DWarpX_DIMS="1;2;RZ;3"
         cmake --build build_pm_gpu_py -j 16 --target pip_install
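
      A quick way to verify the Python install (a sketch; it assumes the ``warpx`` virtual environment from the preparation step is active):

      .. code-block:: bash

         # import the installed module and show where it was installed from
         python3 -c "import pywarpx; print(pywarpx.__path__)"
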

   .. tab-item:: CPU Nodes

      *Under construction*

Now, you can :ref:`submit Polaris compute jobs <running-cpp-polaris>` for WarpX :ref:`Python (PICMI) scripts <usage-picmi>` (:ref:`example scripts <usage-examples>`).
Or, you can use the WarpX executables to submit Polaris jobs (:ref:`example inputs <usage-examples>`).
For executables, you can reference their location in your :ref:`job script <running-cpp-polaris>` or copy them to a location on the Eagle filesystem.

.. _building-polaris-update:

Update WarpX & Dependencies
---------------------------

If you already installed WarpX in the past and want to update it, start by getting the latest source code:

.. code-block:: bash

   cd $HOME/src/warpx

   # read the output of this command - does it look ok?
   git status

   # get the latest WarpX source code
   git fetch
   git pull

   # read the output of these commands - do they look ok?
   git status
   git log  # press q to exit

And, if needed,

- :ref:`update the polaris_gpu_warpx.profile or polaris_cpu_warpx.profile files <building-polaris-preparation>`,
- log out and back into the system, activate the now updated environment profile as usual,
- :ref:`execute the dependency install scripts <building-polaris-preparation>`.

As a last step, clean the build directories ``rm -rf $HOME/src/warpx/build_pm_*`` and rebuild WarpX.
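
For example, for the GPU build (a sketch that simply repeats the commands from the Compilation section above):

.. code-block:: bash

   rm -rf $HOME/src/warpx/build_pm_*

   cd $HOME/src/warpx
   cmake -S . -B build_pm_gpu -DWarpX_COMPUTE=CUDA -DWarpX_PSATD=ON -DWarpX_QED_TABLE_GEN=ON -DWarpX_DIMS="1;2;RZ;3"
   cmake --build build_pm_gpu -j 16
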

.. _running-cpp-polaris:

Running
-------

.. tab-set::

   .. tab-item:: A100 (40GB) GPUs

      The batch script below can be used to run a WarpX simulation on multiple nodes (change ``<NODES>`` accordingly) on the supercomputer Polaris at ALCF.

      Replace descriptions between chevrons ``<>`` with relevant values, for instance ``<input file>`` could be ``plasma_mirror_inputs``.
      Note that we run one MPI rank per GPU.

      .. literalinclude:: ../../../../Tools/machines/polaris-alcf/polaris_gpu.pbs
         :language: bash
         :caption: You can copy this file from ``$HOME/src/warpx/Tools/machines/polaris-alcf/polaris_gpu.pbs``.
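
      For example, with ``<NODES>`` set to ``2``, the script's rank arithmetic works out as follows (values as computed in ``polaris_gpu.pbs``):

      .. code-block:: bash

         NNODES=2
         NRANKS_PER_NODE=4                          # one MPI rank per A100 GPU
         NTOTRANKS=$(( NNODES * NRANKS_PER_NODE ))  # = 8
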

      To run a simulation, copy the lines above to a file ``polaris_gpu.pbs`` and run

      .. code-block:: bash

         qsub polaris_gpu.pbs

      to submit the job.
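
      Once submitted, you can monitor and manage the job with the standard PBS commands (shown as a sketch; ``<jobid>`` is the ID printed by ``qsub``):

      .. code-block:: bash

         qstat -u $USER    # list your queued and running jobs
         qdel <jobid>      # cancel a job, if needed
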

   .. tab-item:: CPU Nodes

      *Under construction*

Tools/machines/polaris-alcf/install_gpu_dependencies.sh (123 additions, 0 deletions)
@@ -0,0 +1,123 @@
#!/bin/bash
#
# Copyright 2024 The WarpX Community
#
# This file is part of WarpX.
#
# Author: Axel Huebl (edited by Roelof Groenewald for Polaris)
# License: BSD-3-Clause-LBNL

# Exit on first error encountered #############################################
#
set -eu -o pipefail

# Check: ######################################################################
#
# Was polaris_gpu_warpx.profile sourced and configured correctly?
if [ -z "${proj-}" ]; then echo "WARNING: The 'proj' variable is not yet set in your polaris_gpu_warpx.profile file! Please edit its line 2 to continue!"; exit 1; fi

# Remove old dependencies #####################################################
#
SW_DIR="/home/${USER}/sw/polaris/gpu"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y pywarpx
python3 -m pip uninstall -qq -y warpx
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true

# General extra dependencies ##################################################
#

# c-blosc (I/O compression)
if [ -d $HOME/src/c-blosc ]
then
  cd $HOME/src/c-blosc
  git fetch --prune
  git checkout v1.21.1
  cd -
else
  git clone -b v1.21.1 https://github.com/Blosc/c-blosc.git $HOME/src/c-blosc
fi
rm -rf $HOME/src/c-blosc-pm-gpu-build
cmake -S $HOME/src/c-blosc -B $HOME/src/c-blosc-pm-gpu-build -DBUILD_TESTS=OFF -DBUILD_BENCHMARKS=OFF -DDEACTIVATE_AVX2=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/c-blosc-1.21.1
cmake --build $HOME/src/c-blosc-pm-gpu-build --target install --parallel 16
rm -rf $HOME/src/c-blosc-pm-gpu-build

# ADIOS2
if [ -d $HOME/src/adios2 ]
then
  cd $HOME/src/adios2
  git fetch --prune
  git checkout v2.8.3
  cd -
else
  git clone -b v2.8.3 https://github.com/ornladios/ADIOS2.git $HOME/src/adios2
fi
rm -rf $HOME/src/adios2-pm-gpu-build
cmake -S $HOME/src/adios2 -B $HOME/src/adios2-pm-gpu-build -DADIOS2_USE_Blosc=ON -DADIOS2_USE_Fortran=OFF -DADIOS2_USE_Python=OFF -DADIOS2_USE_ZeroMQ=OFF -DCMAKE_INSTALL_PREFIX=${SW_DIR}/adios2-2.8.3
cmake --build $HOME/src/adios2-pm-gpu-build --target install -j 16
rm -rf $HOME/src/adios2-pm-gpu-build

# BLAS++ (for PSATD+RZ)
if [ -d $HOME/src/blaspp ]
then
  cd $HOME/src/blaspp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/blaspp.git $HOME/src/blaspp
fi
rm -rf $HOME/src/blaspp-pm-gpu-build
CXX=$(which CC) cmake -S $HOME/src/blaspp -B $HOME/src/blaspp-pm-gpu-build -Duse_openmp=OFF -Dgpu_backend=cuda -DCMAKE_CXX_STANDARD=17 -DCMAKE_INSTALL_PREFIX=${SW_DIR}/blaspp-master
cmake --build $HOME/src/blaspp-pm-gpu-build --target install --parallel 16
rm -rf $HOME/src/blaspp-pm-gpu-build

# LAPACK++ (for PSATD+RZ)
if [ -d $HOME/src/lapackpp ]
then
  cd $HOME/src/lapackpp
  git fetch --prune
  git checkout master
  git pull
  cd -
else
  git clone https://github.com/icl-utk-edu/lapackpp.git $HOME/src/lapackpp
fi
rm -rf $HOME/src/lapackpp-pm-gpu-build
CXX=$(which CC) CXXFLAGS="-DLAPACK_FORTRAN_ADD_" cmake -S $HOME/src/lapackpp -B $HOME/src/lapackpp-pm-gpu-build -DCMAKE_CXX_STANDARD=17 -Dbuild_tests=OFF -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_PREFIX=${SW_DIR}/lapackpp-master
cmake --build $HOME/src/lapackpp-pm-gpu-build --target install --parallel 16
rm -rf $HOME/src/lapackpp-pm-gpu-build

# Python ######################################################################
#
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade virtualenv
python3 -m pip cache purge
rm -rf ${SW_DIR}/venvs/warpx
python3 -m venv --system-site-packages ${SW_DIR}/venvs/warpx
source ${SW_DIR}/venvs/warpx/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m pip install --upgrade packaging
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade setuptools
python3 -m pip install --upgrade cython
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
# MPICC="cc -target-accel=nvidia80 -shared" python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade yt
# install or update WarpX dependencies such as picmistandard
python3 -m pip install --upgrade -r $HOME/src/warpx/requirements.txt
python3 -m pip install cupy-cuda11x  # CUDA 11.7 compatible wheel
# optional: for libEnsemble
python3 -m pip install -r $HOME/src/warpx/Tools/LibEnsemble/requirements.txt
# optional: for optimas (based on libEnsemble & ax->botorch->gpytorch->pytorch)
python3 -m pip install --upgrade torch  # CUDA 11.7 compatible wheel
python3 -m pip install -r $HOME/src/warpx/Tools/optimas/requirements.txt

Tools/machines/polaris-alcf/polaris_gpu.pbs (36 additions, 0 deletions)

@@ -0,0 +1,36 @@
#!/bin/bash -l

#PBS -A <proj>
#PBS -l select=<NODES>:system=polaris
#PBS -l place=scatter
#PBS -l walltime=0:10:00
#PBS -l filesystems=home:eagle
#PBS -q debug
#PBS -N test_warpx

# Set required environment variables
# support gpu-aware-mpi
# export MPICH_GPU_SUPPORT_ENABLED=1

# Change to working directory
echo Working directory is $PBS_O_WORKDIR
cd ${PBS_O_WORKDIR}

echo Jobid: $PBS_JOBID
echo Running on host `hostname`
echo Running on nodes `cat $PBS_NODEFILE`

# executable & inputs file or python interpreter & PICMI script here
EXE=./warpx
INPUTS=input1d

# MPI and OpenMP settings
NNODES=`wc -l < $PBS_NODEFILE`
NRANKS_PER_NODE=4
NDEPTH=1
NTHREADS=1

NTOTRANKS=$(( NNODES * NRANKS_PER_NODE ))
echo "NUM_OF_NODES= ${NNODES} TOTAL_NUM_RANKS= ${NTOTRANKS} RANKS_PER_NODE= ${NRANKS_PER_NODE} THREADS_PER_RANK= ${NTHREADS}"

mpiexec -np ${NTOTRANKS} ${EXE} ${INPUTS} > output.txt