Merge pull request #202 from smathermather/hpc

OpenDroneMap · Sep 27, 2024 · f174246
2 parents dcdaed8 + 7306905
commit f174246
Showing 1 changed file with 90 additions and 154 deletions.
diff --git a/source/tutorials.rst b/source/tutorials.rst
@@ -295,54 +295,6 @@ Cleaning up after Docker
 
 Docker has a lamentable use of space and by default does not clean up excess data and machines when processes are complete. This can be advantageous if we need to access a process that has since terminated, but carries the burden of using increasing amounts of storage over time. Maciej Łebkowski has an `excellent overview of how to manage excess disk usage in docker <https://lebkowski.name/docker-volumes/>`_.
 
-*****************
-Using Singularity
-*****************
-
-`Singularity <https://sylabs.io/>`__ is another container platform able to run Docker images. 
-Singularity can be run both on local machins and in instances where the user does not have root access. 
-Instances where a user may not have root privlidges include HPC clusters and cloud cluster resources.
-A container is a single file without anything else to install.
-
-Build Singularity image from Docker image
-=========================================
-Singularity can use Docker image to build SIF image.
-
-For latest ODM Docker image (Recommended) :
-
-.. code:: bash
-
-   singularity build --disable-cache -f odm_latest.sif docker://opendronemap/odm:latest
-
-For latest ODM GPU Docker image :
-
-.. code:: bash
-   
-   singularity build --disable-cache -f odm_gpu.sif docker://opendronemap/odm:gpu
-
-Using Singularity SIF image
-===========================
-
-
-Once you have used one of the above commands to download and create the `odm_latest.sif` image, it can be ran using singularity. 
-Place your images in a directory named “images” (for example /my/project/images) , then simply run :
-
-.. code:: bash
-
-   singularity run --bind /my/project:/datasets/code odm_latest.sif --project-path /datasets
-
-Like with docker, additional `Options and Flags <https://docs.opendronemap.org/arguments/>`_ can be added to the command :
-
-.. code:: bash
-
-   singularity run --bind /my/project:/datasets/code \
-   --writable-tmpfs odm_latest.sif \
-   --orthophoto-png --mesh-octree-depth 12 --ignore-gsd --dtm \
-   --smrf-threshold 0.4 --smrf-window 24 --dsm --pc-csv --pc-las --orthophoto-kmz \
-   --ignore-gsd --matcher-type flann --feature-quality ultra --max-concurrency 16 \
-   --use-hybrid-bundle-adjustment --build-overviews --time --min-num-features 10000 \
-   --project-path /datasets
-
 *************************************
 Using ODM from low-bandwidth location
 *************************************
@@ -869,176 +821,160 @@ For instance, point clouds properties can be modified to show elevation and also
 
 `Learn to edit <https://github.com/opendronemap/docs#how-to-make-your-first-contribution>`_ and help improve `this page <https://github.com/OpenDroneMap/docs/blob/publish/source/tutorials.rst>`_!
 
+*****************
+Using Singularity
+*****************
 
-***************************************************
-ClusterODM, NodeODM, SLURM, with Singularity on HPC
-***************************************************
+`Singularity <https://sylabs.io/>`__ is another container platform able to run Docker images.
+Singularity can be run both on local machines and in environments where the user does not have root access.
+Environments where a user may not have root privileges include HPC clusters and cloud cluster resources.
+A Singularity container is a single file, with nothing else to install.
 
-Let's say that we will get ClusterODM and NodeODM images in the same folder
+Build Singularity image from Docker image
+=========================================
+Singularity can build a SIF image from a Docker image.
 
-Downloading and installing the images
-=====================================
+For the latest ODM Docker image (recommended):
 
-In this example ClusterODM and NodeODM will be installed in $HOME/git
+.. code:: bash
 
-ClusterODM
-----------
+   singularity build --disable-cache -f odm_latest.sif docker://opendronemap/odm:latest
 
-::
+For the latest ODM GPU Docker image:
 
-   cd $HOME/git
-   git clone https://github.com/OpenDroneMap/ClusterODM
-   cd ClusterODM
-   singularity pull --force --disable-cache docker://opendronemap/clusterodm:latest
+.. code:: bash
+   
+   singularity build --disable-cache -f odm_gpu.sif docker://opendronemap/odm:gpu
 
-ClusterODM image needs to be "installed"
-::
+Using Singularity SIF image
+===========================
 
-   singularity shell --bind $PWD:/var/www clusterodm_latest.sif`
 
-And then in the Singularity shell
-::
+Once you have used one of the above commands to download and create the ``odm_latest.sif`` image, it can be run using Singularity.
+Place your images in a directory named “images” (for example ``/my/project/images``), then run:
 
-   cd /var/www
-   npm install --production
-   exit
+.. code:: bash
 
-NodeODM
--------
+   singularity run --bind /my/project:/datasets/code odm_latest.sif --project-path /datasets
 
-::
+As with Docker, additional `Options and Flags <https://docs.opendronemap.org/arguments/>`_ can be added to the command:
 
-   cd $HOME/git
-   git clone https://github.com/OpenDroneMap/NodeODM
-   cd NodeODMDM
-   singularity pull --force --disable-cache docker://opendronemap/nodeodm:latest
+.. code:: bash
 
-NodeODM image needs to be "installed"
-::
+   singularity run --bind /my/project:/datasets/code \
+   --writable-tmpfs odm_latest.sif \
+   --orthophoto-png --mesh-octree-depth 12 --ignore-gsd --dtm \
+   --smrf-threshold 0.4 --smrf-window 24 --dsm --pc-csv --pc-las --orthophoto-kmz \
+   --matcher-type flann --feature-quality ultra --max-concurrency 16 \
+   --use-hybrid-bundle-adjustment --build-overviews --time --min-num-features 10000 \
+   --project-path /datasets
 
-   singularity shell --bind $PWD:/var/www nodeodm_latest.sif
 
-And then in the Singularity shell
-::
+***************************************************
+ClusterODM, NodeODM, SLURM, with Singularity on HPC
+***************************************************
 
-   cd /var/www
-   npm install --production
-   exit
 
+On an HPC cluster, you can write a SLURM script that schedules the available nodes and starts NodeODM on each of them, ready to be wired to ClusterODM. Using SLURM reduces the time and the number of manual steps needed to set up nodes for ClusterODM each time, which makes ODM easier to use on the HPC.
 
+Before setting up the HPC with SLURM, make sure SLURM is installed.
 
+The SLURM script differs from cluster to cluster, depending on which nodes are available to you. The main idea, however, is to run NodeODM once on each node; by default, each NodeODM instance listens on port 3000. Apptainer takes available ports starting from port 3000, so if port 3000 is open on a node, NodeODM will by default run on that port. After that, run ClusterODM on the head node and connect the running NodeODM instances to it. The result is a functional ClusterODM running on the HPC.
 
-Launching
-=========
-On two different terminals connected to the HPC , or with tmux (or screen...) a slurm script will start NodeODM instances.
-Then ClusterODM could be started
+Here is an example SLURM script that assigns nodes 48, 50, and 51 to run NodeODM. Adapt it freely to your system:
 
-NodeODM
--------
-Create a nodeodm.slurm script in $HOME/git/NodeODM with
 ::
 
-   #!/usr/bin/bash
-   #source .bashrc
-
+	#!/usr/bin/bash
+	#source .bashrc
+	#SBATCH --partition=8core
+	#SBATCH --nodelist=node[48,50,51]
+	#SBATCH --time 20:00:00
 
-   #SBATCH -J NodeODM
-   #SBATCH --partition=ncpulong,ncpu
-   #SBATCH --nodes=2
-   #SBATCH --mem=10G
-   #SBATCH --output logs_nodeodm-%j.out
+	cd $HOME
+	cd ODM/NodeODM/
 
-   cd $HOME/git/NodeODM
+	#Launch on node 48
+	srun --nodes=1 apptainer run --writable node/ &
 
-   #Launched on first node
-   srun --nodes=1 singularity run --bind $PWD:/var/www  nodeodm_latest.sif $
+	#Launch on node 50
+	srun --nodes=1 apptainer run --writable node/ &
 
-   #Launch on second node
+	#Launch on node 51
+	srun --nodes=1 apptainer run --writable node/ &
+	wait
 
-   srun --nodes=1 singularity run --bind $PWD:/var/www  nodeodm_latest.sif $
 
-   wait
+You can check for available nodes using sinfo:
 
-start this script with
 ::
 
-   sbatch $HOME/git/NodeODM/nodeodm.slurm
+	sinfo
 
-logs of this script are written in $HOME/git/NodeODM/logs_nodeodm-XXX.out XXX is the slurm job number
+Submit the SLURM script to the scheduler with the following command:
 
+::
 
+	sbatch sample.slurm
 
-ClusterODM
-----------
-Then you can start ClusterODM on the head node with
+
+You can also check for currently running jobs using squeue:
 
 ::
 
-   cd $HOME/git/ClusterODM
-   singularity run --bind $PWD:/var/www  clusterodm_latest.sif
+	squeue -u $USER
+
+
+Unfortunately, SLURM does not handle assigning jobs to the head node, so to run ClusterODM on the head node you have to run it locally on the head node itself. After that, you can connect to the ClusterODM CLI and wire the NodeODM instances to ClusterODM. Here is an example following the sample SLURM script:
 
-Connecting Nodes to ClusterODM
-==============================
-Use the following command to get the nodes names where NodeODM is running
 ::
 
-   squeue -u $USER
+	telnet localhost 8080
+	> NODE ADD node48 3000
+	> NODE ADD node50 3000
+	> NODE ADD node51 3000
+	> NODE LIST
 
-   ex : squeue -u $USER
-                JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
-               1829323      ncpu  NodeODM  bonaime  R      24:19      2 ncpu[015-016]
 
-   In this case, NodeODM run on ncpu015 and ncpu016
+If ClusterODM is not wired correctly, check which ports are actually being used to run NodeODM.
+
+It is also possible to pre-populate nodes using JSON. If you start ClusterODM from Apptainer or Docker, the relevant JSON is available at ``docker/data/nodes.json``. Its contents might look similar to the following:
 
-Web interface
--------------
-ClusterODM administrative web interface could be used to wire NodeODMs to the ClusterODM.
-Open another shell window in your local machine and tunnel them to the HPC using the following command:
 ::
 
-   ssh -L localhost:10000:localhost:10000 yourusername@hpc-address
-Replace yourusername and hpc-address with your appropriate username and the hpc address.
+	[
+		{"hostname":"node48","port":"3000","token":""},
+		{"hostname":"node50","port":"3000","token":""},
+		{"hostname":"node51","port":"3000","token":""}
+	]
 
-Basically, this command will tunnel the port of the hpc to your local port.
-After this, open a browser in your local machine and connect to http://localhost:10000.
-Port 10000 is where ClusterODM's administrative web interface is hosted at.
-Then NodeODMs could be add/deleted to ClusterODM
-This is what it looks like :
 
-.. figure:: images/clusterodm-admin-interface.png
-   :alt: Clusterodm admin interface
-   :align: center
+Once ClusterODM is running on the head node and wired to the NodeODM instances, you can use SSH tunneling to check that ClusterODM works as expected. Open another shell window on your local machine and tunnel to the HPC using the following command:
 
+::
 
+	ssh -L localhost:10000:localhost:10000 user@hostname
 
-telnet
-------
-You can connect to the ClusterODM CLI and wire the NodeODMs. For the previous example :
 
-telnet localhost 8080
-> NODE ADD ncpu015 3000
-> NODE ADD ncpu016 3000
-> NODE LIST
+Replace ``user`` and ``hostname`` with your username and the HPC address. This command forwards port 10000 of the HPC to the same port on your local machine. After this, open a browser on your local machine and go to ``http://localhost:10000``. Port 10000 is where ClusterODM's administrative web interface is hosted. This is what it looks like:
 
+.. figure:: https://user-images.githubusercontent.com/70782465/214938402-707bee90-ea17-4573-82f8-74096d9caf03.png
+   :alt: Screenshot of ClusterODM's administrative web interface
+   :align: center
 
 
+Here you can check the status of the NodeODM instances and even add or delete working nodes.
 
-Using ClusterODM and its NodeODMs
-=================================
+After that, tunnel port 3000 of the HPC to your local machine:
 
-Open another shell window in your local machine and tunnel them to the HPC using the following command:
 ::
 
-   ssh -L localhost:10000:localhost:10000 yourusername@hpc-address
-Replace yourusername and hpc-address with your appropriate username and the hpc address.
+	ssh -L localhost:3000:localhost:3000 user@hostname
 
-After this, open a browser in your local machine and connect to  http://localhost:3000 with your browser
-Here, you can Assign Tasks and observe the tasks' processes.
+Port 3000 is ClusterODM's proxy, which is where tasks are assigned to ClusterODM. Once again, connect to ``http://localhost:3000`` with your browser after tunneling. Here, you can assign tasks and observe their progress.
 
-.. figure:: images/clusterodm-user-interface.png
-   :alt: Clusterodm user interface
+.. figure:: https://user-images.githubusercontent.com/70782465/214938234-113f99dc-f69e-4e78-a782-deaf94e986b0.png
+   :alt: Screenshot of ClusterODM's jobs interface
    :align: center
 
-
-
 After adding images in this browser, you can press Start Task and see ClusterODM assigning tasks to the nodes you have wired to. Go for a walk and check the progress.