
Commit

Merge branch 'development' into development
shinkle-lanl committed Sep 19, 2024
2 parents e904108 + 144c160 commit a8e405b
Showing 57 changed files with 2,136 additions and 477 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/build.yml
Original file line number Diff line number Diff line change
@@ -18,7 +18,7 @@ jobs:

- name: Install dependencies
run: >-
python -m pip install --user --upgrade setuptools wheel
python -m pip install --user --upgrade setuptools wheel build
- name: Build
run: >-
python setup.py sdist bdist_wheel
python -m build
4 changes: 2 additions & 2 deletions .github/workflows/deploy.yml
@@ -27,10 +27,10 @@ jobs:

- name: Install dependencies
run: >-
python -m pip install --user --upgrade setuptools wheel
python -m pip install --user --upgrade setuptools wheel build
- name: Build
run: >-
python setup.py sdist bdist_wheel
python -m build
- name: Publish distribution 📦 to PyPI
if: startsWith(github.event.ref, 'refs/tags') || github.event_name == 'release'
uses: pypa/gh-action-pypi-publish@release/v1
4 changes: 3 additions & 1 deletion AUTHORS.txt
@@ -19,7 +19,7 @@ Emily Shinkle (LANL)
Michael G. Taylor (LANL)
Jan Janssen (LANL)
Cagri Kaymak (LANL)
Shuhao Zhang (CMU, LANL)
Shuhao Zhang (CMU, LANL) - Batched Optimization routines

Also thanks to testing and feedback from:

@@ -36,3 +36,5 @@ David Rosenberger
Michael Tynes
Drew Rohskopf
Neil Mehta
Alice E A Allen

44 changes: 40 additions & 4 deletions CHANGELOG.rst
@@ -3,23 +3,59 @@
Breaking changes:
-----------------

- ``set_e0_values`` has been renamed to ``hierarchical_energy_initialization``.
The old name is still provided but deprecated, and will be removed.
- The argument ``restore_db`` has been renamed to ``restart_db``. The affected
functions are ``load_checkpoint``, ``load_checkpoint_from_cwd``, and
``restore_checkpoint``.
- ``database.make_trainvalidtest_split`` now only takes keyword arguments to
avoid confusion. Use ``make_trainvalidtest_split(test_size=a, valid_size=b)``
instead of ``make_trainvalidtest_split(a, b)``.
- Invalid custom kernel specifications are now errors rather than warnings.
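The reasoning behind the ``make_trainvalidtest_split`` change can be sketched in plain Python. The function below is a hypothetical illustration of the keyword-only pattern, not hippynn's actual implementation:

```python
# Hypothetical sketch of a keyword-only split signature (not hippynn's code).
# The bare `*` forces callers to name each fraction, so the two sizes can
# never be silently swapped by passing them positionally.
def make_trainvalidtest_split(*, test_size, valid_size):
    """Return (train, valid, test) fractions; sizes must be passed by keyword."""
    if test_size + valid_size >= 1.0:
        raise ValueError("test_size + valid_size must leave room for training data")
    train_size = 1.0 - test_size - valid_size
    return train_size, valid_size, test_size
```

With this signature, a legacy call such as ``make_trainvalidtest_split(0.1, 0.2)`` fails fast with a ``TypeError`` instead of assigning the fractions in an unintended order.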


New Features:
-------------

- Added a new custom cuda kernel implementation using triton. These are highly performant and now the default implementation.
- Exporting a database to NPZ or H5 format after preprocessing is now just a function call away.
- SNAPjson format can now support an optional number of comment lines.
- Added Batch optimizer features in order to optimize geometries in parallel on the GPU. Algorithms include FIRE and BFGS.
- Added a new custom cuda kernel implementation using triton.
These are highly performant and now the default implementation.
- Exporting any database to NPZ or H5 format after preprocessing can be done with a method call.
- Database states can be cached to disk to simplify the restarting of training.
- Added batch geometry optimizer features in order to optimize geometries
in parallel on the GPU. Algorithms include FIRE, Newton-Raphson, and BFGS.
- Added experimental pytorch lightning trainer to provide simple parallelized training.
- Added a molecular dynamics engine which includes the ability to batch over systems.
- Added examples pertaining to coarse graining.
- Added pair finders based on scipy KDTree for training on large systems.
- Added a tool to drastically simplify creating ensemble models. The ensemblized graphs
are compatible with molecular dynamics codes such as ASE and LAMMPS.
- Added the ability to weight different systems/atoms/bonds in a loss function.
- Added new function to reload library settings.
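The idea behind the batched geometry optimizer listed above is that many independent systems are relaxed simultaneously with vectorized array operations, so one device call advances every system at once. The toy sketch below uses plain gradient descent on batched quadratic "energies"; it is illustrative only — hippynn's actual optimizers (FIRE, Newton-Raphson, BFGS) are more sophisticated:

```python
import numpy as np

# Toy batched relaxation: E_b(x) = 0.5 * k_b * |x_b - x0_b|^2 for each
# system b, minimized for all systems at once via vectorized gradients.
def relax_batch(x, x0, k, lr=0.1, steps=200):
    for _ in range(steps):
        grad = k[:, None] * (x - x0)  # gradient for every system in one array op
        x = x - lr * grad
    return x

rng = np.random.default_rng(0)
x0 = rng.normal(size=(8, 3))          # 8 systems, 3 coordinates each
k = rng.uniform(0.5, 2.0, size=8)     # per-system force constants
x_start = x0 + rng.normal(size=(8, 3))  # perturbed starting geometries
relaxed = relax_batch(x_start, x0, k)
```

Every geometry converges back to its own minimum without any per-system Python loop over optimization steps, which is the property that makes GPU batching pay off.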


Improvements:
-------------

- Eliminated dependency on pyanitools for loading ANI-style H5 datasets.
- SNAPjson format can now support an optional number of comment lines.
- Added unit conversion options to the LAMMPS interface.
- Improved performance of bond order regression.
- It is now possible to limit the memory usage of the MLIAP interface in LAMMPS
using a library setting.
- Provide tunable regularization of HIP-NN-TS with an epsilon parameter, and
set the default to use a better value for epsilon.
- Improved detection of valid custom kernel implementations.
- Improved computational efficiency of HIP-NN-TS network.



Bug Fixes:
----------

- Fixed bug where custom kernels were not launching properly on non-default GPUs.
- Fixed error when LAMMPS interface is in kokkos mode and the kokkos device was set to CPU.
- MLIAPInterface objects
- Fixed bug with RDF computer automatic initialization.

0.0.3
=======
10 changes: 10 additions & 0 deletions COPYRIGHT.txt
@@ -0,0 +1,10 @@

Copyright 2019. Triad National Security, LLC. All rights reserved.
This program was produced under U.S. Government contract 89233218CNA000001 for Los Alamos
National Laboratory (LANL), which is operated by Triad National Security, LLC for the U.S.
Department of Energy/National Nuclear Security Administration. All rights in the program are
reserved by Triad National Security, LLC, and the U.S. Department of Energy/National Nuclear
Security Administration. The Government is granted for itself and others acting on its behalf a
nonexclusive, paid-up, irrevocable worldwide license in this material to reproduce, prepare
derivative works, distribute copies to the public, perform publicly and display publicly, and to permit
others to do so.
10 changes: 0 additions & 10 deletions LICENSE.txt
@@ -1,15 +1,5 @@


Copyright 2019. Triad National Security, LLC. All rights reserved.
This program was produced under U.S. Government contract 89233218CNA000001 for Los Alamos
National Laboratory (LANL), which is operated by Triad National Security, LLC for the U.S.
Department of Energy/National Nuclear Security Administration. All rights in the program are
reserved by Triad National Security, LLC, and the U.S. Department of Energy/National Nuclear
Security Administration. The Government is granted for itself and others acting on its behalf a
nonexclusive, paid-up, irrevocable worldwide license in this material to reproduce, prepare
derivative works, distribute copies to the public, perform publicly and display publicly, and to permit
others to do so.

This program is open source under the BSD-3 License.
Redistribution and use in source and binary forms, with or without modification, are permitted
provided that the following conditions are met:
1 change: 1 addition & 0 deletions README.rst
@@ -106,6 +106,7 @@ The Journal of chemical physics, 148(24), 241715.
See AUTHORS.txt for information on authors.

See LICENSE.txt for licensing information. hippynn is licensed under the BSD-3 license.
See COPYRIGHT.txt for copyright information.

Triad National Security, LLC (Triad) owns the copyright to hippynn, which it identifies as project number LA-CC-19-093.

1 change: 1 addition & 0 deletions conda_requirements.txt
@@ -8,3 +8,4 @@ ase
h5py
tqdm
python-graphviz
lightning
11 changes: 7 additions & 4 deletions docs/source/conf.py
@@ -19,10 +19,11 @@

project = "hippynn"
copyright = "2019, Los Alamos National Laboratory"
author = "Nicholas Lubbers"
author = "Nicholas Lubbers et al"

# The full version, including alpha/beta/rc tags
import hippynn

release = hippynn.__version__

# -- General configuration ---------------------------------------------------
@@ -31,7 +32,6 @@
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ["sphinx.ext.autodoc", "sphinx_rtd_theme", "sphinx.ext.viewcode"]
add_module_names = False

# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]
@@ -45,10 +45,13 @@
"no-show-inheritance": True,
"special-members": "__init__",
}
autodoc_member_order = "bysource"

# The following are highly optional, so we mock them for doc purposes.
autodoc_mock_imports = ["pyanitools", "seqm", "schnetpack", "cupy", "lammps", "numba"]

# The following are highly optional, so we mock them for doc purposes.
# TODO: Can we programmatically get these from our list of optional dependencies?
autodoc_mock_imports = ["ase", "h5py", "seqm", "schnetpack", "cupy", "lammps", "numba", "triton", "pytorch_lightning", 'scipy']
add_module_names = False

# -- Options for HTML output -------------------------------------------------

1 change: 0 additions & 1 deletion docs/source/examples/controller.rst
@@ -1,7 +1,6 @@
Controller
==========


How to define a controller for more customized control of the training process.
We assume that there is a set of ``training_modules`` assembled and a ``database`` object has been constructed.

6 changes: 3 additions & 3 deletions docs/source/examples/index.rst
@@ -3,8 +3,8 @@ Examples

Here are some examples about how to use various features in
``hippynn``. Besides the :doc:`/examples/minimal_workflow` example,
the examples are just snippets. For runnable example scripts, see
`the examples at the hippynn github repository`_
the examples are just snippets, rather than full scripts.
For runnable example scripts, see `the examples at the hippynn github repository`_

.. _`the examples at the hippynn github repository`: https://github.com/lanl/hippynn/tree/development/examples

@@ -23,5 +23,5 @@ the examples are just snippets. For runnable example scripts, see
mliap_unified
excited_states
weighted_loss

lightning

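The newly listed ``weighted_loss`` example concerns weighting different samples in a loss function. The underlying idea can be sketched independently of hippynn's node/graph loss system; the helper below is purely illustrative:

```python
import numpy as np

# Illustrative sample-weighted mean squared error (not hippynn's loss API):
# each sample contributes to the loss in proportion to its weight, which
# lets important systems/atoms dominate training without duplicating data.
def weighted_mse(pred, target, weights):
    weights = np.asarray(weights, dtype=float)
    sq_err = (np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)) ** 2
    return float((weights * sq_err).sum() / weights.sum())
```

Setting all weights equal recovers the ordinary mean squared error, so weighting is a strict generalization of the unweighted loss.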
20 changes: 20 additions & 0 deletions docs/source/examples/lightning.rst
@@ -0,0 +1,20 @@
Pytorch Lightning module
========================


Hippynn includes support for distributed training using `pytorch-lightning`_.
This can be accessed using the :class:`hippynn.experiment.HippynnLightningModule` class.
The class has two class-methods for creating the lightning module using the same
types of arguments that would be used for an ordinary hippynn experiment.
These are :meth:`hippynn.experiment.HippynnLightningModule.from_experiment_setup`
and :meth:`hippynn.experiment.HippynnLightningModule.from_train_setup`.
Alternatively, you may construct and supply the arguments for the module yourself.

Finally, in addition to the usual pytorch lightning arguments,
the hippynn lightning module saves an additional file, `experiment_structure.pt`,
which needs to be provided as an argument to the
:meth:`hippynn.experiment.HippynnLightningModule.load_from_checkpoint` constructor.


.. _pytorch-lightning: https://github.com/Lightning-AI/pytorch-lightning

2 changes: 1 addition & 1 deletion docs/source/examples/mliap_unified.rst
@@ -11,7 +11,7 @@ species atomic symbols (whose order must agree with the order of the training hy

Example::

bundle = load_checkpoint_from_cwd(map_location="cpu", restore_db=False)
bundle = load_checkpoint_from_cwd(map_location="cpu", restart_db=False)
model = bundle["training_modules"].model
energy_node = model.node_from_name("HEnergy")
unified = MLIAPInterface(energy_node, ["Al"], model_device=torch.device("cuda"))
8 changes: 8 additions & 0 deletions docs/source/examples/restarting.rst
@@ -43,6 +43,14 @@ or to use the default filenames and load from the current directory::
check = load_checkpoint_from_cwd()
train_model(**check, callbacks=None, batch_callbacks=None)

.. note::
In release 0.0.4, the ``restore_db`` argument has been renamed to
``restart_db`` for internal consistency. ``restore_db`` in all scripts using
`hippynn > 0.0.3` should be replaced with ``restart_db``. The affected
functions are ``load_checkpoint``, ``load_checkpoint_from_cwd``, and
``restore_checkpoint``. If `hippynn <= 0.0.3` is used, please keep the
original ``restore_db`` keyword.
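A script that must run against hippynn versions on both sides of the rename could wrap the loader with a small compatibility shim. The helper below is hypothetical (not part of hippynn), and ``load_fn`` stands in for any of the affected functions:

```python
import warnings

# Hypothetical compatibility shim for the restore_db -> restart_db rename.
# Tries the new keyword first; on TypeError (unexpected keyword argument on
# hippynn <= 0.0.3), retries with the old spelling. A real shim might
# inspect the TypeError message before falling back.
def call_with_restart_db(load_fn, *args, restart_db=False, **kwargs):
    try:
        return load_fn(*args, restart_db=restart_db, **kwargs)
    except TypeError:
        warnings.warn("falling back to the pre-0.0.4 'restore_db' keyword")
        return load_fn(*args, restore_db=restart_db, **kwargs)
```

This keeps one call site in the script while tolerating either keyword spelling at runtime.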

If all you want to do is use a previously trained model, here is how to load the model only::

from hippynn.experiment.serialization import load_model_from_cwd
40 changes: 31 additions & 9 deletions docs/source/index.rst
@@ -8,31 +8,53 @@ We hope you enjoy your stay.
What is hippynn?
================

`hippynn` is a python library for machine learning on atomistic systems.
``hippynn`` is a python library for machine learning on atomistic systems
using `pytorch`_.
We aim to provide high-performance modular design so that different
components can be re-used, extended, or added to. You can find more information
at the :doc:`/user_guide/features` page. The development home is located
at `the hippynn github repository`_, which also contains `many example files`_
about overall library features at the :doc:`/user_guide/features` page.
The development home is located at `the hippynn github repository`_, which also contains `many example files`_.
Additionally, the :doc:`user guide </user_guide/index>` aims to describe abstract
aspects of the library, while the
:doc:`examples documentation section </examples/index>` aims to show
more concretely how to perform tasks with hippynn. Finally, the
:doc:`api documentation </api_documentation/hippynn>` contains a comprehensive
listing of the library components and their documentation.

The main components of hippynn are constructing models, loading databases,
training the models to those databases, making predictions on new databases,
and interfacing with other atomistic codes. In particular, we provide interfaces
to `ASE`_ (prediction), `PYSEQM`_ (training/prediction), and `LAMMPS`_ (prediction).
and interfacing with other atomistic codes for operations such as molecular dynamics.
In particular, we provide interfaces to `ASE`_ (prediction),
`PYSEQM`_ (training/prediction), and `LAMMPS`_ (prediction).
hippynn is also used within `ALF`_ for generating machine learned potentials
along with their training data completely from scratch.

Multiple formats for training data are supported, including
Numpy arrays, the ASE Database, `fitSNAP`_ JSON format, and `ANI HDF5 files`_.
Multiple :doc:`database formats </user_guide/databases>` for training data are supported, including
Numpy arrays, `ASE`_-compatible formats, `FitSNAP`_ JSON format, and `ANI HDF5 files`_.

``hippynn`` includes many tools, such as an :doc:`ASE calculator</examples/ase_calculator>`,
a :doc:`LAMMPS MLIAP interface</examples/mliap_unified>`,
:doc:`batched prediction </examples/predictor>` and batched geometry optimization,
:doc:`automatic ensemble creation </examples/ensembles>`,
:doc:`restarting training from checkpoints </examples/restarting>`,
:doc:`sample-weighted loss functions </examples/weighted_loss>`,
:doc:`distributed training with pytorch lightning </examples/lightning>`,
and more.

``hippynn`` is highly modular, and if you are a model developer, interfacing your
pytorch model into the hippynn node/graph system will make it simple and easy for users
to build models of energy, charge, bond order, excited state energies, and more.

.. _`ASE`: https://wiki.fysik.dtu.dk/ase/
.. _`PYSEQM`: https://github.com/lanl/PYSEQM/
.. _`LAMMPS`: https://www.lammps.org
.. _`fitSNAP`: https://github.com/FitSNAP/FitSNAP
.. _`FitSNAP`: https://github.com/FitSNAP/FitSNAP
.. _`ANI HDF5 files`: https://doi.org/10.1038/s41597-020-0473-z
.. _`ALF`: https://github.com/lanl/ALF/

.. _`the hippynn github repository`: https://github.com/lanl/hippynn/
.. _`the hippynn github repository`: https://github.com/lanl/hippynn/
.. _`many example files`: https://github.com/lanl/hippynn/tree/development/examples
.. _`pytorch`: https://pytorch.org


.. toctree::
19 changes: 11 additions & 8 deletions docs/source/installation.rst
@@ -2,24 +2,25 @@ Installation
============



Requirements
^^^^^^^^^^^^

Requirements:
* Python_ >= 3.9
* pytorch_ >= 1.9
* numpy_

Optional Dependencies:
* triton_ (recommended, for improved GPU performance)
* numba_ (recommended for improved CPU performance)
* cupy_ (Alternative for accelerating GPU performance)
* ASE_ (for usage with ase)
* cupy_ (alternative for accelerating GPU performance)
* ASE_ (for usage with ase and other misc. features)
* matplotlib_ (for plotting)
* tqdm_ (for progress bars)
* graphviz_ (for viewing model graphs as figures)
* graphviz_ (for visualizing model graphs)
* h5py_ (for loading ani-h5 datasets)
* pyanitools_ (for loading ani-h5 datasets)
* pytorch-lightning_ (for distributed training)

Interfacing codes:
* ASE_
@@ -40,6 +41,8 @@ Interfacing codes:
.. _ASE: https://wiki.fysik.dtu.dk/ase/
.. _LAMMPS: https://www.lammps.org/
.. _PYSEQM: https://github.com/lanl/PYSEQM
.. _pytorch-lightning: https://github.com/Lightning-AI/pytorch-lightning
.. _hippynn: https://github.com/lanl/hippynn/


Installation Instructions
@@ -65,11 +68,7 @@ Clone the hippynn_ repository and navigate into it, e.g.::
$ git clone https://github.com/lanl/hippynn.git
$ cd hippynn

.. _hippynn: https://github.com/lanl/hippynn/

.. note::
If you wish to do a cpu-only install, you may need to comment
out ``cupy`` from the conda_requirements.txt file.

Dependencies using conda
........................
@@ -78,6 +77,10 @@ Install dependencies from conda using recommended channels::

$ conda install -c pytorch -c conda-forge --file conda_requirements.txt

.. note::
If you wish to do a cpu-only install, you may need to comment
out ``cupy`` from the conda_requirements.txt file.

Dependencies using pip
.......................

2 changes: 1 addition & 1 deletion docs/source/user_guide/ckernels.rst
@@ -60,7 +60,7 @@ The three custom kernels correspond to the interaction sum in hip-nn:

.. math::
a'_{i,a} = = \sum_{\nu,b} V^\nu_{a,b} e^{\nu}_{i,b}
a'_{i,a} = \sum_{\nu,b} V^\nu_{a,b} e^{\nu}_{i,b}
e^{\nu}_{i,a} = \sum_p s^\nu_{p} z_{p_j,a}
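The corrected interaction sum can be cross-checked against a dense reference contraction. The sketch below uses `numpy.einsum` with arbitrary small shapes; it illustrates the mathematics only — hippynn's custom kernels compute the same contraction in a fused, sparsity-aware way:

```python
import numpy as np

# Dense-reference sketch of the first interaction sum,
#   a'_{i,a} = sum_{nu,b} V^nu_{a,b} e^nu_{i,b},
# over sensitivities nu, atoms i, and features a, b.
n_nu, n_feat, n_atoms = 4, 5, 6
V = np.random.default_rng(1).normal(size=(n_nu, n_feat, n_feat))   # V^nu_{a,b}
e = np.random.default_rng(2).normal(size=(n_nu, n_atoms, n_feat))  # e^nu_{i,b}
a_prime = np.einsum("vab,vib->ia", V, e)                           # a'_{i,a}
```

The einsum subscripts mirror the index pattern of the formula: `v` (the sensitivity index nu) and `b` are summed out, leaving an array indexed by atom `i` and feature `a`.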
5 changes: 3 additions & 2 deletions docs/source/user_guide/concepts.rst
@@ -45,8 +45,9 @@ Graphs

A :class:`~hippynn.graphs.GraphModule` is a 'compiled' set of nodes; a ``torch.nn.Module`` that executes the graph.

GraphModules are used in a number of places within hippynn.

GraphModules are used in a number of places within hippynn:
the model, the loss, the evaluator, the predictor, the ASE interface,
and the LAMMPS interface objects all use GraphModules.

Experiment
^^^^^^^^^^