Squash and merge all AE related commits for GlueFL.

Squashed commit of the following:

commit 6495844
Author: Eric Yan <[email protected]>
Date:   Sat Apr 1 06:01:53 2023 +0000

    Update README

commit ed25ceb
Author: Shiqi HE <[email protected]>
Date:   Fri Mar 31 20:15:01 2023 +0800

    Remove some redundant parameters

commit 591428d
Merge: 827c91a 9840e39
Author: Eric Yan <[email protected]>
Date:   Fri Mar 31 08:25:46 2023 +0000

    Merge branch 'gluefl-ae' of https://github.com/TCtower/GlueFL into gluefl-ae

commit 827c91a
Author: Eric Yan <[email protected]>
Date:   Fri Mar 31 08:25:43 2023 +0000

    Fixed device avail not being used and deleted gluefl_client_metadata

commit 128b395
Author: Eric Yan <[email protected]>
Date:   Fri Mar 31 08:14:45 2023 +0000

    Added overcommit weight

commit 9840e39
Author: Eric Yan <[email protected]>
Date:   Thu Mar 30 21:32:14 2023 -0700

    Update README.md

commit 1cfac81
Author: Eric Yan <[email protected]>
Date:   Fri Mar 31 04:04:59 2023 +0000

    Added compensation config and reweight comment

commit e1ca08b
Author: Eric Yan <[email protected]>
Date:   Fri Mar 31 03:59:32 2023 +0000

    Make sure aggregator updates sticky group

commit 1a692c7
Author: Shiqi HE <[email protected]>
Date:   Fri Mar 31 11:55:22 2023 +0800

    Change ablation config files

commit d673951
Author: Shiqi HE <[email protected]>
Date:   Fri Mar 31 11:46:15 2023 +0800

    Add sticky client flag

commit edeb82a
Author: Eric Yan <[email protected]>
Date:   Wed Mar 29 07:46:55 2023 +0000

    Fix references

commit c46f92c
Author: Eric Yan <[email protected]>
Date:   Wed Mar 29 07:45:27 2023 +0000

    Deleted non-gluefl examples

commit 5ab5333
Author: Eric Yan <[email protected]>
Date:   Wed Mar 29 07:38:48 2023 +0000

    Added README and minor change to download.sh

commit 996b06b
Author: Eric Yan <[email protected]>
Date:   Wed Mar 29 06:34:13 2023 +0000

    Moved sticky sampling to gluefl_client_manager.py

commit 104df47
Author: Eric Yan <[email protected]>
Date:   Wed Mar 29 06:07:05 2023 +0000

    Changed all configurations to use GPU by default

commit d7beae8
Author: Eric Yan <[email protected]>
Date:   Wed Mar 29 06:06:37 2023 +0000

    Added augmentation_factor note and removed .log ending for logs

commit e4bf437
Author: Eric Yan <[email protected]>
Date:   Wed Mar 29 03:51:23 2023 +0000

    Modified configs

commit c378e17
Author: Eric Yan <[email protected]>
Date:   Tue Mar 28 08:32:17 2023 +0000

    Partially added environment and baseline configs

commit 479c80e
Author: Eric Yan <[email protected]>
Date:   Tue Mar 28 07:50:30 2023 +0000

    Moved configs to new folders

commit 0d183fa
Author: Eric Yan <[email protected]>
Date:   Tue Mar 28 07:47:13 2023 +0000

    Added additional device profiles and updated download.sh

commit a101a8a
Author: Eric Yan <[email protected]>
Date:   Tue Mar 28 07:40:51 2023 +0000

    Modified scripts and configs to work with GlueFL rename

commit 2ba135c
Author: Eric Yan <[email protected]>
Date:   Tue Mar 28 06:48:48 2023 +0000

    Renamed FedDC to GlueFL

commit 3a216ec
Author: Eric Yan <[email protected]>
Date:   Tue Mar 28 05:46:17 2023 +0000

    Modified GlueFL configurations

TCtower · Apr 1, 2023 · fd7f820
1 parent 36b1135
commit fd7f820
Showing 138 changed files with 3,348 additions and 5,933 deletions.
diff --git a/.gitignore b/.gitignore
@@ -11,6 +11,8 @@ dist
 logs/
 pymp*
 ./benchmark/dataset/data/*
+benchmark/logs/*
+benchmark/compensation/*
 .DS_Store
 *.egg-info
 *logging

diff --git a/README.md b/README.md
@@ -1,3 +1,75 @@
+# GlueFL Artifact
+
+This repository contains the artifact for **GlueFL: Reconciling Client Sampling and Model Masking for Bandwidth Efficient Federated Learning** accepted at the *Sixth Conference on Machine Learning and Systems* (**MLSys 2023**). 
+
+GlueFL is built as a component on top of the FedScale platform. The main GlueFL logic, covering both sticky sampling and mask shifting, is located in the `./examples/gluefl` directory. GlueFL also makes several minor modifications to the `./fedscale` directory.
+
+## Getting Started and Running Experiments
+To run experiments using GlueFL, you should first set up FedScale following the standard [FedScale installation instructions](#quick-installation-linux).
+
+After setting up FedScale, you can download the datasets used for experiments in our paper with the following commands. Note: you only need to download the datasets for the experiments that you want to run.
+
+```shell
+# To download the FEMNIST dataset (3400 clients, 640K samples, 327 MB)
+fedscale download femnist
+# To download the Google Speech dataset (2618 clients, 105K samples, 2.3 GB)
+fedscale download speech
+# To download the Open Images dataset (13,771 clients, 1.3M samples, 66 GB)
+fedscale download open_images
+```
+
+Since some of the datasets are quite large, you may want to download them to another location. To do this, copy `./benchmark/dataset/download.sh` to the desired location and run the download command from there. If you do this, take note of where you downloaded the dataset, because you will need to update your [configurations](#configurations) with the new location.
+
+You should then be able to run experiments by supplying configuration files. An example of running GlueFL on the FEMNIST dataset with the ShuffleNet model is shown below.
+
+```shell
+fedscale driver start ./benchmark/configs/baseline/gluefl_femnist_shf.yml
+```
+
+## Configurations
+
+In GlueFL, every experiment has its own YAML configuration file containing all the settings related to that experiment. You can find all the experiment configurations used in our paper in the `./benchmark/configs` directory. There are four sub-directories corresponding to different sections of our paper.
+
+- `./benchmark/configs/baseline` contains the experiment configurations for near-optimal settings of GlueFL, FedAvg, STC, and APF. The results from these experiment runs are used to fill Table 2.
+- `./benchmark/configs/sensitivity` contains the experiment configurations used for the sensitivity analysis, which varies GlueFL's hyper-parameters on the FEMNIST and Google Speech datasets.
+- `./benchmark/configs/ablation` contains the experiment configurations used for the ablation study, which varies other settings (reweighting and error compensation) on the FEMNIST and Google Speech datasets.
+- `./benchmark/configs/environment` contains the experiment configurations used for the network environment study, which swaps in different client device profiles.
+
+### Notes:
+- **If you downloaded your dataset to a different location**, please update the `data_dir` and `data_map_file` settings in the configuration file.
+
+- You may want to specify a different location for `compensation_dir`, which stores client-side error compensation data, because this data tends to get quite large (at least 40 GB). **Remember to periodically delete your compensation_dir after finishing experiments to free up storage space!**
+
+- You can run experiments with just CPUs by setting `use-cuda` to `False`.
+
+- Although you can use multiple GPUs on a single machine (e.g. `benchmark/configs/baseline/fedavg_image_shf.yml`), you should not try to use multiple machines to run experiments. We are working towards removing this limitation.
+
+## Viewing Results
+
+You can view the results of an experiment by running the following commands on the generated log file in the project root directory.
+
+```shell
+# To view the training loss
+cat job_name_logging | grep 'Training loss'
+# To view the top-1 and top-5 accuracy
+cat job_name_logging | grep 'FL Testing'
+# To view the current bandwidth usage and training time
+cat job_name_logging | grep -A 9 'Wall clock:'
+# To view the bandwidth usage and training time of a particular round (for example, 500)
+cat job_name_logging | grep -A 9 'round: 500'
+```
+
+You can also find logs just for the aggregator and executor in the directory specified by the `log_path` setting.
+
+
+<br/>
+
+---
+
+<br/>
+<br/>
+
+
 <p align="center">
 <img src="./docs/imgs/FedScale-logo.png" width="300" height="55"/>
 </p>
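
Note on the `Viewing Results` commands added to `README.md` above: if the log is post-processed in Python rather than in the shell, the `grep -A` filtering can be approximated with a short helper. This is a minimal sketch and not part of this commit. `grep_after` is a hypothetical name, and the sample lines are made up, since the actual FedScale log format is not shown here.

```python
import re
from typing import Iterable, List


def grep_after(lines: Iterable[str], pattern: str, after: int = 0) -> List[str]:
    """Return lines matching `pattern` plus `after` trailing lines.

    Roughly mirrors `grep -A <after>`, but without the `--` group separators.
    """
    regex = re.compile(pattern)
    out: List[str] = []
    remaining = 0  # context lines still to emit after the latest match
    for line in lines:
        if regex.search(line):
            out.append(line)
            remaining = after
        elif remaining > 0:
            out.append(line)
            remaining -= 1
    return out


# Made-up sample lines (assumption); a real log will look different.
sample = [
    "round: 499",
    "Wall clock: 100 s",
    "round: 500",
    "Wall clock: 120 s",
    "unrelated",
]
print(grep_after(sample, r"round: 500", after=1))
```

With a real log, the same helper could be fed an open file object, for example `grep_after(open("job_name_logging"), "Training loss")`. Because the pattern matches substrings, `round: 500` would also match a line containing `round: 5000`, just as the `grep` command does; a pattern such as `r"round: 500\b"` avoids that.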

diff --git a/benchmark/configs/1femnist/cn12_femnist.yml b/benchmark/configs/1femnist/cn12_femnist.yml
diff --git a/benchmark/configs/1femnist/cn24_femnist.yml b/benchmark/configs/1femnist/cn24_femnist.yml
diff --git a/benchmark/configs/1femnist/sg120_femnist.yml b/benchmark/configs/1femnist/sg120_femnist.yml
diff --git a/benchmark/configs/1femnist/sg240_femnist.yml b/benchmark/configs/1femnist/sg240_femnist.yml