MARL Tricks

Code for RIIT: Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning. We implement the SOTA MARL algorithms and standardize their hyperparameters.

Python MARL framework

PyMARL is WhiRL's framework for deep multi-agent reinforcement learning and includes implementations of the following algorithms:

Value-based Methods:

QMIX: QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
VDN: Value-Decomposition Networks For Cooperative Multi-Agent Learning
IQL: Independent Q-Learning
QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
MAVEN: MAVEN: Multi-Agent Variational Exploration
Qatten: Qatten: A General Framework for Cooperative Multiagent Reinforcement Learning
QPLEX: QPLEX: Duplex Dueling Multi-Agent Q-Learning
WQMIX: Weighted QMIX: Expanding Monotonic Value Function Factorisation

Actor-Critic Methods:

COMA: Counterfactual Multi-Agent Policy Gradients
VMIX: Value-Decomposition Multi-Agent Actor-Critics
FacMADDPG: Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control
LICA: Learning Implicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning
DOP: Off-Policy Multi-Agent Decomposed Policy Gradients
RIIT: RIIT: Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning

PyMARL is written in PyTorch and uses SMAC as its environment.

Installation instructions

Install Python packages

# requires Anaconda 3 or Miniconda 3
bash install_dependecies.sh

Set up StarCraft II and SMAC:

bash install_sc2.sh

This downloads SC2 into the 3rdparty folder and copies over the maps needed to run the experiments.

Run an experiment

python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=corridor

The config files act as defaults for an algorithm or environment.

They are all located in src/config: --config refers to the config files in src/config/algs, and --env-config refers to the config files in src/config/envs.
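The idea of config files acting as defaults, with `key=value` pairs after `with` taking precedence, can be illustrated with a small sketch. This is not the framework's own loading code: the helper names `recursive_update` and `parse_override` and the default values shown are hypothetical, and the merge order (environment defaults, then algorithm defaults, then command-line overrides) is an assumption made for illustration.

```python
from copy import deepcopy


def recursive_update(base, override):
    """Return a copy of base with override merged in, descending into nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = recursive_update(merged[key], value)
        else:
            merged[key] = value
    return merged


def parse_override(item):
    """Turn 'env_args.map_name=corridor' into {'env_args': {'map_name': 'corridor'}}.

    Values are kept as strings here; a real loader would also convert types.
    """
    dotted_key, value = item.split("=", 1)
    nested = value
    for part in reversed(dotted_key.split(".")):
        nested = {part: nested}
    return nested


# Hypothetical defaults standing in for files under src/config/envs and src/config/algs.
env_defaults = {"env": "sc2", "env_args": {"map_name": "3m", "difficulty": "7"}}
alg_defaults = {"name": "qmix", "epsilon_anneal_time": 50000}

config = recursive_update(env_defaults, alg_defaults)
config = recursive_update(config, parse_override("env_args.map_name=corridor"))
print(config["env_args"])
```

Only the overridden key changes; sibling keys such as `difficulty` keep their default values, which is the behavior one would want from a layered config.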

Run parallel experiments:

# bash run.sh config_name map_name_list (threads_num arg_list gpu_list experiments_num)
bash run.sh qmix corridor 2 epsilon_anneal_time=500000 0,1 5

Each xxx_list argument is a comma-separated list.

All results are stored in the Results folder and named after map_name.
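To make the argument layout concrete, the sketch below expands the example invocation into one command per run. It does not reproduce run.sh: the function name `expand_runs` is hypothetical, and it assumes that each map is run experiments_num times and that GPUs are assigned round-robin through `CUDA_VISIBLE_DEVICES`. The threads_num concurrency limit is left out for brevity.

```python
def expand_runs(config, maps, args, gpus, repeats):
    """Build (gpu, command) pairs from comma-separated argument lists."""
    map_list = maps.split(",")
    arg_list = [a for a in args.split(",") if a]
    gpu_list = gpus.split(",")
    runs = []
    for map_name in map_list:
        for _ in range(repeats):
            # Assumed policy: cycle through the listed GPUs in order.
            gpu = gpu_list[len(runs) % len(gpu_list)]
            command = [
                "python3", "src/main.py",
                f"--config={config}", "--env-config=sc2",
                "with", f"env_args.map_name={map_name}", *arg_list,
            ]
            runs.append((gpu, command))
    return runs


# Mirrors: bash run.sh qmix corridor 2 epsilon_anneal_time=500000 0,1 5
runs = expand_runs("qmix", "corridor", "epsilon_anneal_time=500000", "0,1", 5)
for gpu, command in runs:
    print(f"CUDA_VISIBLE_DEVICES={gpu} " + " ".join(command))
```

Under these assumptions the example yields five runs on the corridor map, alternating between GPU 0 and GPU 1.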

Force all processes to exit

# All Python and game processes of the current user will be terminated.
bash clean.sh
