Leverage the use of recurrent modules #3

Fernadoo · 2022-04-25T07:19:28Z

Hi, I am wondering whether the oscillation during training comes from the fact that your actor nets contain only down-sampling layers. In partially observable domains, an agent's information state should be its entire observation history rather than the current one-shot observation, so adding a recurrent module such as an LSTM or GRU might help.
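To make the suggestion concrete, here is a minimal sketch of how a recurrent state can summarize an observation history. It is not taken from this repository: `GRUCell`, `encode_history`, the sizes, and the random weights are all hypothetical, and the cell is written in plain Python (no biases) only so that it runs without dependencies. In practice one would more likely use a library module such as `torch.nn.GRU` or `torch.nn.LSTM` between the down-sampling layers and the policy head.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]


class GRUCell:
    """Illustrative GRU cell without biases (hypothetical, not the repo's code)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        n = input_size + hidden_size
        self.hidden_size = hidden_size
        self.W_z = mat(hidden_size, n)  # update gate weights
        self.W_r = mat(hidden_size, n)  # reset gate weights
        self.W_h = mat(hidden_size, n)  # candidate state weights

    def initial_state(self):
        return [0.0] * self.hidden_size

    def step(self, obs, h):
        # Gates see the current observation concatenated with the old state.
        xh = obs + h
        z = [sigmoid(a) for a in matvec(self.W_z, xh)]
        r = [sigmoid(a) for a in matvec(self.W_r, xh)]
        # The candidate state uses the reset-gated old state.
        xrh = obs + [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in matvec(self.W_h, xrh)]
        # Blend old state and candidate; z controls how much is overwritten.
        return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def encode_history(cell, observations):
    # Fold the whole observation sequence into one hidden state.
    h = cell.initial_state()
    for obs in observations:
        h = cell.step(obs, h)
    return h


if __name__ == "__main__":
    cell = GRUCell(input_size=2, hidden_size=3)
    # Two histories that end in the same observation.
    h_a = encode_history(cell, [[1.0, 0.0], [0.0, 1.0]])
    h_b = encode_history(cell, [[0.0, 1.0], [0.0, 1.0]])
    print(h_a)
    print(h_b)
```

The two histories end in the same observation, so an actor that only sees the current frame would receive identical inputs, whereas the hidden states passed to a recurrent actor's policy head differ. Whether this actually reduces the oscillation would need to be checked empirically.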
