| tags | python,numpy,neural-network,reinforcement learning |
|---|---|
| mathjax | true |
- learns the state-to-action mapping directly, which is often simpler than learning a value function first
- needs no model of the environment dynamics
- allows continuous action spaces
- allows stochastic policies, which can be a crucial advantage over deterministic policies (for example, in games where a predictable policy can be exploited)
- Actor Critic RL Methods
{:.caption .img}
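To make the points about stochastic policies and continuous action spaces more concrete, here is a minimal NumPy sketch. It assumes linear policies, and all names (`softmax_policy`, `gaussian_policy_sample`, `theta`, `w`, `log_std`) are hypothetical and chosen for illustration only. The first function maps a state directly to a probability distribution over discrete actions; the second samples a continuous action from a Gaussian centred on a linear function of the state.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax_policy(theta, state):
    """Discrete stochastic policy: action probabilities from a linear map of the state."""
    logits = theta @ state
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

def gaussian_policy_sample(w, log_std, state, rng):
    """Continuous stochastic policy: sample from N(w @ state, exp(log_std)**2)."""
    mean = w @ state
    return mean + np.exp(log_std) * rng.standard_normal(mean.shape)

# Hypothetical 3-dimensional state
state = np.array([0.5, -1.0, 2.0])

# Discrete case: 4 actions, parameters initialised to zero (uniform policy)
theta = np.zeros((4, 3))
probs = softmax_policy(theta, state)
discrete_action = rng.choice(len(probs), p=probs)

# Continuous case: 2-dimensional action with a fixed standard deviation of 0.1
w = np.zeros((2, 3))
continuous_action = gaussian_policy_sample(w, np.log(0.1), state, rng)
```

Neither function needs a model of the environment: both only map the observed state to an action (or a distribution over actions), and the parameters `theta`, `w`, and `log_std` are what a policy gradient method would adjust.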
