Figure 12

From: Robustness of quantum reinforcement learning under hardware errors

Q-learning agents trained with varying probabilities p of depolarization errors and five layers of the circuit depicted in Fig. 2a. Noise is simulated with 100 Monte Carlo trajectories. The noisy curves are averaged over 5 agents; the exact curve is averaged over 10 agents, as in previous figures.
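
To make the caption's "Monte Carlo trajectories" concrete, the following is a minimal single-qubit sketch of the idea, not the authors' simulation code. It assumes the common Pauli-error convention for depolarization (with probability p, one of X, Y, Z is applied, each with probability p/3); the caption does not state which convention is used. The names `sample_trajectory` and `estimate_z` are hypothetical, and the circuit here is a trivial stand-in for the five-layer circuit of Fig. 2a.

```python
import numpy as np

# Single-qubit Pauli matrices.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def sample_trajectory(state, p, rng):
    """Apply one random realization of a depolarizing error to a pure state.

    Assumed convention: with probability p a uniformly chosen Pauli
    (X, Y or Z) acts on the state; otherwise the state is unchanged.
    """
    if rng.random() < p:
        pauli = (X, Y, Z)[rng.integers(3)]
        return pauli @ state
    return state


def estimate_z(p, n_trajectories=100, seed=0):
    """Estimate <Z> on |0> after one depolarizing error by trajectory averaging."""
    rng = np.random.default_rng(seed)
    zero = np.array([1, 0], dtype=complex)
    values = []
    for _ in range(n_trajectories):
        psi = sample_trajectory(zero, p, rng)
        # Expectation value <psi|Z|psi> for this single trajectory.
        values.append(np.real(np.vdot(psi, Z @ psi)))
    return float(np.mean(values))


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.3):
        print(p, estimate_z(p, n_trajectories=100))
```

For this toy case the exact channel average is 1 - 4p/3, so the trajectory estimate can be checked against it. With only 100 trajectories the statistical error is a few percent, which is one reason results of this kind are additionally averaged over several agents.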