Figure 13From: Robustness of quantum reinforcement learning under hardware errorsPolicy gradient agents trained in the TSP environment with varying probabilities p of depolarization error, with one layer of the circuit depicted in Fig. 2 c). Noise is simulated with 1000 Monte Carlo trajectories. All curves are averaged over 10 agentsBack to article page