Figure 14
From: Robustness of quantum reinforcement learning under hardware errors

Q-learning agents trained in the TSP environment with one layer of the circuit depicted in Fig. 2 c) and a custom noise model, using 1000 Monte Carlo trajectories. The labels indicate the custom noise configurations defined in Table 1. Results are averaged over five agents in each curve, except for the exact curve, which is averaged over ten agents, as in previous figures.