Figure 11From: Robustness of quantum reinforcement learning under hardware errorsTraining and evaluation of policy gradient agents in the TSP environment under various perturbations σ. Panel (a) shows the effect of perturbations during training, panel (b) shows results for the same agents evaluated on varying perturbation levels after training, different to those present at training timeBack to article page