Figure 2 | EPJ Quantum Technology

Figure 2

From: Robustness of quantum reinforcement learning under hardware errors

Parameterised circuits used in this work. (a) Hardware-efficient ansatz for Q-learning in the CartPole environment from [29], (b) hardware-efficient ansatz for the policy gradient method in the CartPole environment from [28], (c) equivariant quantum circuit for Q-learning and the policy gradient method in the TSP environment from [39]. For (a) and (b) we use 5 repetitions of the template shown above, while for (c) we use a single layer.
