On the bias in iterative quantum amplitude estimation

Quantum amplitude estimation (QAE) is a pivotal quantum algorithm for estimating the squared amplitude $a$ of the target basis state in a quantum state $|\Phi\rangle$. Various improvements on the original quantum phase estimation-based QAE have been proposed for resource reduction. One such improved version is iterative quantum amplitude estimation (IQAE), which outputs an estimate $\hat{a}$ of $a$ through iterated rounds of measurements on quantum states of the form $G^k|\Phi\rangle$, with the number $k$ of applications of the Grover operator $G$ (the Grover number) and the shot number determined adaptively. This paper investigates the bias in IQAE. Through numerical experiments simulating IQAE, we reveal that the estimate by IQAE is biased and that the bias is enhanced for some specific values of $a$. We see that the termination criterion in IQAE, namely that the estimated accuracy of $\hat{a}$ falls below the threshold, is a source of the bias. Besides, we observe that $k_\mathrm{fin}$, the Grover number in the final round, and $f_\mathrm{fin}$, a quantity affecting the probability distribution of measurement outcomes in the final round, are the key factors determining the bias, and that the bias enhancement for specific values of $a$ is due to the skewed distribution of $(k_\mathrm{fin},f_\mathrm{fin})$. We also present a bias mitigation method: just re-executing the final round with the Grover number and the shot number fixed.


I. INTRODUCTION
Among various quantum algorithms, quantum amplitude estimation (QAE) is one of the prominent ones. It was originally proposed in [1] as a method to estimate the squared amplitude a of the target basis state |ϕ_1⟩ in a quantum state |Φ⟩. If we have the oracle A to generate |Φ⟩ and the reflection operator S with respect to |ϕ_1⟩, we can obtain an ϵ-approximation of a, querying A and S O(1/ϵ) times. More concretely, the method in [1] is based on quantum phase estimation (QPE) [2]: using A and S, we construct the Grover operator G (defined later), which acts on |Φ⟩ to amplify the amplitude of |ϕ_1⟩, and then operate the controlled version of G O(1/ϵ) times followed by an inverse quantum Fourier transform, which yields an approximation of a.
One reason why QAE is important is that it is the basis of other quantum algorithms. For example, it is used in the quantum algorithm for Monte Carlo integration (QMCI) [3], which estimates the expectation of a random variable quadratically faster than the classical counterpart. Furthermore, QMCI has many applications in industry, e.g., derivative pricing [4-9] in finance.
Partly because of such practical importance, many improvements to the original version of QAE have been proposed so far. In particular, some methods without QPE have been devised [10-21]. Since the controlled G requires a larger gate cost than the uncontrolled one, replacing the former with the latter leads to gate cost reduction. The first algorithm in this direction is QAE based on maximum likelihood estimation (MLE) [10], which we hereafter call MLEQAE. In this method, we apply G to |Φ⟩ k times and make N measurements on G^k|Φ⟩, by which we distinguish |ϕ_1⟩ from the other basis states, increasing k according to some given schedule. We hereafter call this k, the number of applications of G, the Grover number. The outcomes of the measurements, namely the numbers of times we get |ϕ_1⟩ for the various k, give us information on a_k := sin²((2k + 1) arcsin(√a)), the squared amplitude of |ϕ_1⟩ in G^k|Φ⟩, and thus information on a. Then, we use this to construct the likelihood function of a and obtain an estimate of a as the maximum likelihood point.
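The MLE step just described can be sketched numerically. The following is a minimal illustration, not the implementation of [10]: `grover_amp` computes a_k, and the maximum likelihood point is found by a grid search over candidate amplitudes; the schedule k = 0, 1, 2, 4, 8 and the shot number used below are illustrative choices.

```python
import numpy as np

def grover_amp(a, k):
    """a_k = sin^2((2k+1) * arcsin(sqrt(a))): squared amplitude of |phi_1> in G^k|Phi>."""
    return np.sin((2 * k + 1) * np.arcsin(np.sqrt(a))) ** 2

def log_likelihood(a, outcomes):
    """Log-likelihood of candidate amplitude(s) a given a list of tuples
    (k, N, n1): Grover number, shot number, and number of |phi_1> hits."""
    a = np.asarray(a, dtype=float)
    ll = np.zeros_like(a)
    for k, N, n1 in outcomes:
        p = np.clip(grover_amp(a, k), 1e-12, 1 - 1e-12)  # guard against log(0)
        ll = ll + n1 * np.log(p) + (N - n1) * np.log(1 - p)
    return ll

def mle_estimate(outcomes):
    """Maximum likelihood point via a fine grid search (a stand-in for the
    continuous maximization in the MLEQAE paper)."""
    grid = np.linspace(1e-6, 1 - 1e-6, 200001)
    return grid[np.argmax(log_likelihood(grid, outcomes))]
```

The k = 0 term localizes a roughly, and the larger-k terms sharpen the peak of the likelihood around the true value, which is the source of the speedup over naive sampling.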
Afterward, the paper [13] proposed iterative quantum amplitude estimation (IQAE), which this paper focuses on. Like MLEQAE, in IQAE we use the outcomes of the measurements on G^k|Φ⟩ with varying k, but in a different way. Starting from |Φ⟩, which corresponds to k = 0, we obtain a confidence interval (CI) of a by the MLE based on the outcomes of the measurements on |Φ⟩. We then choose the next k adaptively in the way explained later, and make the measurements on G^k|Φ⟩. Via MLE, this yields the CI of a_k, which is translated into a CI of a narrower than the previous one. We repeat these steps, each of which is called a round, increasing the Grover number and narrowing the CI, until the CI width reaches the required accuracy ϵ. In addition to this adaptive increment of the Grover number, another feature of IQAE is that N, the number of measurements (the shot number) in one round, is increased gradually: if the CI of a derived from the measurements with the current k is so narrow that we can determine the next k, we stop the current round and go to the next round with the next k; otherwise, we add more measurements with the current k. The advantage of such an adaptive increment of k and N is that it can lead to saving the total number of queries to G compared to fixing k and N in advance.
In this paper, we focus on the bias in IQAE, which can be an issue in some situations but has not been focused on in previous studies. Note that the estimate â of a by IQAE is stochastic, since it is derived from outcomes of measurements on quantum states, which are intrinsically random. Although IQAE guarantees that the magnitude of the error â − a is below the tolerance with high probability, it might be biased, that is, the expectation of the error might not be zero: b := E[â − a] ≠ 0, where E[·] is the expectation with respect to the randomness of the measurement outcomes. We hereafter call b the bias and the residual â − a − b the random part.
The motivation to focus on the bias in QAE, including IQAE, is the possibility that it may matter more than the random part in some cases. That is, when we want some quantity given as a combination of many outputs of different QAE runs, biases can accumulate and, even if small in each output, become significant in total. In the worst case, the bias in the sum of N estimates scales as O(N), whereas, according to the central limit theorem, the random parts cancel each other and scale as O(√N). An example of situations where we combine many QAE outputs is calculating the total value of a portfolio of derivative contracts, where each contract is priced by an individual run of QAE. Another example is calculating Gibbs partition functions in statistical mechanics [22]. In the quantum algorithm in [22], a partition function is expressed as the product of expectations of certain random variables, each of which is estimated by QAE.
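The O(N) versus O(√N) scaling can be checked with a toy calculation; the per-estimate bias and noise level below are purely illustrative numbers, not values from the experiments in this paper.

```python
import numpy as np

rng = np.random.default_rng(1)
bias, sigma, N = 1e-3, 1e-2, 10000   # per-estimate bias and noise level (illustrative)

# errors of N independent estimates, each = common bias + zero-mean noise
errors = bias + sigma * rng.standard_normal(N)
total_error = errors.sum()

# the bias part of the sum grows like N * bias = 10.0,
# while the random part only grows like sigma * sqrt(N) = 1.0
```

Even though each estimate has a bias ten times smaller than its noise, the bias dominates the summed error by an order of magnitude.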
As far as the author knows, there has been no study on the bias in IQAE, while the bias in other types of QAE has been studied [19, 21, 22]. Thus, in this paper, we investigate the bias in IQAE. We conduct numerical experiments to reveal the nature of the bias in IQAE. One of our key findings from the numerical experiments is that IQAE is in fact biased, and its termination criterion induces the bias. That is, the rule that the algorithm ends when the CI width of a reaches the required accuracy leads to the bias. This is because the CI is statistically inferred and its width depends on the realized value of the estimate â of a. The algorithm tends to end when â accidentally takes a value that yields a narrow CI. This effect affects the expectation of â and thus induces the bias.
In particular, we observe that the bias is enhanced for some specific values of a. This phenomenon is explained by the distribution of (k_fin, f_fin). Here, k_fin is the Grover number in the final round, and f_fin ∈ [0, 1] is a quantity determined by k_fin (see the definition later), which varies rapidly under a slight change of k_fin and largely affects the bias. Note that k_fin is also a random variable depending on the measurement outcomes, and thus so is (k_fin, f_fin). For a value of a other than the specific ones, f_fin takes various values distributed widely in the range [0, 1] as k_fin varies. Then, over the wide distribution of realized values of (k_fin, f_fin) in the 2D plane, the various values of the bias for the various values of (k_fin, f_fin) are canceled out on average, which yields a small bias in total. On the other hand, for the specific values of a, the realized values of (k_fin, f_fin) are not distributed widely but concentrated in a small part of the 2D plane, in fact, on a few curves. Thus, the bias cancellation does not occur and the resultant bias remains considerable.
We also propose a simple way to mitigate the bias: just re-executing the final round. If the algorithm ends at the final round with the Grover number k_fin and the shot number N_fin, we perform another round with the same Grover number and shot number and obtain an estimate of a from the measurement outcomes in this additional round. Now, the gradual increment of the shot number and the termination criterion on the CI width are no longer adopted: we perform just N_fin shots and stop. This largely diminishes the bias caused by the termination criterion, as confirmed by the numerical experiments. We also confirm that, although this re-execution solution definitely increases the total number of queries to G, the increase rate is modest, about 25% in our experiment. This is because the final round does not dominate the other rounds in terms of the query number.
The rest of this paper is organized as follows. Section II is a preliminary one, where we outline QAE and IQAE. Section III is the main part of this paper, where the results of our numerical experiments are presented. We first show the magnitude of the bias for various values of a, which is enhanced for some specific values of a, and then elaborate the aforementioned understanding of this phenomenon. We finally propose the bias mitigation method of re-executing the final round, along with the numerical result. Section IV summarizes this paper.

II. PRELIMINARIES

A. Quantum amplitude estimation
In this paper, QAE is a generic term for quantum algorithms that estimate the amplitude of a target basis state in a superposition state. Concretely, our aim is described as the following problem.
Problem 1. Let ϵ, α ∈ (0, 1). Suppose that we are given access to the following two quantum circuits A and S on an n-qubit register. A acts as

A|0⟩ = √(1 − a)|ϕ_0⟩ + √a|ϕ_1⟩,   (1)

where a ∈ [0, 1], |0⟩ is the computational basis state in which all the n qubits take |0⟩, and |ϕ_0⟩ and |ϕ_1⟩ are quantum states orthogonal to each other. S acts as

S|ϕ_0⟩ = |ϕ_0⟩,  S|ϕ_1⟩ = −|ϕ_1⟩.   (2)

Also suppose that we can measure an observable corresponding to a Hermitian H on the same system, for which |ϕ_0⟩ and |ϕ_1⟩ are eigenstates with different eigenvalues. Then, we want to get an estimate of a with accuracy ϵ with probability at least 1 − α.
Although the setup of Problem 1 seems quite simple, previous studies [1, 10-21] have generally considered this setup, and in fact, many applications can be boiled down to this form. For example, in QMCI [3], the expected value of a random variable is encoded into a quantum state like Eq. (1) as the squared amplitude a of a basis state |ϕ_1⟩, and through estimating this amplitude, we get an approximation of the expectation.
In many use cases of QAE, |ϕ_1⟩ and |ϕ_0⟩ are distinguished by whether a specific qubit takes |1⟩ or |0⟩. In this case, S is the Z gate on that qubit, and measuring H amounts to the projective measurement of that qubit in the computational basis.
The paper [1] posed this problem and presented an algorithm for it based on QPE. We have the following theorem on its query complexity.

Theorem 1 ([1], Theorem 12, modified). Suppose that we are given access to the oracles A and S in Eqs. (1) and (2). Then, for any ϵ, α ∈ (0, 1), there exists a quantum algorithm that outputs â ∈ (0, 1) such that |â − a| ≤ ϵ with probability at least 1 − α, calling A and S O((1/ϵ) log(1/α)) times.
Although the success probability of the original algorithm in [1] is lower bounded by a constant 8/π², it can be enhanced to an arbitrary value 1 − α at the expense of an O(log(1/α)) overhead in the query complexity. This is done by the trick of taking the median of the results of multiple runs of the algorithm [3], which is based on the powering lemma in [23].
We do not give the full details of this QPE-based QAE but present its outline briefly. The key ingredient is the Grover operator G, which is defined as

G := A S_0 A^† S.

Here, S_0 is a unitary such that

S_0|0⟩ = −|0⟩,  S_0|x⟩ = |x⟩ for any computational basis state |x⟩ ≠ |0⟩,

which can be implemented as a combination of X gates and a multi-controlled Z gate. The key property of G is that for any k ∈ ℕ it acts as

G^k|Φ⟩ = cos((2k + 1)θ_a)|ϕ_0⟩ + sin((2k + 1)θ_a)|ϕ_1⟩,   (6)

where θ_a := arcsin(√a). That is, G rotates the statevector by an angle 2θ_a in the 2-dimensional Hilbert space spanned by |ϕ_1⟩ and |ϕ_0⟩. Because of this property, we have the following QPE-based approach to estimate a. We prepare a register R_G on which G acts and an ancillary m-qubit register R_anc. Then, using G, G², ..., G^{2^{m−1}} controlled by the first, second, ..., m-th qubits in R_anc, respectively, we generate the state

(1/√(2^m)) Σ_{j=0}^{2^m−1} |j⟩ ⊗ G^j|Φ⟩.

Finally, we operate the inverse quantum Fourier transform on R_anc, which, thanks to the property in Eq. (6), yields an m-bit precision estimate of θ_a and thus that of a = sin²θ_a. In this process, the number of uses of (controlled) G is 1 + 2 + ... + 2^{m−1} = 2^m − 1, and thus A and S are queried O(2^m) times, which implies the O(1/ϵ) query complexity for accuracy ϵ stated in Theorem 1. See [1] for more details.
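The rotation property in Eq. (6) can be checked numerically in the 2-dimensional invariant subspace spanned by |ϕ_0⟩ and |ϕ_1⟩, where G reduces to a 2×2 rotation matrix by the angle 2θ_a; this is a minimal sketch, not a full circuit simulation.

```python
import numpy as np

a = 0.3
theta = np.arcsin(np.sqrt(a))   # theta_a

# In the basis {|phi_0>, |phi_1>}, |Phi> = (cos(theta), sin(theta)) and
# G acts as a rotation by the angle 2*theta
G = np.array([[np.cos(2 * theta), -np.sin(2 * theta)],
              [np.sin(2 * theta),  np.cos(2 * theta)]])

k = 7
state = np.array([np.cos(theta), np.sin(theta)])
for _ in range(k):
    state = G @ state

# G^k|Phi> = cos((2k+1)theta)|phi_0> + sin((2k+1)theta)|phi_1>,
# so the squared amplitude of |phi_1> is a_k = sin^2((2k+1)theta)
a_k = state[1] ** 2
```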
Compare this with the naive approach of repeatedly generating |Φ⟩, measuring it, and letting the frequency of obtaining |ϕ_1⟩ be an estimate of a, in which the number of queries to A scales as O(1/ϵ²): QAE achieves a quadratic speedup with respect to ϵ. This is the origin of the quadratic speedups in quantum algorithms built upon QAE, e.g., QMCI in comparison to the classical Monte Carlo method.

B. Iterative quantum amplitude estimation
After the original QAE was proposed, some variants have been proposed, aiming at resource reduction. Avoiding the use of QPE is a common approach, since QPE requires the controlled version of G, whose implementation requires more resources than the uncontrolled one.
IQAE [13] is in such a direction. The basic idea is as follows. By iterating the generation of |Φ⟩ and the measurement on it, we get an estimate of a as the frequency of obtaining |ϕ_1⟩ in the measurements. This is in fact a kind of MLE, since for Be(p), the Bernoulli distribution with probability of 1 equal to p, the maximum likelihood estimate of p from multiple trials is nothing but the realized frequency of 1. We also have the CI of a, which contains the true value of a with high probability. Next, for some k ∈ ℕ, we repeat generating G^k|Φ⟩ and measuring it, and from the measurement outcomes we get an estimate of a_k = sin²((2k + 1)θ_a) and then that of a. Here, due to the periodicity of a_k as a function of a, 2k + 1 different values in [0, 1] can be the maximum likelihood estimates of a. However, since we already have the CI from the previous round consisting of the measurements on |Φ⟩, combining it with the measurements on G^k|Φ⟩ yields the new CI as a single interval in [0, 1], whose width is narrower than the previous one. We repeat this procedure for a sufficient number of rounds, making the width of the CI narrower. When the CI width reaches the required accuracy ϵ, we have an estimate of a with an error below ϵ. The CI width in each round is taken so that the probability that in every round the obtained CI successfully encloses the true value a is at least some required success probability 1 − α.

Algorithm 1 Iterative quantum amplitude estimation (pseudocode, including the CI construction in Eqs. (7)-(10), omitted; see [20])

Algorithm 2 FindNextK (pseudocode omitted; see [20])
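The core of each round, namely inferring a CI of θ_a from shots on G^k|Φ⟩, can be sketched as follows. This is an illustrative reconstruction, not the exact CI construction of Algorithm 1: it uses a Hoeffding-type CI for a_k and assumes the quadrant index R of (2k + 1)θ_a/(π/2) is already known from the previous rounds.

```python
import numpy as np

def round_ci(n1, N, k, R, alpha):
    """One IQAE-style round (illustrative): from n1 hits in N shots on
    G^k|Phi>, build a Hoeffding-type CI for a_k = sin^2(K*theta_a) with
    K = 2k+1, then invert it to a CI of theta_a inside the quadrant
    [R*pi/2, (R+1)*pi/2] assumed known from the previous rounds."""
    K = 2 * k + 1
    a_k_hat = n1 / N
    eps = np.sqrt(np.log(2 / alpha) / (2 * N))       # Hoeffding half-width
    lo = np.clip(a_k_hat - eps, 0.0, 1.0)
    hi = np.clip(a_k_hat + eps, 0.0, 1.0)
    if R % 2 == 0:    # a_k = sin^2(t) is increasing on this quadrant
        t_lo, t_hi = np.arcsin(np.sqrt(lo)), np.arcsin(np.sqrt(hi))
    else:             # a_k = cos^2(t) is decreasing: endpoints swap
        t_lo, t_hi = np.arccos(np.sqrt(hi)), np.arccos(np.sqrt(lo))
    return (R * np.pi / 2 + t_lo) / K, (R * np.pi / 2 + t_hi) / K
```

The resulting interval of θ_a, a factor of order K narrower than a k = 0 interval, is then translated into a CI of a = sin²θ_a.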
Concretely, we present this algorithm as Algorithm 1. This is a modification of the algorithm in [20], which is itself a modified version of the original IQAE algorithm in [13].
Leaving the full details of this algorithm to [20], we just present a theorem on the query complexity and some comments.
Theorem 2 ([20], modified). Suppose that we are given access to the oracles A in Eq. (1) and S in Eq. (2). Then, Algorithm 1 outputs an ϵ-approximation â of a with probability at least 1 − α, making O((1/ϵ) log(1/α)) queries to A and S in total.

Let us make some comments that help us understand the outline of the algorithm. In Algorithm 1, [a^l_{i,j}, a^u_{i,j}] and [θ^l_{i,j}, θ^u_{i,j}] are the CIs of a and the angle θ_a, respectively. In the i-th round, we set the Grover number to k_i, and we calculate the maximum likelihood estimate â_{k_i,j} of a_{k_i} and its CI [a^l_{k_i,j}, a^u_{k_i,j}] from the outcomes of the measurements on G^{k_i}|Φ⟩. They are translated to the maximum likelihood estimate â_{i,j} of a and the CIs [θ^l_{i,j}, θ^u_{i,j}] and [a^l_{i,j}, a^u_{i,j}]. The i-th round ends if we find the Grover number for the next round by the procedure FindNextK shown as Algorithm 2. In this procedure, we search for the next k greedily. Namely, we set it as large as possible, requiring that [Kθ^l_{i,j}, Kθ^u_{i,j}], the CI of θ_a multiplied by K = 2k + 1, lies in a single quadrant, which enables us to determine the CIs of a and θ_a as single intervals. If we cannot find such k in the region K ≥ r_min(2k_i + 1), we continue the i-th round. When ∆a_{i,j}, the estimated accuracy of the estimate of a from the measurement outcomes so far, goes below the predetermined accuracy level ϵ, the algorithm stops and outputs the estimate at that time.
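The greedy search just described can be sketched as follows; this is an illustrative version in the spirit of FindNextK, keeping the paper's conventions of odd K = 2k + 1 and the constraint K ≥ r_min(2k_i + 1), not the exact Algorithm 2.

```python
import numpy as np

def find_next_k(k_i, theta_lo, theta_hi, r_min=2):
    """Greedy search for the next Grover number: find the largest odd
    K = 2k+1 such that the scaled CI [K*theta_lo, K*theta_hi] lies in a
    single quadrant [R*pi/2, (R+1)*pi/2]. If no such K with
    K >= r_min*(2*k_i+1) exists, keep the current k_i."""
    K_i = 2 * k_i + 1
    K_max = int((np.pi / 2) / (theta_hi - theta_lo))  # beyond this the scaled CI cannot fit
    K = K_max - (K_max + 1) % 2                       # round down to an odd integer
    while K >= r_min * K_i:
        if int(K * theta_lo // (np.pi / 2)) == int(K * theta_hi // (np.pi / 2)):
            return (K - 1) // 2
        K -= 2
    return k_i
```

Starting the scan from the largest admissible K and walking downward realizes the "as large as possible" choice in the text.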
Over the rounds of this process, we operate G^{k_1}, G^{k_2}, ..., with K_i := 2k_i + 1 increasing exponentially as K_{i+1} ≥ r_min K_i, and its value in the final round is of order O(1/ϵ). This means that G is queried O(1/ϵ) times in total, and so are A and S, as stated in Theorem 2.
Note that the parameters N_shot and r_min in Algorithm 1 are not mentioned in the statement of Theorem 2. Although there may be various settings of these that make the algorithm work, we adopt the following setting in this paper. According to [20], N_shot can be set to 1, which means that we search for the next k every time we make one measurement on G^{k_i}|Φ⟩, and this reduces the query complexity while keeping the accuracy. We thus set N_shot = 1 hereafter. For r_min, there are some choices, such as 2 in [13] and 3 in [20], and we adopt the former hereafter.
We also note a slight modification in Algorithm 1 from the algorithm in [20]: the former outputs the maximum likelihood estimate â when ∆a_{i,j} becomes smaller than the required accuracy ϵ, while the latter outputs the midpoint of [a^l_{i,j}, a^u_{i,j}] when θ^u_{i,j} − θ^l_{i,j}, the width of the CI of θ_a, becomes smaller than 2ϵ. The easily checked relationship |a^u_{i,j} − a^l_{i,j}| ≤ |θ^u_{i,j} − θ^l_{i,j}| implies that, if we impose the termination criterion on θ_a even though we want to guarantee the accuracy of a, we may take unnecessarily many iterations. We thus adopt the termination criterion on a, and in fact, we confirmed that this leads to a smaller query number than the criterion on θ_a in the later numerical experiment, although we will not show the result for the latter setting.

III. NUMERICAL EXPERIMENTS ON THE BIAS IN IQAE
A. Biases for various a

Hereafter, in order to understand the bias in IQAE and how it arises, we conduct some numerical experiments.
First of all, since the current problem is characterized by a, the amplitude we want to estimate, we run IQAE and see the bias for various values of a. Here, "run" does not mean running Algorithm 1 on a real quantum computer or a quantum circuit simulator but performing a classical simulation of the algorithm based on the probability distribution of the outcomes of the measurements made in the algorithm. Concretely, we replace step 12 in Algorithm 1 with "Draw n_shot samples from Be(a_{k_i}) and let the number of 1's be n_1". Note that, if we know the value of a, we know the probability distribution of the outcome of measuring G^{k_i}|Φ⟩, that is, |ϕ_1⟩ with probability a_{k_i} and |ϕ_0⟩ with probability 1 − a_{k_i}, which is equivalent to Be(a_{k_i}). Therefore, with the above replacement, we can produce the output of Algorithm 1 under the probability distribution it obeys. We hereafter call this simulation procedure Algorithm 1′.
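The replacement of the measurement step by classical sampling takes only a few lines; `simulated_shots` below is an illustrative stand-in for step 12 of Algorithm 1.

```python
import numpy as np

def simulated_shots(a, k, n_shot, rng):
    """Classical stand-in for step 12 of Algorithm 1: measuring G^k|Phi>
    n_shot times and counting the |phi_1> outcomes is equivalent to
    drawing n_shot samples from Be(a_k) and counting the 1's."""
    a_k = np.sin((2 * k + 1) * np.arcsin(np.sqrt(a))) ** 2
    return rng.binomial(n_shot, a_k)
```

Since only the distribution of the shot counts enters the rest of the algorithm, the simulated output of Algorithm 1′ has exactly the distribution that Algorithm 1 would have.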
For the estimator â of a, we define its bias as

b(a) := E[â − a] = E[â] − a,

where E[·] denotes the expectation with respect to the randomness of the measurement outcomes in Algorithm 1. In order to estimate the magnitude of the bias for each a, we average the error â − a over many runs of Algorithm 1′; the result is shown in Fig. 1.

FIG. 1. b̄(a), the average of â − a over the runs (Eq. (13)), is plotted in blue, and 2σ(a), twice the standard error of this average (Eq. (14)), is plotted in orange.

B. Reason why the bias is enhanced for specific values of a
We then investigate the reason why the bias is enhanced for specific values of a, fixing a to 0.2505, one of such values. What makes the case of a = 0.2505 different from the others?
To make the investigation as simple as possible, we should note some points. First, we note that it is sufficient to focus on the final round in Algorithm 1. The output of Algorithm 1 is determined by the result of the MLE in the final round. Therefore, as long as the CI obtained in the round just before the final one successfully encloses a, which occurs with probability at least 1 − α, the error is determined by the final round only. Second, we note that the final round is characterized by k_fin, the Grover number in that round, alone: the other quantities that define the procedure in the final round, including R_fin, are automatically determined once k_fin is fixed, as long as the CI of a encloses its true value. We finally make a note on notation: here and hereafter, as in k_fin and R_fin, when we write quantities that have i_fin, the index indicating the final round, in the subscript, we replace i_fin with fin for conciseness.
Based on the above discussion and noting that k_fin is also a random variable affected by the measurement outcomes in the previous rounds, we write the bias as

b(a) = Σ_{k′_fin} p_{k′_fin} b_{k′_fin}(a),  b_{k′_fin}(a) := E[â − a | G_{k′_fin}],

where G_{k′_fin} is the event that k_fin takes the value k′_fin, and p_{k′_fin} is its probability.
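This decomposition is just the law of total expectation over the realized final Grover number. The following toy calculation, with synthetic numbers standing in for the k_fin values and errors of many runs, checks that grouping runs by k_fin and weighting the conditional biases by the realization frequencies reproduces the overall bias.

```python
import numpy as np

rng = np.random.default_rng(3)
# synthetic stand-in for many runs: each run yields a final Grover number
# k_fin and an error err = a_hat - a (illustrative numbers only)
k_fin = rng.choice([100, 150, 300], size=10000, p=[0.5, 0.3, 0.2])
err = rng.normal(loc=1e-4 * (k_fin == 300), scale=1e-3)

b_total = err.mean()
# decomposition b(a) = sum over k of p_k * b_k(a), with p_k the realization
# frequency of k and b_k(a) the mean error conditioned on k_fin = k
b_decomp = sum((k_fin == k).mean() * err[k_fin == k].mean()
               for k in np.unique(k_fin))
```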

1. Bias conditioned on the final Grover number
We then consider how k_fin affects b_{k_fin}(a), the bias conditioned on k_fin. First, we rewrite â, the output of Algorithm 1, in terms of γ̂_fin (Eq. (16)), where γ̂_fin is the value of γ̂_{fin,j} when the termination criterion ∆a_{fin,j} ≤ ϵ holds for the first time. Note that γ^l_{fin,j}, γ^u_{fin,j}, and γ̂_{fin,j} are random variables determined by the measurement outcomes in Algorithm 1, and so is γ̂_fin. When k_fin ≫ 1, which is a typical situation for small ϵ, the most sensitive dependence of â's distribution on k_fin is through the distributions of γ^l_{fin,j}, γ^u_{fin,j}, and γ̂_{fin,j}, while k_fin also affects â through K_fin and R_fin in Eqs. (17) and (18). This is because the distributions of γ̂_{fin,j} etc. change largely even when k_fin changes only slightly. Under the definition in Eq. (8), the distributions of γ̂_{fin,j} etc. are determined by the distribution of â_{k_fin,j} and by p(R_fin), the parity of R_fin. The distribution of â_{k_fin,j} is determined by that of the outcome of the measurement on G^{k_fin}|Φ⟩, which is equivalent to Be(a_{k_fin}), and a_{k_fin} can change largely even by shifting k_fin by 1. When p(R_fin) flips, the form of γ̂_{fin,j} as a function of â_{k_fin,j} also flips, and so does the distribution of γ̂_{fin,j}. These observations imply that we should focus on a_{k_fin} and p(R_fin) as the key factors for b_{k_fin}(a). However, as an equivalent of this pair, we instead focus on how b_{k_fin}(a) is affected by

f_fin := frac((2k_fin + 1)θ_a/(π/2)),

where frac(x) := x − ⌊x⌋ is the fractional part of x ∈ ℝ.
f_fin has a one-to-one correspondence to the pair (a_{k_fin}, p(R_fin)), and taking the single quantity f_fin as the key factor rather than the pair (a_{k_fin}, p(R_fin)) makes the following discussion simpler. Since f_fin is also a quantity changing rapidly with respect to k_fin, in order to understand how it affects b_{k_fin}(a), we temporarily treat k_fin and f_fin as independent variables, even though k_fin determines f_fin. We set k_fin and f_fin separately, and, as an estimate of b_{k_fin}(a), calculate the quantity b(k_fin, f_fin) by the procedure in Algorithm 3. That is, we run one round of IQAE many times for fixed k_fin and f_fin, and take the average of the resultant errors over the runs in which IQAE itself terminated, neglecting the other runs, which move to the next round. We calculate the average only when at least 1,000 runs out of the 10,000 total runs yield outputs other than NaN, since averaging a small number of random results leads to an inaccurate estimate. Note that, in compensation for setting k_fin and f_fin independently, the true amplitude a (now, 0.2505) is slightly adjusted to ã so that Eq. (19) holds. Nevertheless, we expect that b(k_fin, f_fin) gives us a good illustration of the behavior of b_{k_fin}(a). Also note that the setting in Eq. (21) corresponds to the assumption that the true amplitude ã is enclosed in the CI in the previous round.
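The rapid variation of f_fin with k_fin, and its degeneracy at a = 0.25, can be seen directly; the form frac((2k + 1)θ_a/(π/2)) assumed in `f_of` below is the position of (2k + 1)θ_a within its quadrant.

```python
import numpy as np

def f_of(a, k):
    """frac((2k+1) * theta_a / (pi/2)): the position of (2k+1)*theta_a
    within its quadrant (the form assumed here for f_fin)."""
    x = (2 * k + 1) * np.arcsin(np.sqrt(a)) / (np.pi / 2)
    return x - np.floor(x)

# for generic a, f jumps around the whole interval [0, 1] as k varies ...
f_generic = [f_of(0.2505, k) for k in range(100, 110)]
# ... but for a = 0.25 (theta_a = pi/6), f is locked to the grid {0, 1/3, 2/3}
f_special = [f_of(0.25, k) for k in range(1, 20)]
```

This locking to a few values for special a foreshadows the concentration of (k_fin, f_fin) on a few lines discussed below.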
In Figure 2, we show the results for various values of f_fin, with k_fin set to several values. For (k_fin, f_fin) that yields NaN for b(k_fin, f_fin), we do not plot a point.
We can intuitively understand how the bias is generated by noting the following two points.
First, we note that ∆a_{i,j}, the estimated accuracy in the intermediate step of Algorithm 1, is also a random variable. It is determined by â_{k_i,j}, the realized frequency of |ϕ_1⟩ in the repeated measurements on G^{k_i}|Φ⟩, and N_{i,j}, the number of the measurements. How ∆a_{i,j} depends on â_{k_i,j} is understood as follows. Because of the upper bound 1 and the lower bound 0 of â_{k_i,j}, the width of the CI [a^l_{k_i,j}, a^u_{k_i,j}] of a_{k_i} given as Eq. (7) already depends on not only N_{i,j} but also â_{k_i,j}: if â_{k_i,j} is closer to 0 or 1, the CI width becomes smaller. In addition to this, the derivation of ∆a_{i,j} from â_{k_i,j} and [a^l_{k_i,j}, a^u_{k_i,j}] by the nonlinear relationship in Eq. (10) introduces a further dependence of ∆a_{i,j} on â_{k_i,j}. Besides, since â_{k_i,j} has a one-to-one correspondence to â_{i,j}, we can regard ∆a_{i,j} as a function of â_{i,j}. Figure 3 illustrates this. Figure 3(a) shows a^u_{i,j} and a^l_{i,j}, the upper and lower ends of the CI of a, versus the realized value of â_{i,j} after N = 100 shots in a round with the Grover number k = 200, and Figure 3(b) shows ∆a_{i,j} versus â_{i,j}.

Algorithm 3 Estimate the bias with k_fin and f_fin fixed (only the following steps are recoverable)
2: for c = 1, ..., N_run do
3:   Run one round of IQAE (Steps 5-32 in Algorithm 1′), setting k_i ← k_fin, with f_fin as specified, and adjusting a to ã.
4:   if the round ends with ∆a_{i,j} ≤ ϵ then
5:     Let ã_c be the value of â_{i,j} at the end of the round.
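The dependence of the CI width on the realized estimate can be illustrated with a standard binomial interval; the Wilson score interval below is an illustrative stand-in for the CI construction in Algorithm 1 (which differs in detail), but it shows the same qualitative feature: the width shrinks as the estimate approaches 0 or 1.

```python
import numpy as np

def wilson_ci(p_hat, n, z=1.96):
    """Wilson score interval for a binomial proportion: an illustrative
    stand-in for the CI in Algorithm 1, whose width depends on the
    realized estimate p_hat as well as on the shot number n."""
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = (z / denom) * np.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return centre - half, centre + half

def ci_width(p_hat, n):
    lo, hi = wilson_ci(p_hat, n)
    return hi - lo
```

With n fixed, the width is maximal at p_hat = 1/2 and symmetric under p_hat → 1 − p_hat, mirroring the behavior of the CI of a_{k_i} described above.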
The second point is the termination criterion ∆a_{i,j} ≤ ϵ in Step 23 of Algorithm 1, which means that IQAE ends when the estimated accuracy ∆a_{i,j} becomes smaller than ϵ. This criterion, along with the aforementioned dependence of ∆a_{i,j} on â_{i,j}, leads to the bias. If â_{i,j} falls in a region where ∆a_{i,j} is relatively small, Algorithm 1 tends to end early, and, if â_{i,j} falls in a region with larger ∆a_{i,j}, the algorithm tends to take more shots before the end, or even go to the next round. This makes it more probable that the algorithm ends with an â_{i,j} corresponding to a smaller ∆a_{i,j} and outputs such â_{i,j} as â. Such an effect makes the probability distribution of â asymmetric about the point â = a, yielding a nonzero bias b_{k_fin}(a) ≠ 0.
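The mechanism, namely that stopping as soon as the estimated accuracy looks good enough biases the output, appears already in a plain Bernoulli estimation toy model with no Grover structure at all; all numbers below are illustrative choices, not values from the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(7)
p_true, eps, batch = 0.1, 0.03, 50   # illustrative numbers

def one_run():
    """Estimate p by batches of Bernoulli samples, stopping as soon as the
    *estimated* standard error falls below eps: a toy analogue of the
    termination criterion Delta a_{i,j} <= eps."""
    n = n1 = 0
    while True:
        n1 += rng.binomial(batch, p_true)
        n += batch
        p_hat = n1 / n
        if np.sqrt(p_hat * (1 - p_hat) / n) <= eps or n >= 1000:
            return p_hat

# stopping happens preferentially when p_hat is small (the estimated error
# is small for p_hat near 0), so the mean output is pulled below p_true
estimates = np.array([one_run() for _ in range(20000)])
```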
Although this understanding of how the bias arises is an intuitive one, we will indirectly confirm its validity by seeing, in the numerical experiment in Sec. III C, that re-executing the final round without the termination criterion mitigates the bias.
With the above understanding, we can also see why the bias b(k_fin, f_fin) depends on f_fin with k_fin fixed. As explained above, changing f_fin corresponds to changing a. Assuming that â is distributed mainly in the neighborhood of a, the shape of ∆a_{i,j} as a function of â_{i,j} around the point â_{i,j} = a affects the bias, which causes the dependence of the bias on a, and thus on f_fin. For example, if ∆a_{i,j} is increasing in the neighborhood of â_{i,j} = a, the preference of â_{i,j} for values corresponding to small ∆a_{i,j} leads to a negative bias.
We also note that, in Figure 2, the graph of b(k_fin, f_fin) is nearly antisymmetric with respect to reflection about the vertical line f_fin = 1/2, except in the case of k_fin = 300. This antisymmetry is understood as follows. We temporarily ignore the process for the transition to the next round. Then, note that, with k_fin fixed, the transform f_fin → 1 − f_fin conserves a_{k_fin} and flips p(R_fin). Thus, under this transform, the distribution of â_{k_fin,j} is unchanged and the relationship between γ̂_{fin,j} and â_{k_fin,j} in Eq. (9) is switched. Therefore, assuming that k_fin ≫ 1 and neglecting the slight change of a by this transform, we see that this transform flips the distribution of the error â − a: p(â − a; k_fin, f_fin) ≈ p(−(â − a); k_fin, 1 − f_fin), where p(·; k_fin, f_fin) is the probability density of â − a conditioned on k_fin and f_fin. This means that the transform f_fin → 1 − f_fin also flips the bias, as in Eq. (23). In fact, this discussion is not rigorous, since the process for the transition to the next round breaks the symmetry with respect to f_fin → 1 − f_fin. This makes the antisymmetry disappear in Figure 2(b) for k_fin = 300. Nevertheless, the antisymmetry holds over a wide region of (k_fin, f_fin), and becomes a key to the phenomenon that the bias is enhanced only for specific values of a, as seen below.
2. Distribution of (k_fin, f_fin)

Next, we consider the distribution of (k_fin, f_fin). Since f_fin is determined by k_fin with a fixed, and k_fin takes natural numbers, (k_fin, f_fin) is distributed in the 2D plane not continuously but as discrete points. In Figure 4(a), we show the realized values of (k_fin, f_fin) for a = 0.2505 in the 10,000 runs of Algorithm 1′. We see that the plotted points are concentrated on only three lines. This is not a phenomenon observed for general a. For example, Figure 4(b) shows the similar plot for a = 0.2006, for which the points are distributed more broadly. We now plot the distribution of (k_fin, f_fin) on the heatmap of b(k_fin, f_fin), which is highly illustrative for understanding why the bias is enhanced for specific values of a. In Figure 5, we show the results for a = 0.2505, 0.2006, and, in addition, a = 0.25, for which 10,000 runs of Algorithm 1′ are conducted too.
We obtain b(a) by averaging the values of b(k_fin, f_fin) over the realized values of (k_fin, f_fin), represented by the black points, with the weight proportional to the realization frequency, represented by the darkness of the points. In the case of a = 0.2505, as seen above, the realized values of (k_fin, f_fin) are not broadly distributed but located on a few specific lines. Contrary to this, for a = 0.2006, they are distributed broadly.

FIG. 3. In Figure 3(a), the blue (resp. red) solid line is the upper end a^u_{i,j} (resp. lower end a^l_{i,j}) of the CI of the amplitude a versus the realized value of the estimate â_{i,j} after N = 100 shots in an IQAE round with the Grover number k = 200. The black dashed line is a diagonal line just for reference, on which the vertical coordinate is equal to â_{i,j}. Figure 3(b) shows the estimated accuracy ∆a_{i,j}, which is determined by a^u_{i,j}, a^l_{i,j}, and â_{i,j} as Eq. (10), in the same setting. In these figures, the curves are shown over the range of â_{i,j} such that ⌊(2k + 1)θ_{â_{i,j}}/(π/2)⌋ = ⌊(2k + 1)θ_{a=0.2505}/(π/2)⌋. This is the set of the values â_{i,j} can take in the IQAE round if the CI at the beginning of the round encloses a = 0.2505.

FIG. 5. The realized values of (k_fin, f_fin) plotted on the heatmap of b(k_fin, f_fin) (cf. Eq. (19)); now, the case of a = 0.25 is added. Again, the points are in transparent black, and thus darker regions indicate that there are many realizations in them, and lighter regions indicate the opposite. In the heatmap, the color of the regions with NaN b(k_fin, f_fin) is set to white. For a = 0.2505, the realized values of (k_fin, f_fin) are concentrated on a few lines, whereas for a = 0.2006, (k_fin, f_fin) is distributed more broadly. For a = 0.25, the distribution is concentrated on a few lines but has a reflection symmetry about f_fin = 0.5.

Of course, even if Eq. (24) holds and thus the realized values of (k_fin, f_fin) concentrate on a few lines, the values of b(k_fin, f_fin) on the points may accidentally cancel out each other, resulting in a relatively small b(a). Conversely, even if the points are distributed broadly, the distribution is not uniform and the heatmap of b(k_fin, f_fin) is not completely antisymmetric, which means that the bias cancellation is not perfect. Nevertheless, the distribution of realized (k_fin, f_fin) and the cancellation of b(k_fin, f_fin) on the points roughly describe the phenomenon that the bias is enhanced for the specific values of a.
We remark on the case that θ_a can be exactly written in the form lπ/(2m) with small m. For example, a = 0.25 gives θ_a = lπ/(2m) with l = 1 and m = 3, i.e., θ_a = π/6. Although a = 0.25 differs only slightly from a = 0.2505, a = 0.25 does not yield the large bias: for it, b(a = 0.25) = 3.5 × 10^{-6}, which is one order of magnitude smaller than b(a = 0.2505) = 3.7 × 10^{-5}. Figure 5(c) shows the distribution of the realized (k_fin, f_fin) on the heatmap of b(k_fin, f_fin) for a = 0.25. We see that the points are distributed on two horizontal lines³ located at symmetric positions with respect to reflection about the line f_fin = 1/2. Since b(k_fin, f_fin) is nearly antisymmetric about this line, its values at the realized points largely cancel each other out, yielding the relatively small b(a).
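This structure is easy to check numerically. The following sketch (the helper name f_of_k is ours) assumes θ_a = arcsin√a and takes f to be the fractional part frac((2k+1)θ_a/π), the quantity that f_fin reduces to; it confirms that for a = 0.25 only the three values 1/6, 1/2, and 5/6 can occur, placed symmetrically about 1/2:

```python
import math

def f_of_k(a, k):
    # f = frac((2k+1) * theta_a / pi) with theta_a = arcsin(sqrt(a));
    # this is the quantity that f_fin reduces to (cf. Eq. (19))
    theta = math.asin(math.sqrt(a))
    x = (2 * k + 1) * theta / math.pi
    return x - math.floor(x)

# For a = 0.25 we have theta_a = pi/6, so (2k+1) * theta_a / pi = (2k+1)/6:
# only the three values 1/6, 1/2, 5/6 can occur, symmetric about 1/2.
values = sorted({round(f_of_k(0.25, k), 9) for k in range(100)})
```

The three values match the horizontal lines in Figures 4 and 5, and {1/6, 5/6} is the reflection-symmetric pair of lines on which the realized points lie.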

C. Mitigation of the bias
Finally, we propose a simple method to mitigate the bias: just re-executing the final round. Namely, we modify Algorithm 1 as follows. Suppose that we get the CI of a such that Δa_{i,j} ≤ ϵ in the round with the Grover number k_fin and the total shot number N_fin. Then, we perform just one additional round using the same k_fin and N_fin, and let the resultant maximum likelihood estimate â in the added round be the output of the algorithm. In this additional round, we do not impose the termination criterion Δa_{i,j} ≤ ϵ: we make N_fin measurements on G^{k_fin}|Φ⟩ even if Δa_{i,j} ≤ ϵ is satisfied in the middle. We show the modified algorithm as Algorithm 4.

Algorithm 4 Modified IQAE with the final round re-executed
Input: ϵ, α ∈ (0, 1), N_shot ∈ ℕ, r_min > 1
1: Run Algorithm 1. Let the Grover number, the total shot number, and R_i in the final round be k_fin, N_fin, and R_fin, respectively.
2: Iterate generating G^{k_fin}|Φ⟩ and measuring it, N_fin times in total. Let the number of times |ϕ_1⟩ is obtained be N_1.

The reason why we expect that this re-execution-based bias mitigation works is as follows. Recall that, in Algorithm 1, the error is generated only in the final round as long as the second-to-last round yields a CI of a enclosing the true value, and that the bias is induced by the termination criterion Δa_{i,j} ≤ ϵ. We thus consider that running an additional round without the criterion Δa_{i,j} ≤ ϵ and taking its result as the final output will mitigate the bias.

³ Although f_fin can take the value 1/2 for a = 0.25, none of the 10,000 runs ended with f_fin = 1/2, and thus there is no point on the line f_fin = 1/2 in Figure 5(c).
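As an illustration, the sketch below simulates the re-executed final round under simplifying assumptions of ours (a noiseless simulator in which measuring G^k|Φ⟩ yields |ϕ_1⟩ with probability sin²((2k+1)θ_a), and a branch index ⌊(2k+1)θ_a/(π/2)⌋ assumed pinned down by the CI of the preceding rounds); the function names are hypothetical, not taken from the algorithm listings:

```python
import math
import random

def invert_within_branch(f, k, m):
    # sin^2 is monotone on each branch [m*pi/2, (m+1)*pi/2]; recover
    # x = (2k+1)*theta from f = sin^2(x), given the branch index m,
    # and return the corresponding amplitude estimate sin^2(theta)
    y = math.asin(math.sqrt(f)) if m % 2 == 0 else math.acos(math.sqrt(f))
    theta = (m * math.pi / 2 + y) / (2 * k + 1)
    return math.sin(theta) ** 2

def reexecute_final_round(a_true, k_fin, n_fin, rng):
    # Re-run the final round with k_fin and N_fin fixed and no termination
    # criterion: always take all N_fin shots, then estimate a from N1/N_fin.
    theta = math.asin(math.sqrt(a_true))
    p1 = math.sin((2 * k_fin + 1) * theta) ** 2  # Born probability of |phi_1>
    n1 = sum(1 for _ in range(n_fin) if rng.random() < p1)
    # branch index, assumed known from the CI of the preceding rounds
    m = math.floor((2 * k_fin + 1) * theta / (math.pi / 2))
    return invert_within_branch(n1 / n_fin, k_fin, m)
```

Because the branch is fixed before the additional round starts, the estimate depends on the data only through N_1/N_fin, with no early-stopping condition that could truncate the sampling and bias the outcome.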
In Figure 6(a), we plot b(a) in Algorithm 1, which is plotted also in Figure 1, and that in Algorithm 4. As this figure shows, the enhanced biases for the specific values of a in Algorithm 1 are largely reduced in Algorithm 4, which indicates that re-executing the final round mitigates the bias. If we take, among the examined values of a, those for which |b(a)| ≥ 2σ(a), the average and maximum of the bias reduction rates for these values of a are 57.8% and 99.2%, respectively.
An obvious drawback of this bias mitigation method is the increase in the total number of queries to G caused by running an additional round. We have confirmed that, at least in our numerical experiments, this increase is mild. Figure 6(b) shows the average number of queries to G in the 10,000 runs of Algorithm 1 and Algorithm 4. The ratio of the average query numbers is about 1.25 for every examined value of a. This mild increase is because the final round in Algorithm 1 is not dominant among all the rounds with respect to the query number: on average, the final round accounts for about 25% of the total number of queries to G across all the rounds.

IV. SUMMARY
In this paper, we focused on the bias in IQAE, a widely studied version of QAE. We saw that the bias is enhanced for specific values of the estimated amplitude a. The termination criterion that the estimated accuracy Δa_{i,j} of a falls below the predetermined accuracy ϵ is a source of the bias. Decomposing the bias into the biases conditioned on k_fin, the Grover number in the final round, and f_fin, we found a key factor that determines the magnitude of the bias: the distribution of the realized values of (k_fin, f_fin) in the landscape of the conditional bias. Here, f_fin is defined by Eq. (19) and is a main factor determining the probability distribution of the IQAE estimate â and thus its bias. We found that, for a such that Eq. (24) holds with small m and tiny δ, the points of the realized (k_fin, f_fin) are located only on a few lines in the 2D plane, and the conditional biases at these points tend not to cancel each other, resulting in the large bias. We also proposed a simple bias mitigation method: re-executing the final round with the same Grover number k_fin and shot number N_fin. We saw that the resulting increase in the total number of queries to G is mild, about 25%.

FIG. 1. For the various values of the amplitude a, |b(a)|, the absolute value of the average of the errors in the 10,000 runs of IQAE (Eq. (13)), is plotted in blue, and 2σ(a), twice the standard error of this average (Eq. (14)), is plotted in orange. We set ϵ = 0.001 and α = 0.05.
For the various values of a, we ran Algorithm 1′ N_run times and calculated the average of the errors in the runs, b(a) = (1/N_run) Σ_{n=1}^{N_run} (â_n − a), where â_n is the output in the nth run. We can grasp the magnitude of the bias by comparing b(a) and σ(a): if the true bias is zero, |b(a)| is comparable to σ(a) with high probability, e.g., |b(a)| ≤ 2σ(a) with probability about 95%, for sufficiently large N_run. The result is shown in Figure 1. In this figure, we set a to the 201 equally spaced points in [0.001, 0.999], including the ends. The other parameters are set as N_run = 10^4, ϵ = 0.001, and α = 0.05, which also applies hereafter. From this figure, we see that, for a considerable fraction of the examined values of a, |b(a)| exceeds 2σ(a), which implies that â is biased. In particular, |b(a)| takes much larger values for some specific values of a than for the others; that is, the bias is enhanced for some specific values of a.
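This significance test can be sketched as follows. The estimator below is a toy stand-in with a deliberately injected offset (our own construction, not Algorithm 1′), used only to show how |b(a)| > 2σ(a) flags a statistically significant bias:

```python
import math
import random
import statistics

def bias_check(run_estimator, a_true, n_run, rng):
    # Empirical mean error b(a), its standard error sigma(a), and whether
    # |b(a)| > 2*sigma(a), which flags a statistically significant bias.
    errors = [run_estimator(a_true, rng) - a_true for _ in range(n_run)]
    b = statistics.fmean(errors)
    sigma = statistics.stdev(errors) / math.sqrt(n_run)
    return b, sigma, abs(b) > 2 * sigma

def biased_estimator(a, rng, n_shots=1000, offset=5e-3):
    # Toy shot-noise-limited estimator with a deliberate offset,
    # standing in for one run of the amplitude-estimation algorithm.
    hits = sum(1 for _ in range(n_shots) if rng.random() < a)
    return hits / n_shots + offset

rng = random.Random(0)
b, sigma, flagged = bias_check(biased_estimator, 0.25, 10_000, rng)
```

With many runs, σ(a) shrinks as 1/√N_run, so even an offset well below the single-run shot noise is detected.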

FIG. 2. b(k_fin, f_fin) calculated by Algorithm 3 for various values of (k_fin, f_fin). b(k_fin, f_fin) indicates the bias generated in the final round of IQAE, conditioned on the Grover number k_fin and on f_fin in Eq. (19). For (k_fin, f_fin) that yields NaN b(k_fin, f_fin), we do not plot a point.
(a) a^u_{i,j} and a^l_{i,j}, the upper and lower ends of the CI. (b) The estimated accuracy Δa_{i,j}.

FIG. 4. The realized values of the Grover number k_fin in the final round and of f_fin given by Eq. (19) in the 10,000 runs of IQAE for (a) a = 0.2505 and (b) a = 0.2006, plotted as transparent black points. Darker regions, consisting of many overlapped points, indicate that there are many realizations there, and lighter regions indicate the opposite. In Figure 4(a), the blue, red, and orange solid lines represent frac((2k_fin + 1)θ_a/π) for k_fin ≡ 0, 1, and 2 modulo 3, respectively, and the blue, red, and orange horizontal dashed lines represent f_fin = 1/6, 1/2, and 5/6, respectively.

FIG. 5. The same realized values of (k_fin, f_fin) as in Figure 4, plotted on the heatmap of b(k_fin, f_fin) calculated by Algorithm 3. b(k_fin, f_fin) indicates the bias generated in the final round of IQAE, conditioned on the Grover number k_fin and on f_fin in Eq. (19). Now, the case of a = 0.25 is added. Again, the points are in transparent black, and thus darker regions indicate that there are many realizations there and lighter regions indicate the opposite. In the heatmap, the color of the regions with NaN b(k_fin, f_fin) is set to white. For a = 0.2505, the realized values of (k_fin, f_fin) are concentrated on a few lines, whereas for a = 0.2006, (k_fin, f_fin) is distributed more broadly. For a = 0.25, the distribution is concentrated on a few lines but has a reflection symmetry about f_fin = 0.5.
FIG. 6. For the various values of the amplitude a, we plot (a) |b(a)|, the absolute value of the average of the errors in the 10,000 runs of IQAE (Eq. (13)), and (b) the average number of queries to G in the same runs. The blue line denotes the result for Algorithm 1, which does not incorporate bias mitigation, and the orange line denotes that for Algorithm 4, which does.