This section presents quantum circuits for the state preparation in two methods: the PRN-type and the AE-type methods.
4.1 Elementary gate
Before presenting our proposals, we list the elementary gates used in the following discussion:

Adder: \(|x\rangle |y\rangle \rightarrow |x+y\rangle |y\rangle \)

Controlled Adder: \(|c\rangle |x\rangle |y\rangle \rightarrow \left\{ \begin{array}{l@{\quad}l} |c\rangle |x+y\rangle |y\rangle ; & \text{for } c=1, \\ |c\rangle |x\rangle |y\rangle ; & \text{for } c=0 \end{array} \right. \)

Multiplier: \(|x\rangle |y\rangle |z\rangle \rightarrow |x\rangle |y\rangle |z+xy\rangle \)

Divider: \(|x\rangle |y\rangle |0\rangle \rightarrow |x\rangle |y\rangle |x/y\rangle \)
Implementations of these elementary arithmetic operations have been studied in many works [26–46]. With these gates, we can construct the other arithmetic operations we use. For example, subtraction \(|x\rangle |y\rangle \rightarrow |x-y\rangle |y\rangle \) can be done as addition of the 2’s complement of y. The 2’s complement of an n-bit number y is defined as \(2^{n}-y\), which is equivalent to −y modulo \(2^{n}\). Moreover, comparison \(|x\rangle |y\rangle |z\rangle \rightarrow |x\rangle |y\rangle |z\oplus (x>y)\rangle \) can be done as subtraction in the 2’s complement method, since the most significant bit represents whether the result of the subtraction is positive or negative. Thus, a comparator is constructed from two adders, including uncomputation.
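The following is a minimal classical sketch of the 2’s complement subtraction and the sign-bit comparison described above. It acts on ordinary integers rather than quantum registers, and the function names and the choice of one extra sign bit are illustrative assumptions, not part of the circuits in the cited references.

```python
def twos_complement(y: int, n: int) -> int:
    """Return 2^n - y reduced modulo 2^n, i.e. -y modulo 2^n."""
    return (-y) % (1 << n)


def subtract(x: int, y: int, n: int) -> int:
    """Compute x - y modulo 2^n as an addition of the 2's complement of y."""
    return (x + twos_complement(y, n)) % (1 << n)


def greater_than(x: int, y: int, n: int) -> int:
    """Return 1 if x > y for n-bit unsigned x and y, else 0.

    The difference y - x is taken on n + 1 bits (an assumption of this
    sketch) so that the most significant bit acts as a sign bit.
    """
    diff = subtract(y, x, n + 1)
    return diff >> n  # 1 exactly when y - x is negative
```

For example, `subtract(3, 5, 4)` returns 14, which is −2 modulo \(2^{4}\).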
We also note that the above multiplier uses two registers as operands and outputs the product into another register. However, we need a self-update type of multiplier, which overwrites one of the input registers with the product. Such an operation is realized by the following trick:
$$\begin{aligned} |x\rangle |y\rangle |0\rangle \rightarrow |x\rangle |y\rangle |xy\rangle \rightarrow |xy\rangle |y\rangle |x\rangle \rightarrow |xy\rangle |y\rangle |0\rangle . \end{aligned}$$
(16)
Here, the first step is original multiplication. The second step is swap between the first and third registers. The third step is the inverse operation of division.
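As a rough illustration, the three steps of the self-update multiplication can be traced on classical integer registers. This is only a sketch: the function names are hypothetical, y is assumed to be nonzero, and the division is assumed to be exact.

```python
def multiply(x: int, y: int, z: int):
    """Multiplier: (x, y, z) -> (x, y, z + x*y)."""
    return x, y, z + x * y


def swap_first_and_third(x: int, y: int, z: int):
    """Swap the first and third registers."""
    return z, y, x


def divide_inverse(x: int, y: int, q: int):
    """Inverse of the divider: (x, y, x/y) -> (x, y, 0).

    Only valid when the third register really holds x / y.
    """
    assert y != 0 and q * y == x
    return x, y, 0


def self_update_multiply(x: int, y: int):
    """(x, y, 0) -> (x*y, y, 0), following the three arrows of Eq. (16)."""
    regs = multiply(x, y, 0)             # (x, y, xy)
    regs = swap_first_and_third(*regs)   # (xy, y, x)
    regs = divide_inverse(*regs)         # (xy, y, 0)
    return regs
```

The ancillary third register returns to 0, so it can be reused by later operations.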
4.2 PRN-type method
4.2.1 Calculation flow
We present the calculation flow of the PRN-type state preparation for pricing in the LV model. Our purpose is to estimate Eq. (10) by the PRN-type Monte Carlo method presented in Sect. 3.2. We show the detailed calculation flow to realize operation (15) in the case where the desired value is given by Eq. (10).
Before presenting the calculation flow, we explain our setup. We generate \(N_{\mathrm{samp}}=2^{n_{\mathrm{samp}}} \) sample paths of length \(n_{\mathrm{t}}\) by using the PSNRNs. In this algorithm, we prepare the following registers:

\(R_{\mathrm{samp}}\) is a register for an index of the sample path and consists of \(n_{\mathrm{samp}}\) qubits.

\(R_{W}\) is a register for a PSNRN used to calculate the asset price evolution.

\(R_{S}\) is a register for the value of the asset price.

\(R_{\mathrm{payoff}}\) is a register for the payoffs.
We note that \(R_{W}\) corresponds to the PRN register, and \(R_{\mathrm{payoff}}\) corresponds to the integrand register in Sect. 3.2. On the other hand, \(R_{S}\) is a tailored ancillary register for calculating the specific integrand and has no counterpart. Although some ancillary registers are needed in addition to the above registers, we abbreviate them in the main calculation flow.
We assume that the following gates are available to generate a sequence of PSNRNs.

\(J_{W}\) acts on \(R_{\mathrm{samp}}\otimes R_{W}\) and sets the initial value of the PSNRN subsequence: \(J_{W}|i\rangle |0\rangle =|i\rangle |x_{in_{\mathrm{t}}+1}\rangle \), where \(n_{\mathrm{t}}\) is the number of time steps.

\(P_{W}\) advances a PSNRN sequence by one step: \(P_{W}|x_{j}\rangle = |x_{j+1}\rangle \), where \(x_{j}\) is the jth element of the PSNRN sequence.
Applying these gates to a superposition of \(N_{\mathrm{samp}} \) states, we obtain \(N_{\mathrm{samp}} \) PSNRN subsequences:
$$\begin{aligned} \frac{1}{\sqrt{N_{\mathrm{samp}}}}\sum_{i=0}^{N_{\mathrm{samp}}-1} |i\rangle |0\rangle &\xrightarrow{J_{W}} \frac{1}{\sqrt{N_{\mathrm{samp}}}} \sum _{i=0}^{N_{\mathrm{samp}}-1}|i\rangle |x_{in_{\mathrm{t}}+1}\rangle \\ & \xrightarrow{P_{W}} \frac{1}{\sqrt{N_{\mathrm{samp}}}}\sum _{i=0}^{N_{ \mathrm{samp}}-1}|i\rangle |x_{in_{\mathrm{t}}+2}\rangle \\ & \xrightarrow{P_{W}} \dots \xrightarrow{P_{W}} \frac{1}{\sqrt{N_{\mathrm{samp}}}}\sum_{i=0}^{N_{\mathrm{samp}}-1} |i\rangle |x_{in_{\mathrm{t}}+n_{\mathrm{t}}}\rangle . \end{aligned}$$
(17)
We also use gate \(U_{j}\) acting on \(R_{W} \otimes R_{S} \otimes R_{\mathrm{payoff}}\), which calculates the jth time step of asset price evolution and the payoff as follows:
$$\begin{aligned} U_{j}|x_{j}\rangle |S_{t_{j-1}}\rangle |V_{j-1}\rangle &=|x_{j}\rangle |S_{t_{j}}\rangle \bigl|V_{j-1}+f^{\mathrm{pay}}_{j}(S_{t_{j}})\bigr\rangle . \end{aligned}$$
(18)
In other words, \(U_{j}\) performs the time evolution (8) by using the value on \(R_{W}\) as \(w_{j}\). After that, \(U_{j}\) calculates the payoff at time \(t_{j}\) and adds its value into \(R_{\mathrm{payoff}}\). The concrete implementation of these gates is presented in the next subsection.
The calculation flow of the PRN-type method is as follows:

1
Initialize \(R_{S}\) to \(|S_{t_{0}}\rangle \) and the other registers to \(|0\rangle \).

2
Generate \(\frac{1}{\sqrt{N_{\mathrm{samp}}}}\sum_{i=0}^{N_{\mathrm{samp}}-1} |i\rangle \) on \(R_{\mathrm{samp}}\). This is done by applying a Hadamard gate to each qubit of \(R_{\mathrm{samp}}\).

3
Apply \(J_{W}\) to \(R_{\mathrm{samp}}\otimes R_{W}\). This step sets the initial value of the PSNRN subsequence.

4
Apply \(U_{j}\) to \(R_{W} \otimes R_{S} \otimes R_{\mathrm{payoff}}\) to simulate asset price evolution.

5
Apply \(P_{W}\) to \(R_{W}\), which updates \(R_{W}\) from \(x_{in_{\mathrm{t}}+1}\) to \(x_{in_{\mathrm{t}}+2}\).

6
Iterate operations 4–5 \(n_{\mathrm{t}}\) times.
The flow of the corresponding state transformations is as follows:
$$\begin{aligned}& |0\rangle |0\rangle |S_{t_{0}}\rangle |0\rangle \\& \quad \xrightarrow{2} \frac{1}{\sqrt{N_{\mathrm{samp}}}}\sum_{i=0}^{N_{\mathrm{samp}}-1} |i\rangle |0\rangle |S_{t_{0}}\rangle |0\rangle \\& \quad \xrightarrow{3} \frac{1}{\sqrt{N_{\mathrm{samp}}}}\sum_{i=0}^{N_{\mathrm{samp}}-1} |i\rangle |x_{in_{\mathrm{t}}+1}\rangle |S_{t_{0}}\rangle |0\rangle \\& \quad \xrightarrow{4} \frac{1}{\sqrt{N_{\mathrm{samp}}}}\sum_{i=0}^{N_{\mathrm{samp}}-1} |i\rangle |x_{in_{\mathrm{t}}+1}\rangle \bigl|S^{(i)}_{t_{1}}\bigr\rangle \bigl|f^{\mathrm{pay}}_{1}\bigl(S^{(i)}_{t_{1}} \bigr)\bigr\rangle \\& \quad \xrightarrow{5} \frac{1}{\sqrt{N_{\mathrm{samp}}}}\sum_{i=0}^{N_{\mathrm{samp}}-1} |i\rangle |x_{in_{\mathrm{t}}+2}\rangle \bigl|S^{(i)}_{t_{1}}\bigr\rangle \bigl|f^{\mathrm{pay}}_{1}\bigl(S^{(i)}_{t_{1}} \bigr)\bigr\rangle \\& \quad \xrightarrow{6} \dots \\& \quad \xrightarrow{6} \frac{1}{\sqrt{N_{\mathrm{samp}}}}\sum_{i=0}^{N_{\mathrm{samp}}-1} |i\rangle |x_{in_{\mathrm{t}}+n_{\mathrm{t}}}\rangle \bigl|S^{(i)}_{t_{n_{\mathrm{t}}}}\bigr\rangle \Biggl|\sum_{j=1}^{n_{\mathrm{t}}}f^{\mathrm{pay}}_{j} \bigl(S^{(i)}_{t_{j}}\bigr)\Biggr\rangle , \end{aligned}$$
(19)
where the first, second, third and fourth kets correspond to \(R_{\mathrm{samp}}\), \(R_{W}\), \(R_{S}\) and \(R_{\mathrm{payoff}}\), respectively. The quantum circuit realizing the flow (19) is schematically shown in Fig. 1.
The implementation of \(U_{j}\) is also shown in Fig. 2, where the subroutine gates \(V^{(j)}_{1},\dots ,V^{(j)}_{n_{S}}\) are used to update the asset price according to Eqs. (7) and (8), and the gate \({\mathrm{payoff}}_{j}\) calculates \(f^{\mathrm{pay}}_{j}(S^{(i)}_{t_{j}})\) and adds its value into \(R_{\mathrm{payoff}}\). The subroutine \(V^{(j)}_{k}\) realizes the following three operations:

Checks whether the asset price \(R_{S}\) is in the kth interval \([s_{j,k-1},s_{j,k})\).

If that is the case, updates the asset price by Eq. (8) with \(\sigma (t_{j} ,S^{(i)}_{t_{j}}) = a_{j, k}S^{(i)}_{t_{j}} + b_{j,k}\).

Clears all the intermediate output.
This procedure requires three ancillary registers, \(R_{\mathrm{count}}\), \(R_{S^{\prime }}\) and \(R_{g}\). \(R_{\mathrm{count}}\) stores an indicator of whether the jth step of evolution has already been done. If the jth update has already been done, the asset price is not updated, which is necessary to avoid double updating in a single step. \(R_{g}\) stores the check result.
We note that there is a restriction on implementing the LV model in the PRN-type method. Through operations \(V^{(j)}_{1},\dots ,V^{(j)}_{n_{S}+1}\), the state is transformed from \(|j\rangle |S^{(i)}_{t_{j}}\rangle \) to \(|j+1\rangle |S^{(i)}_{t_{j+1}}\rangle \), where the first and second kets represent the states of \(R_{\mathrm{count}}\) and \(R_{S}\), respectively, and unchanged registers are omitted. Because the operation is unitary, this map must be a one-to-one correspondence, which restricts the parameters. As shown in the Appendix, unitarity is guaranteed if we set the parameters \(a_{i,j}\) and \(b_{i,j}\) so that \(\sigma (t,S)\) is continuous with respect to S and set \(\Delta t_{j}\) sufficiently small.
4.2.2 Implementation of subroutines
We now consider how to implement the subroutines used in the PRN-type method by the arithmetic operations in Sect. 4.1.
Implementation of \(V^{(j)}_{k}\)
At the start of \(V^{(j)}_{k}\), \(R_{\mathrm{count}}\) takes \(|j\rangle \) or \(|j+1\rangle \), and the other registers take \(|0\rangle \). Then, the detailed calculation flow of \(V^{(j)}_{k}\) is as follows:

1.
Check whether \(R_{\mathrm{count}}\) equals j and \(R_{S}\) is in \([s_{j,k-1},s_{j,k})\). If the check is passed, flip \(R_{g}\).

2.
If \(R_{g}\) is 1, update \(R_{S}\) as
$$\begin{aligned} S_{t_{j}}\rightarrow S_{t_{j+1}}=S_{t_{j}}+(a_{j,k}S_{t_{j}}+b_{j,k}) \sqrt{\Delta t_{j}}x_{in_{\mathrm{t}}+j}, \end{aligned}$$
(20)
where \(x_{in_{\mathrm{t}}+j}\) is the value on \(R_{W}\), and add 1 to \(R_{\mathrm{count}}\).

3.
Calculate
$$\begin{aligned} \frac{S-b_{j,k}\sqrt{\Delta t_{j}}x_{in_{\mathrm{t}}+j}}{1+a_{j,k}\sqrt{\Delta t_{j}}x_{in_{\mathrm{t}}+j}} \end{aligned}$$
(21)
into \(R_{S^{\prime }}\), where S is the value on \(R_{S}\).

4.
If \(R_{\mathrm{count}}\) is \(j+1\) and \(R_{S^{\prime }}\) is in \([s_{j,k-1},s_{j,k})\), flip \(R_{g}\). This uncomputes \(R_{g}\).

5.
Do the inverse operation of 3.
If and only if the jth step has not been done and the asset price is in \([s_{j,k-1},s_{j,k})\), the asset price is updated with the LV function \(a_{j,k}S+b_{j,k}\). To realize this conditional update, the check result is output to \(R_{g}\), and the gate performing update (20) is controlled by \(R_{g}\). The increment of \(R_{\mathrm{count}}\) is also controlled by \(R_{g}\) so that \(R_{\mathrm{count}}\) indicates the completion of the jth update. Steps 3–5 are necessary to clear \(R_{g}\). From the result of Step 3, we can determine whether the update has been done in Step 2. In Step 4, \(R_{g}\) is flipped if and only if it is \(|1\rangle \), so it goes back to the initial state \(|0\rangle \). In summary, through the sequential operation of \(V^{(j)}_{1},\dots ,V^{(j)}_{n_{S}+1}\), \(R_{S}\) is updated only once at the appropriate \(V^{(j)}_{k}\), \(R_{\mathrm{count}}\) is updated from \(|j\rangle \) to \(|j+1\rangle \), and all intermediate outputs on ancillary registers are cleared. See also Fig. 3.
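The five steps of \(V^{(j)}_{k}\) can be emulated classically to see that \(R_{g}\) is cleared and that \(R_{S}\) is updated exactly once. The sketch below is only an illustration under stated assumptions: the grid, the coefficients (chosen so that \(\sigma\) is continuous), the value of \(\sqrt{\Delta t_{j}}\) and all names are hypothetical, the interval index k is 0-based, and exact rational arithmetic replaces fixed-point registers.

```python
from fractions import Fraction as F

# Hypothetical grid s_0 < s_1 < s_2 < s_3 and LV coefficients (not from the paper).
GRID = [F(0), F(50), F(100), F(200)]
A = [F(1, 5), F(1, 10), F(1, 20)]   # a_k; sigma = a_k * S + b_k is continuous
B = [F(0), F(5), F(10)]             # b_k
SQRT_DT = F(1, 10)                  # stands in for sqrt(Delta t_j)


def in_interval(s, k):
    """True if s lies in the k-th interval [s_k, s_{k+1}) (0-based k)."""
    return GRID[k] <= s < GRID[k + 1]


def apply_V(state, j, k, x):
    """Emulate one V^{(j)}_k on (count, S, g) with PSNRN value x."""
    count, S, g = state
    a, b = A[k], B[k]
    # Step 1: flip g if the j-th update is pending and S is in interval k.
    if count == j and in_interval(S, k):
        g ^= 1
    # Step 2: controlled update of S, Eq. (20), and increment of the counter.
    if g == 1:
        S = S + (a * S + b) * SQRT_DT * x
        count += 1
    # Step 3: compute the candidate pre-image, Eq. (21).
    S_prime = (S - b * SQRT_DT * x) / (1 + a * SQRT_DT * x)
    # Step 4: uncompute g using the pre-image.
    if count == j + 1 and in_interval(S_prime, k):
        g ^= 1
    # Step 5: inverse of Step 3 (the register for S' returns to 0).
    S_prime = 0
    return count, S, g


def evolve_one_step(S, j, x):
    """Apply V^{(j)}_k for all intervals in sequence, starting from g = 0."""
    state = (j, S, 0)
    for k in range(len(A)):
        state = apply_V(state, j, k, x)
    return state
```

With these particular parameters the map from the old to the new asset price is strictly increasing, so Step 4 never flips \(R_{g}\) in an interval other than the one where the update occurred.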
Most subparts of \(V^{(j)}_{k}\) can be constructed from arithmetic operations: addition, subtraction, multiplication, division, and comparison. For example, let us consider the gate \(z\leftarrow z \oplus (x=j \ \mathrm{and} \ y\in I)\), which is divided into the following two parts. The first part checks whether the value on \(R_{\mathrm{count}}\) equals j. This check can be done by the multi-controlled Toffoli gate, which is studied in Refs. [18, 47, 48]. The second part checks whether the asset price is in a given interval, which can be constructed from two comparisons. Combining these, the gate \(z\leftarrow z \oplus (x=j \ \mathrm{and} \ y\in I)\) is constructed as shown in Fig. 4. Note that the bitwise flips \(X^{1-j_{0}} \otimes \cdots \otimes X^{1-j_{n_{x}-1}}\) are applied before the multi-controlled Toffoli gate. Here, \(j_{a}\) is the ath digit of the binary representation of j, so the ath qubit is flipped if and only if \(j_{a}=0\). This converts \(|x\rangle \) to \(|1\rangle \dots |1\rangle \) if and only if \(x=j\).
The operation \(x\leftarrow x+(ax+b)y\) in Fig. 3 can be realized as follows:
$$\begin{aligned} |x\rangle |y\rangle |0\rangle & \rightarrow |x\rangle |y\rangle |1\rangle \\ & \rightarrow |x\rangle |y\rangle |1+ay\rangle \\ & \rightarrow |(1+ay)x\rangle |y\rangle |1+ay\rangle \\ & \rightarrow |(1+ay)x+by\rangle |y\rangle |1+ay\rangle \\ & \rightarrow |(1+ay)x+by\rangle |y\rangle |0\rangle , \end{aligned}$$
(22)
where the third ket corresponds to an ancillary register. The first step is just setting a constant on the ancillary register. The second step is the multiplication of y by a, with the result added to the ancillary register. The third step is the self-update multiplication. The fourth step is the multiplication of y by b, with the result added to the first register, and the final step is the uncomputation of the first and second steps. Note that this is done under control by \(R_{g}\). In order for this to be controlled, it is sufficient to control only the second, fourth and final arrows because the third arrow becomes multiplication by 1 without the second. Also note that multiplication by an n-bit constant can be done by n adders, that is, n shift-and-adds: \(ax=\sum_{i=0}^{n-1}{a_{i}2^{i}x}\), where \(a_{i}\) is the ith bit of a. This method saves qubits compared with the case of using a multiplier, where we need to hold a on an ancillary register.
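The shift-and-add decomposition \(ax=\sum_{i=0}^{n-1}a_{i}2^{i}x\) can be checked with a short classical sketch. The function name is hypothetical, and plain integers stand in for the registers; each loop iteration corresponds to one adder whose presence is fixed by the classical bit \(a_{i}\).

```python
def multiply_by_constant(x: int, a: int, n: int) -> int:
    """Multiply x by an n-bit classical constant a using shift-and-add."""
    acc = 0
    for i in range(n):
        if (a >> i) & 1:      # a_i, the i-th bit of the constant
            acc += x << i     # add 2^i * x to the accumulator
    return acc
```

Because a is known classically, the adders for bits with \(a_{i}=0\) can simply be omitted, and no register is needed to hold a.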
The operation \(x\leftarrow (x-by)/(1+ay)\) in Fig. 3 is done as follows:
$$\begin{aligned} |x\rangle |y\rangle |0\rangle |0\rangle & \rightarrow |x\rangle |y\rangle |1\rangle |0\rangle \\ & \rightarrow |x\rangle |y\rangle |1+ay\rangle |0\rangle \\ & \rightarrow |x-by\rangle |y\rangle |1+ay\rangle |0\rangle \\ & \rightarrow |x-by\rangle |y\rangle |1+ay\rangle |(x-by)/(1+ay)\rangle , \end{aligned}$$
(23)
where the first, second, third and fourth kets correspond to \(R_{S}\), \(R_{W}\), an ancillary register and \(R_{S^{\prime }}\), respectively. The first and second steps are the same as in Eq. (22), the third step is the multiplication by −b, and the final step is division. Here, we do not have to uncompute \(R_{S}\) and the ancillary register because the whole of this operation is uncomputed soon after in \(V^{(j)}_{k}\).
Implementation of \(J_{W}\) and \(P_{W}\)
In Ref. [6], the implementation of PRNs on quantum circuits is based on the permuted congruential generator (PCG) [49], which is a PRN generation algorithm with small memory requirements. We use the following two gates to run PCG: (i) \(J_{\mathrm{PRN}}\) lets the PRN sequence jump to the \((in_{\mathrm{t}}+1)\)th element. (ii) \(P_{\mathrm{PRN}}\) advances the PRN sequence by one step. Since PCG basically generates uniform PRNs, we transform them to PSNRNs by inverse transform sampling. The implementations of \(J_{W}\) and \(P_{W}\) are schematically shown in Fig. 5.
Although we refer to Ref. [6] for the detail of the implementation of the PRN generator, we here briefly explain it. PCG is a combination of linear congruential generator (LCG) and permutation of bit string. For LCG, update of the PRN sequence is done by
$$\begin{aligned} x_{n+1} = ax_{n} + c \bmod N, \end{aligned}$$
(24)
where a and N are positive integers and c is a non-negative integer. From the above equation, the nth element of the sequence is computed from the initial value \(x_{0}\) by
$$\begin{aligned} x_{n} = a^{n}x_{0} + \frac{c(a^{n}-1)}{a-1} \bmod N. \end{aligned}$$
(25)
We can implement Eqs. (24) and (25) using only controlled adders. According to Ref. [26], the modular adder can be constructed from 5 plain adders. Modular multiplication by an n-bit constant can be done as n modular shift-and-adds. Modular division by a constant \(a-1\) can be done as modular multiplication by an integer β such that \(\beta (a-1)=1 \bmod N\). Modular exponentiation \(a^{x} \bmod N\) is computed as a sequence of controlled modular multiplications [26]. We do not explain the permutation here; see Ref. [6] for the details. We only remark that it is implemented by a simple circuit; for example, Xorshift is implemented as a sequence of CNOT gates.
The step-by-step transformation implementing Eq. (24) is as follows:
$$\begin{aligned} |x_{n}\rangle |0\rangle |0\rangle & \rightarrow |x_{n}\rangle |ax_{n} \bmod N\rangle |0\rangle \\ & \rightarrow |0\rangle |ax_{n} \bmod N\rangle |0\rangle \\ & \rightarrow |0\rangle |ax_{n} \bmod N\rangle |c\rangle \\ & \rightarrow |0\rangle |ax_{n}+c \bmod N\rangle |c\rangle \\ & \rightarrow |0\rangle |ax_{n}+c \bmod N\rangle |0\rangle \\ & = |0\rangle |x_{n+1}\rangle |0\rangle . \end{aligned}$$
(26)
Here, the first register is \(R_{\mathrm{PRN}}\), and the other registers are ancillary registers. The first step is modular multiplication. The second step is the inverse modular multiplication by an integer α such that \(a\alpha = 1 \bmod N\), which is necessary to avoid the increase of ancillae. The third step is the load of c into an ancillary register, the fourth step is modular addition, and the last step is to unload. Equation (25) progresses as follows:
$$\begin{aligned} & |n\rangle |0\rangle |0\rangle |0\rangle \\ & \quad \rightarrow |n\rangle \bigl|a^{n} \bmod N\bigr\rangle |0\rangle |0\rangle \\ & \quad \rightarrow |n\rangle \bigl|a^{n} \bmod N\bigr\rangle \biggl| \biggl(x_{0}+\frac{c}{a-1} \biggr)a^{n} \bmod N\biggr\rangle |0\rangle \\ & \quad \rightarrow |n\rangle \bigl|a^{n} \bmod N\bigr\rangle \biggl| \biggl(x_{0}+\frac{c}{a-1} \biggr)a^{n} \bmod N\biggr\rangle \biggl|\frac{c}{a-1}\biggr\rangle \\ &\quad \rightarrow |n\rangle \bigl|a^{n} \bmod N\bigr\rangle \biggl| \biggl(x_{0}+\frac{c}{a-1} \biggr)a^{n} - \frac{c}{a-1} \bmod N\biggr\rangle \biggl|\frac{c}{a-1}\biggr\rangle \\ & \quad \rightarrow |n\rangle |0\rangle \biggl| \biggl(x_{0}+ \frac{c}{a-1} \biggr)a^{n} - \frac{c}{a-1} \bmod N\biggr\rangle |0\rangle \\ & \quad = |n\rangle |0\rangle |x_{n}\rangle |0\rangle , \end{aligned}$$
(27)
where the first and third registers are \(R_{\mathrm{samp}}\) and \(R_{\mathrm{PRN}}\), and the other registers are ancillary registers. The first step is modular exponentiation, the second step is modular multiplication, the third step is loading, the fourth step is modular addition, and the last step is uncomputation of the first and third steps.
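The jump formula (25) can be checked against repeated application of the update (24) with a small classical sketch. The parameter values below are hypothetical and are not those of PCG or Ref. [6]; the modulus is chosen prime only so that the inverse β of \(a-1\) modulo N is guaranteed to exist. The three-argument `pow` with exponent −1 requires Python 3.8 or later.

```python
# Hypothetical LCG parameters for illustration only.
N = 2**31 - 1      # prime modulus, so a - 1 is invertible modulo N
A_MULT = 48271     # multiplier a
C_INC = 12345      # increment c


def lcg_step(x: int) -> int:
    """One update of Eq. (24): x -> a*x + c mod N."""
    return (A_MULT * x + C_INC) % N


def lcg_jump(x0: int, n: int) -> int:
    """Jump to the n-th element using the grouping of Eq. (27).

    x_n = (x_0 + c/(a-1)) * a^n - c/(a-1) mod N, where division by a - 1
    is multiplication by beta with beta * (a - 1) = 1 mod N.
    """
    beta = pow(A_MULT - 1, -1, N)   # modular inverse of a - 1
    shift = (C_INC * beta) % N      # c / (a - 1) mod N
    a_n = pow(A_MULT, n, N)         # modular exponentiation a^n mod N
    return ((x0 + shift) * a_n - shift) % N
```

Calling `lcg_jump(x0, i * n_t + 1)` for a sample index i and a hypothetical path length `n_t` would play the role of \(J_{\mathrm{PRN}}\), while `lcg_step` plays the role of \(P_{\mathrm{PRN}}\) without the permutation stage.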
Implementation of \(\Phi _{\mathrm{SN}}^{-1}\)
We also need a gate to calculate \(\Phi _{\mathrm{SN}}^{-1}\), the inverse function of the CDF of the standard normal distribution. We adopt the method in Ref. [50], where \(\Phi _{\mathrm{SN}}^{-1}\) is approximated by a piecewise polynomial function. Let us set the number \(n_{\mathrm{ICDF}}\) of intervals to be 109 and the polynomials to be cubic, that is, \(\Phi _{\mathrm{SN}}^{-1}\) is approximated as
$$\begin{aligned} \Phi _{\mathrm{SN}}^{-1}(x) \approx c_{m,3}x^{3}+c_{m,2}x^{2}+c_{m,1}x+c_{m,0} \end{aligned}$$
(28)
in \(x^{\mathrm{ICDF}}_{m-1}\le x< x^{\mathrm{ICDF}}_{m}\), where \(\{x^{\mathrm{ICDF}}_{m} \}_{m=0}^{n_{\mathrm{ICDF}}} \) are the end points of the intervals. This approximation achieves an error smaller than \(10^{-6}\). Such a piecewise cubic function can be implemented as in Fig. 6. The sequence of comparators and “Load \(c_{m,i}\)’s” gates loads the appropriate values of \(c_{m,0},\dots ,c_{m,3}\) into the registers \(R_{c,0},\dots ,R_{c,3}\), respectively, as explained later. After the coefficients are loaded, the cubic function is calculated by Horner’s method, which is based on the following representation
$$\begin{aligned} \bigl((c_{m,3}x + c_{m,2})x + c_{m,1}\bigr)x + c_{m,0}. \end{aligned}$$
(29)
Horner’s method is realized only by adders and multipliers as presented in the latter half of the circuit in Fig. 6.
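A minimal classical sketch of the piecewise cubic evaluation by Horner’s method, Eq. (29), is given below. The breakpoints and coefficient table passed to it are placeholders supplied by the caller; they are not the 109-interval fit of Ref. [50], and the function names are hypothetical.

```python
from bisect import bisect_right


def horner_cubic(coeffs, x):
    """Evaluate c3*x^3 + c2*x^2 + c1*x + c0 with three multiply-add steps."""
    c0, c1, c2, c3 = coeffs
    return ((c3 * x + c2) * x + c1) * x + c0


def piecewise_cubic(breakpoints, table, x):
    """Pick the interval [breakpoints[m], breakpoints[m+1]) containing x
    and evaluate its cubic; values outside the grid use the nearest interval."""
    m = bisect_right(breakpoints, x) - 1
    m = min(max(m, 0), len(table) - 1)
    return horner_cubic(table[m], x)
```

For instance, `horner_cubic((1, 2, 3, 4), 2)` evaluates \(4\cdot 2^{3}+3\cdot 2^{2}+2\cdot 2+1\) using only three multiplications and three additions.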
Let us explain the way to load the appropriate coefficients. Comparing the input value x of \(R_{\mathrm{PRN}}\) and the grid point \(x^{\mathrm{ICDF}}_{m}\), the mth comparator flips a qubit on \(R_{g}\) if \(x< x^{\mathrm{ICDF}}_{m}\). The register \(R_{g}\) controls the activation of the “Load \(c_{m,i}\)’s” gate, that is, the “Load \(c_{m,i}\)’s” gate is activated if \(R_{g}\) is 1 at the mth step. If \(x \ge x^{\mathrm{ICDF}}_{n_{\mathrm{ICDF}}}\), only the “Load \(c_{n_{\mathrm{ICDF}}+1,i}\)’s” gate is activated, and \(c_{n_{\mathrm{ICDF}}+1,0},\dots ,c_{n_{\mathrm{ICDF}}+1,3}\) are loaded to the registers. However, if \(x^{\mathrm{ICDF}}_{n_{\mathrm{ICDF}}-1} \le x < x^{\mathrm{ICDF}}_{n_{ \mathrm{ICDF}}}\), both the “Load \(c_{n_{\mathrm{ICDF}},i}\)’s” and “Load \(c_{n_{\mathrm{ICDF}}+1,i}\)’s” gates are performed. Hence, we have to set “Load \(c_{n_{\mathrm{ICDF}},i}\)’s” to compensate for the effect of “Load \(c_{n_{\mathrm{ICDF}}+1,i}\)’s”. More generally, the activated gates are “Load \(c_{m,i}\)’s” of \(m=M, M+2, \dots , n_{\mathrm{ICDF}}, n_{\mathrm{ICDF}}+1\) if \(n_{\mathrm{ICDF}}-M\) is even and those of \(m=M,M+2,\dots ,n_{\mathrm{ICDF}}-1,n_{\mathrm{ICDF}}+1\) if \(n_{\mathrm{ICDF}}-M\) is odd. This is because \(R_{g}\) is flipped by all comparators after the Mth step and alternates between 0 and 1. Considering those, we set the X gates in “Load \(c_{m,i}\)’s” as in Fig. 7, so that \(c_{m,0},\dots ,c_{m,3}\) for the appropriate m are loaded after the sequence of all activated gates.
Implementation of payoff
In this paper, we do not consider gates to calculate payoffs in detail because the resources these gates require are the same in both the PRN-type method and the AE-type method. We here make just a short comment. In many cases, a payoff is expressed as
$$\begin{aligned} f^{\mathrm{pay}}_{i}=\min \bigl\{ \max \{a_{i} S_{t_{i}}+b_{i},f_{i}\},c_{i} \bigr\} , \end{aligned}$$
(30)
where \(a_{i}\), \(b_{i}\), \(c_{i}\), \(f_{i}\) are real constants. That is, a payoff is a linear function of the asset price with the upper bound (cap) \(c_{i}\) and the lower bound (floor) \(f_{i}\). For example, the payoff of a European call option (1) corresponds to the case of \(a_{i}=1\), \(b_{i}=-K\), \(c_{i}=+\infty \), \(f_{i}=0\). The right-hand side of Eq. (30) can be calculated by a combination of comparators, adders, and multipliers.
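As a small classical illustration of Eq. (30), the capped and floored linear payoff can be written as follows. The function names are hypothetical; the European call uses the standard payoff \(\max \{S-K,0\}\), that is, slope 1, intercept −K, floor 0 and no cap.

```python
import math


def payoff(S, a, b, floor, cap=math.inf):
    """Eq. (30): linear function a*S + b, floored at `floor` and capped at `cap`."""
    return min(max(a * S + b, floor), cap)


def european_call(S, K):
    """European call payoff max(S - K, 0) as a special case of Eq. (30)."""
    return payoff(S, 1.0, -K, 0.0)
```

On a quantum circuit, the `max` and `min` would each correspond to a comparator followed by a controlled load or swap, in line with the remark above.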
4.3 The AE-type method
4.3.1 Calculation flow
The AE-type method is simpler than the PRN-type method, but it requires more registers. In the AE-type method, we use the following registers:

\(R_{W_{i}}\) is a register for the ith SNRN (\(i=1,\dots ,n_{ \mathrm{t}}\)).

\(R_{S_{i}}\) is a register for the asset price at time \(t_{i}\) (\(i=0, \dots ,n_{\mathrm{t}}\)).

\(R_{{\mathrm{payoff}},i}\) is a register for the sum of payoffs by \(t_{i}\) (\(i=1,\dots ,n_{\mathrm{t}}\)).
We again omit ancillary registers. In \(R_{W_{i}}\), an SNRN is encoded into a superposition state \(|\mathrm{SN}\rangle \), which is defined as
$$\begin{aligned} |\mathrm{SN}\rangle :=\sum_{i=0}^{N_{\mathrm{SN}}-1}{ \sqrt{p_{{ \mathrm{SN}},i}}|i\rangle }, \end{aligned}$$
(31)
where \(p_{{\mathrm{SN}},i}:=\Phi _{\mathrm{SN}}(x_{{\mathrm{SN}},i+1}) -\Phi _{\mathrm{SN}}(x_{{\mathrm{SN}},i})\). Here, \(x_{{\mathrm{SN}},0}< x_{{\mathrm{SN}},1}<\cdots <x_{{\mathrm{SN}},N_{ \mathrm{SN}}}\) are the equally spaced \(N_{\mathrm{SN}}+1\) points for discretizing the distribution. We also assume \(N_{\mathrm{SN}}=2^{n_{\mathrm{dig}}}\) with the bit size \(n_{\mathrm{dig}}\) of floating point numbers for simplicity. We discuss a gate creating such a state in the next subsection.
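The probabilities \(p_{\mathrm{SN},i}\) and the corresponding amplitudes of Eq. (31) can be tabulated classically as in the sketch below. The truncation range and the number of grid points are left to the caller, and the function names are hypothetical. Note that with a finite range the probabilities sum to \(\Phi _{\mathrm{SN}}(x_{\mathrm{SN},N_{\mathrm{SN}}})-\Phi _{\mathrm{SN}}(x_{\mathrm{SN},0})\), which is slightly below 1.

```python
import math


def std_normal_cdf(x: float) -> float:
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sn_probabilities(x_min: float, x_max: float, n_points: int):
    """p_i = Phi(x_{i+1}) - Phi(x_i) on an equally spaced grid of n_points cells."""
    step = (x_max - x_min) / n_points
    cdf = [std_normal_cdf(x_min + step * i) for i in range(n_points + 1)]
    return [cdf[i + 1] - cdf[i] for i in range(n_points)]


def sn_amplitudes(x_min: float, x_max: float, n_points: int):
    """Amplitudes sqrt(p_i) of the basis states |i> in Eq. (31)."""
    return [math.sqrt(p) for p in sn_probabilities(x_min, x_max, n_points)]
```

For example, `sn_probabilities(-4.0, 4.0, 8)` returns eight values that are symmetric about the middle of the list.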
The calculation flow of the AE-type method is as follows:

1
Initialize \(R_{S_{0}}\) to \(|S_{t_{0}}\rangle \) and the others to \(|0\rangle \).

2
Generate \(|\mathrm{SN}\rangle \) on each of \(R_{W_{1}},\dots ,R_{W_{n_{\mathrm{t}}}}\).

3
Calculate \(S_{t_{1}}\) by the time evolution (8) and output the result to \(R_{S_{1}}\).

4
Calculate the payoff at time \(t_{1}\) and add its value to \(R_{{\mathrm{payoff}},1}\).

5
Iterate operations 3–4 \(n_{\mathrm{t}}\) times. Then, we obtain a superposition of states in which the value on \(R_{{\mathrm{payoff}},n_{\mathrm{t}}}\) is the sum of payoffs for each path.
The flow of the corresponding state transformation is as follows. Writing only \(R_{W_{1}},\dots , R_{W_{n_{\mathrm{t}}}}\), \(R_{S_{0}},R_{S_{1}},\dots ,R_{S_{n_{\mathrm{t}}}}\) and \(R_{{\mathrm{payoff}},1},\dots ,R_{{\mathrm{payoff}},n_{\mathrm{t}}}\),
$$\begin{aligned} &|0\rangle ^{\otimes n_{\mathrm{t}}}|S_{t_{0}}\rangle |0\rangle ^{\otimes n_{\mathrm{t}}} |0\rangle ^{\otimes n_{\mathrm{t}}} \\ &\quad \xrightarrow{2} |\mathrm{SN}\rangle ^{\otimes n_{\mathrm{t}}} |S_{t_{0}}\rangle |0\rangle ^{\otimes n_{\mathrm{t}}}|0\rangle ^{\otimes n_{\mathrm{t}}} \\ &\quad \xrightarrow{3} \sum_{i_{1}=0}^{N_{\mathrm{SN}}-1} \sqrt{p_{{\mathrm{SN}},i_{1}}}|i_{1}\rangle |\mathrm{SN}\rangle ^{\otimes (n_{\mathrm{t}}-1)} |S_{t_{0}}\rangle \bigl|S^{(i_{1})}_{t_{1}}\bigr\rangle |0\rangle ^{\otimes (n_{\mathrm{t}}-1)}|0\rangle ^{\otimes n_{\mathrm{t}}} \\ &\quad \xrightarrow{4} \sum_{i_{1}=0}^{N_{\mathrm{SN}}-1} \sqrt{p_{{\mathrm{SN}},i_{1}}}|i_{1}\rangle |\mathrm{SN}\rangle ^{\otimes (n_{\mathrm{t}}-1)} |S_{t_{0}}\rangle \bigl|S^{(i_{1})}_{t_{1}}\bigr\rangle |0\rangle ^{\otimes (n_{\mathrm{t}}-1)} \bigl|f^{\mathrm{pay}}_{1}\bigl(S^{(i_{1})}_{t_{1}}\bigr)\bigr\rangle |0\rangle ^{\otimes (n_{\mathrm{t}}-1)} \\ &\quad \xrightarrow{5} \dots \\ &\quad \xrightarrow{5} \sum_{i_{1},\ldots ,i_{n_{\mathrm{t}}}=0}^{N_{\mathrm{SN}}-1} \sqrt{p_{{\mathrm{SN}},i_{1}}\dots p_{{\mathrm{SN}},i_{n_{\mathrm{t}}}}}|i_{1}\rangle \dots |i_{n_{\mathrm{t}}}\rangle |S_{t_{0}}\rangle \bigl|S^{(i_{1})}_{t_{1}}\bigr\rangle \dots \\ &\qquad \bigl|S^{(i_{1}\cdots i_{n_{\mathrm{t}}})}_{t_{n_{\mathrm{t}}}}\bigr\rangle \bigl|f^{\mathrm{pay}}_{1} \bigl(S^{(i_{1})}_{t_{1}}\bigr)\bigr\rangle \dots \Biggl|\sum _{j=1}^{n_{\mathrm{t}}}f^{\mathrm{pay}}_{j} \bigl(S^{(i_{1}\dots i_{j})}_{t_{j}}\bigr)\Biggr\rangle , \end{aligned}$$
(32)
where \(S^{(i_{1}\dots i_{j})}_{t_{j}}\) is the value of the asset price at time \(t_{j}\) evolved by \(w_{1}=x_{{\mathrm{SN}},i_{1}},\dots ,w_{j}=x_{{\mathrm{SN}},i_{j}}\).
The quantum circuit of the AE-type state preparation is shown in Fig. 8. First, \(|\mathrm{SN}\rangle \) is created on each \(R_{W_{j}}\) by the SN gate. After that, the gate \(U_{j}\) performs the jth step of asset price evolution and payoff calculation. For each evolution step, we additionally use ancillary registers \(R_{{\mathrm{flg}},j}\) and \(R_{{\mathrm{LV}},j}\), which have 1 and \(2n_{\mathrm{dig}}\) qubits, respectively. The implementation of \(U_{j}\) is shown in Fig. 9. In this gate, the sequence of comparators and “Load” gates sets \(a_{j,k}\), \(b_{j,k}\) in Eq. (7) into \(R_{{\mathrm{LV}},j}\) by a trick similar to that in the circuit presented in Fig. 6. Then, operation \(x\leftarrow x+(ax+b)y\) updates the asset price according to Eq. (8). Operation \(x\leftarrow x+(ax+b)y\) can be done as follows:
$$\begin{aligned} |x\rangle |y\rangle |a\rangle |b\rangle |0\rangle |0\rangle & \rightarrow |x\rangle |y\rangle |a\rangle |b\rangle |0\rangle |x\rangle \\ & \rightarrow |x\rangle |y\rangle |a\rangle |b\rangle |xy\rangle |x\rangle \\ & \rightarrow |x\rangle |y\rangle |a\rangle |b\rangle |xy\rangle |x+axy\rangle \\ & \rightarrow |x\rangle |y\rangle |a\rangle |b\rangle |xy\rangle |x+axy+by\rangle , \end{aligned}$$
(33)
where the first ket is the state of \(R_{S_{j-1}}\), the second is the state of \(R_{W_{j}}\), the third and fourth are the state of \(R_{{\mathrm{LV}},j}\), the fifth is the state of an ancillary register, and the last is the state of \(R_{S_{j}}\). So, this operation consists of copying a state and three multiplications. At the end of \(U_{j}\), the payoff is calculated by the “\({\mathrm{payoff}}_{j}\)” gate, which performs the following operation
$$\begin{aligned} \bigl|S^{(i_{1}\dots i_{j})}_{t_{j}}\bigr\rangle \Biggl|\sum _{k=1}^{j-1}f^{\mathrm{pay}}_{k} \bigl(S^{(i_{1}\dots i_{k})}_{t_{k}}\bigr)\Biggr\rangle |0\rangle \rightarrow \bigl|S^{(i_{1}\dots i_{j})}_{t_{j}}\bigr\rangle \Biggl|\sum_{k=1}^{j-1}f^{\mathrm{pay}}_{k} \bigl(S^{(i_{1}\dots i_{k})}_{t_{k}}\bigr)\Biggr\rangle \Biggl|\sum _{k=1}^{j}f^{\mathrm{pay}}_{k} \bigl(S^{(i_{1}\dots i_{k})}_{t_{k}}\bigr)\Biggr\rangle , \end{aligned}$$
(34)
where the first, second and third kets correspond to \(R_{S_{j}}\), \(R_{{\mathrm{payoff}},j-1}\) and \(R_{{\mathrm{payoff}},j}\). This operation is done by copying \(R_{{\mathrm{payoff}},j-1}\) to \(R_{{\mathrm{payoff}},j}\) and adding \(f^{\mathrm{pay}}_{j}(S^{(i_{1}\dots i_{j})}_{t_{j}})\) into \(R_{{\mathrm{payoff}},j}\).
4.3.2 Implementation of the SN gate
Let us consider the implementation of the SN gate, which creates a superposition state \(|\mathrm{SN}\rangle \). Although our implementation is mainly based on Ref. [14], we use an approximation based on the Taylor expansion.
We construct \(|\mathrm{SN}\rangle \) in an inductive way. The intermediate state at the mth step is given by
$$\begin{aligned} |{\mathrm{SN}}_{m}\rangle :=\sum _{i=0}^{2^{m}-1}{\sqrt{p^{(m)}_{{ \mathrm{SN}},i}} |i\rangle }, \end{aligned}$$
(35)
where \(p^{(m)}_{{\mathrm{SN}},i}=\int _{x^{(m)}_{{\mathrm{SN}},i}}^{x^{(m)}_{{ \mathrm{SN}},i+1}}\phi _{\mathrm{SN}}(x)\,dx\), and \(\phi _{\mathrm{SN}}(x)\) is the probability density function of the standard normal distribution. Here, \(\{x^{(m)}_{{\mathrm{SN}},i} \}_{i=0}^{2^{m}}\) is the set of equally spaced \(2^{m}+1\) points dividing the range \([x_{{\mathrm{SN}},0},x_{{\mathrm{SN}},N_{\mathrm{SN}}}]\). We assume the existence of a gate efficiently computing \(\theta ^{(m)}_{i}:=\arccos \sqrt{f^{(m)}_{i}}\) with the input i, where \(f^{(m)}_{i}\) is
$$\begin{aligned} f^{(m)}_{i}:=\frac{\int _{x^{(m)}_{{\mathrm{SN}},i}}^{ (x^{(m)}_{{\mathrm{SN}},i}+x^{(m)}_{{\mathrm{SN}},i+1} )/2}\phi _{\mathrm{SN}}(x)\,dx}{\int _{x^{(m)}_{{\mathrm{SN}},i}}^{x^{(m)}_{{\mathrm{SN}},i+1}}\phi _{\mathrm{SN}}(x)\,dx}. \end{aligned}$$
(36)
Then, the following state transformation is possible:
$$\begin{aligned} |{\mathrm{SN}}_{m}\rangle |0\rangle |0\rangle & = \sum _{i=0}^{2^{m}-1}{\sqrt{p^{(m)}_{{ \mathrm{SN}},i}} |i\rangle |0\rangle |0\rangle } \\ & \rightarrow \sum_{i=0}^{2^{m}-1}{ \sqrt{p^{(m)}_{{\mathrm{SN}},i}} |i\rangle |0\rangle \bigl|\theta ^{(m)}_{i}\bigr\rangle } \\ & \rightarrow \sum_{i=0}^{2^{m}-1}{ \sqrt{p^{(m)}_{{\mathrm{SN}},i}} |i\rangle \bigl(\cos \theta ^{(m)}_{i}|0\rangle +\sin \theta ^{(m)}_{i} |1\rangle \bigr)\bigl|\theta ^{(m)}_{i}\bigr\rangle } \\ & = \sum_{i=0}^{2^{m+1}-1}{ \sqrt{p^{(m+1)}_{{\mathrm{SN}},i}}|i\rangle \bigl|\theta ^{(m)}_{i}\bigr\rangle } \\ & = |{\mathrm{SN}}_{m+1}\rangle |0\rangle , \end{aligned}$$
(37)
where we use the gate computing \(\theta ^{(m)}_{i}\) at the first step and perform the controlled rotation at the second step. Repeating this operation until \(m=n_{\mathrm{dig}}-1\), we finally obtain the desired state \(|\mathrm{SN}\rangle \).
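The inductive construction can be emulated classically by tracking probabilities instead of amplitudes: each interval with probability p is split into a left half with probability \(pf\) and a right half with probability \(p(1-f)\), which is what the controlled rotation with \(\cos \theta =\sqrt{f}\) achieves. The sketch below is an illustration only; the names are hypothetical, it starts from a single normalized interval and uses the exact splitting fraction at every level (including the first, where the text instead uses a Hadamard gate), and it returns probabilities normalized over the truncated range.

```python
import math


def cdf(x: float) -> float:
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def split_fraction(lo: float, hi: float) -> float:
    """Fraction of the mass of [lo, hi) lying in its left half, Eq. (36)."""
    mid = 0.5 * (lo + hi)
    return (cdf(mid) - cdf(lo)) / (cdf(hi) - cdf(lo))


def build_distribution(x_min: float, x_max: float, n_qubits: int):
    """Refine one interval into 2^n_qubits intervals by repeated splitting."""
    probs = [1.0]
    for m in range(n_qubits):
        width = (x_max - x_min) / 2**m
        refined = []
        for i, p in enumerate(probs):
            lo = x_min + width * i
            f = split_fraction(lo, lo + width)
            refined += [p * f, p * (1.0 - f)]   # new index is 2*i + new bit
        probs = refined
    return probs
```

The result agrees with the directly computed interval probabilities divided by their total, which is the consistency the inductive step relies on.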
The remaining part is constructing the gate to compute \(f^{(m)}_{i}\). Here, we propose a way based on simple Taylor expansion. Let us consider function
$$\begin{aligned} g(x,\delta ):=\frac{\int _{x}^{x+\delta /2}\phi _{\mathrm{SN}}(x)\,dx}{\int _{x}^{x+\delta }\phi _{\mathrm{SN}}(x)\,dx}. \end{aligned}$$
(38)
By simple calculation, it is approximated as
$$\begin{aligned} g(x,\delta )\approx \frac{1}{2} + \frac{1}{8}\delta x + \frac{1}{16} \delta ^{2} + \mathcal{O}\bigl(\delta ^{3}\bigr). \end{aligned}$$
(39)
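The expansion (39) can be compared numerically with the exact ratio (38). The following sketch uses hypothetical function names and evaluates the exact ratio with the error function from the Python standard library.

```python
import math


def cdf(x: float) -> float:
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def g_exact(x: float, delta: float) -> float:
    """Eq. (38): mass of [x, x + delta/2] divided by mass of [x, x + delta]."""
    return (cdf(x + delta / 2) - cdf(x)) / (cdf(x + delta) - cdf(x))


def g_taylor(x: float, delta: float) -> float:
    """Eq. (39) truncated after the second-order term in delta."""
    return 0.5 + delta * x / 8 + delta**2 / 16
```

For a fixed small δ, the truncated expression is linear in x, which is the property used to replace the computation of \(f^{(m)}_{i}\) by a linear transformation of i.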
This result means that, for small δ, \(g(x,\delta )\) is well approximated by a linear function of x. We use the above approximation to compute \(f^{(m)}_{i}\), which is represented as
$$\begin{aligned} f^{(m)}_{i}=g \biggl(x^{(m)}_{{\mathrm{SN}},i}, \frac{\Delta }{2^{m}} \biggr),\quad \Delta :=x_{{\mathrm{SN}},N_{\mathrm{SN}}}-x_{{ \mathrm{SN}},0}. \end{aligned}$$
(40)
If \(\Delta /2^{m}\) is sufficiently small, \(f^{(m)}_{i}\) can be approximately written as a linear function of i, which is derived from the approximation of g and \(x^{(m)}_{{\mathrm{SN}},i}=x_{{\mathrm{SN}},0}+\frac{\Delta }{2^{m}}i\). We then reach the circuit in Fig. 10 for calculation of \(f^{(m)}_{i}\). For \(m\le 6\), the above approximation yields a large error, and thus we use another method. Here, we apply the most straightforward way, loading precomputed values. The quantum circuit of this method is shown in Fig. 10(a), and it uses a similar technique to the circuit in Fig. 6. In this method, each comparator checks whether the input value i equals its inherent value, and the check result is used for activation of the Load gate. If the input value is I, “Load \(f^{(m)}_{i}\)” gates are activated for all \(i\ge I\). Therefore, each “Load” gate is set to compensate the effect of the following load gates. For \(m\ge 7\), \(f^{(m)}_{i}\) is well approximated by a linear transformation. This transformation can be implemented as bitwise flips followed by a constant multiplier. We note that, depending on the required accuracy, we should adjust the threshold value of m switching calculation method of \(f^{(m)}_{i}\) and also increase the degree of the Taylor expansion.
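The compensation between consecutive “Load” gates can be illustrated with bitwise XOR on classical integers, since a “Load” gate built from X gates XORs a fixed bit pattern into the target register. The sketch below assumes, as stated above, that the gates for all \(i\ge I\) are activated when the input is I; the function names and the fixed-point integer encoding of the loaded values are hypothetical.

```python
def make_loads(values):
    """Bit patterns for the Load gates so that the XOR of all patterns
    with index >= I equals values[I]."""
    loads = []
    for i, v in enumerate(values):
        following = values[i + 1] if i + 1 < len(values) else 0
        loads.append(v ^ following)   # cancels the pattern loaded afterwards
    return loads


def run_loads(loads, I):
    """Apply every Load gate with index >= I to a register initialized to 0."""
    register = 0
    for i, pattern in enumerate(loads):
        if i >= I:
            register ^= pattern
    return register
```

The XOR of the activated patterns telescopes, leaving only the value associated with the input I on the register.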
Then, the SN gate is constructed as shown in Fig. 11. First, we apply a Hadamard gate to the most significant bit in \(R_{W_{j}}\) to assign probability 1/2 to the positive and negative halves of \([x_{{\mathrm{SN}},0},x_{{\mathrm{SN}},N_{\mathrm{SN}}}]\). We next apply a sequence of gates \(U^{\mathrm{SN}}_{1},\dots , U^{\mathrm{SN}}_{n_{\mathrm{dig}}-1}\). \(U^{\mathrm{SN}}_{m}\) corresponds to the mth step of the above recursive calculation and is constructed as a combination of the \(f^{(m)}_{i}\) gate, gates for square root and arc cosine, and the controlled rotation gate \(R(\theta )\).
Finally, we comment on the implementation of arccos and square root. Reference [51] discusses the implementation of the inverse trigonometric function by the piecewise polynomial approximation. Although they consider not arccos but arcsin, we can easily apply their result by \(\arccos (x)=\frac{\pi }{2}-\arcsin (x)\). We adopt a setting with polynomial degree 3 and 2 intervals, which leads to accuracy \(10^{-5}\) [51]. The circuit to calculate the square root is given in Ref. [52].